JuliaSparse / SparseArrays.jl

SparseArrays.jl is a Julia stdlib
https://sparsearrays.juliasparse.org/
Other
89 stars 51 forks source link

Sparse matrix format interfaces #538

Open termi-official opened 3 months ago

termi-official commented 3 months ago

Following a discussion on slack about sparse matrix format interfaces (https://julialang.slack.com/archives/C6A044SQH/p1715862033197389) I would like to bring up the discussion again here on GitHub, also for future reference on the topic.

Problem

In SparseArrays.jl we currently have the types AbstractSparseMatrix and AbstractSparseMatrixCSC. However, we do not have abstract types for other matrix formats, e.g. probably the most common would be AbstratSparseMatrixCSR. Just to make sure I am not misunderstood, I am not arguing that SparseArrays.jl should have a concrete version of SparseMatrixCSR and im am totally fine that there is none in SparseArrays.jl. I am really just concerned about interfaces. IMO having some abstract types and a set of function as an interface for the most common matrix formats can be beneficial. My main concern here is interfacing with non-julia libraries providing CSR matrices (e.g. Ginkgo, CUSPARSE, rocSPARSE, ...).

The for having an interface is that framework developers can write their algorithms generically per sparse matrix format directly.

Solution Proposal

I think a starting point could be to have a generic interface to CSR matrices in SparseArrays.jl directly.

I am thinking about the follwing, symmetric to AbstractSparseMatrixCSC

Alternatively I had the idea that we might be able to move the abstract portion of SparseArrays.jl into a separate package SparseArrayInterfaces.jl, so downstream package developers can decide whether they need the full SparseArrays.jl infrastructure or just the interface.

Alternative Solutions

The obvious alternative is that framework developers could just define internal trait systems and control their dispatches with the traits. However, I am not experienced enough to see what the potential impact of this on the package ecosystem could be (positive or negative).

In the slack discussion @rayegun brought up another very interesting alternative to having different interface to different matrix formats. IIUC the idea is to provide a generic iterator interface on sparse matrices which should be used in the algorithms (xref https://github.com/JuliaSparse/SparseArrays.jl/pull/167 ) or generating the code for sparse matrices in one way or another (e.g. via Finch.jl). I really like these ideas from a software development point of view, because it allows properly decoupling the algorithms on sparse matrices from the specific data structure. For the iterator interface however, I am not sure about its limitations.

Feel free to convert the issue to a discussion if it is more appropriate. I just opened it as an issue, because from my experience they have better visibility than discussion threads.

X-Refs

Some more discussion on SparseMatrixCSR vs SparseMatrixCSC.

https://discourse.julialang.org/t/csc-kills-the-prospect-of-multithreading-shouldnt-julia-use-csr/102491/ https://discourse.julialang.org/t/what-to-do-since-csr-matrixes-are-not-in-sparse-arrays/53991 https://github.com/JuliaSparse/SparseArrays.jl/issues/41 https://github.com/JuliaSparse/SparseArrays.jl/issues/520

j-fu commented 3 months ago

Just FYI: there is also https://github.com/gridap/SparseMatricesCSR.jl

termi-official commented 3 months ago

Thanks Jürgen. Let me then also mention for completeness of the thread the packages https://github.com/BacAmorim/ThreadedSparseCSR.jl/ https://github.com/youwuyou/Ginkgo.jl https://github.com/fredrikekre/HYPRE.jl providing CSR formats

ViralBShah commented 3 months ago

The problem right now is that SparseArrays.jl is a Julia stdlib, and I believe it can't be removed. Something to do with type piracy of * and . @KristofferC tried to excise it and realized we couldn't.

I would personally prefer to get it excised and also LinearAlgebra, and I also like the idea of the Interfaces package. In the meanwhile we can certainly add the types and machinery you suggest so that they can be extended by downstream packages.

Would be good to have @SobhanMP and @rayegun chime in as well, before any PRs are put together.

termi-official commented 2 months ago

Thanks for the additional context @ViralBShah . I went ahead and drafted what I have in mind and could need some feedback. The alternative would be to move the CSR code into a different package for now (SparseMatricesCSR.jl, if it is still maintained?) and strip the open PR down to the changes which are necessary to enable the AbstractSparseMatrixCSR interface.

Edit: If there is a third option I am still happy to hear. (And no worries, I have not forgot about the sparse matrix iterator interface tho Raye)

SobhanMP commented 1 month ago

forgot about the sparse matrix iterator interface tho Raye

Is this something different than #167?

termi-official commented 1 month ago

From what I understood Raye's idea was to have a generic interface which works across different formats. The PR looks like it is tailored to provide iterators primarily with CSC in mind

@inline getcolptr(x::SparseIndexIterate) = getcolptr(x.m)
@inline getrowval(x::SparseIndexIterate) = getrowval(x.m)
@inline getnzval(x::SparseIndexIterate) = getnzval(x.m)

right?

SobhanMP commented 1 month ago

The interface is generic; I only added CSC and dense matrix support because that's what's implemented in this package. Nothing is really stopping you from adding CSR support, and it should be composable with other types. https://sobhanmp.github.io/SparseExtra.jl/iternz/

j-fu commented 3 weeks ago

In the moment, AbstractSparseMatrixCSC is not exported though it is used quite a bit, see #395 . This gives the possibility to develop a comprehensive approach, which might include #167 for an API for AbstractSparseMatrix.

As different sparse solver libraries use either hard coded CSC or CSR, IMHO the interfaces should provide the corresponding vectors (even if the need to be produced by a particular imlementation).

LinearSolve.jl uses size, nonzeros, rowvals, getcolptr. Accordingly I would add nonzeros, getrowptr to an AbstractSparseMatrixCSR interface.

termi-official commented 2 weeks ago

Thanks for the feedback Jürgen. Please note that I have put together a draft including these points here https://github.com/JuliaSparse/SparseArrays.jl/pull/546