JuliaSparse / SparseArrays.jl

SparseArrays.jl is a Julia stdlib
https://sparsearrays.juliasparse.org/

Support zero-based indices #522

Closed orenbenkiki closed 5 months ago

orenbenkiki commented 6 months ago

I'm trying to share data between Python and Julia. I can do zero-copy sharing of dense arrays without a problem. However, Julia uses 1-based indices and Python uses 0-based indices. This means that Python's 0-based indptr and indices arrays are incompatible with Julia's 1-based colptr and rowval arrays, forcing a copy every time data passes between the languages.
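
To make the mismatch concrete, here is a minimal pure-Python sketch of one small matrix in CSC form. The matrix values and the helper name `to_one_based` are made up for illustration; only the array names indptr, indices, colptr, and rowval come from the discussion above.

```python
# CSC layout of the 3x3 matrix
# [[10,  0,  0],
#  [ 0, 20, 30],
#  [40,  0, 50]]
data = [10, 40, 20, 30, 50]   # nonzero values, column by column
indices = [0, 2, 1, 1, 2]     # 0-based row index of each value (Python side)
indptr = [0, 2, 3, 5]         # 0-based offset where each column starts


def to_one_based(zero_based):
    """Hypothetical helper: build a NEW list with every entry shifted by one."""
    return [i + 1 for i in zero_based]


# The Julia side expects the same structure with every entry shifted by one.
rowval = to_one_based(indices)
colptr = to_one_based(indptr)

print(rowval)  # [1, 3, 2, 2, 3]
print(colptr)  # [1, 3, 4, 6]
```

The values array can be shared as-is, but both index arrays have to be rewritten element by element, which is where the copy comes from.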

SparseBlas data has a field which specifies whether the indices arrays are 0-based or 1-based. Can SparseArrays be extended to support this as well? This has basically no impact on efficiency (an additional delta added to offset computations, without the need for any branches).
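
As a rough illustration of the "additional delta" idea, the sketch below stores an index base next to the arrays and subtracts it in every offset computation. The class `CSCMatrix`, its `base` field, and the `column` method are hypothetical names for this example and are not part of SparseArrays.jl, scipy, or any Sparse BLAS binding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CSCMatrix:
    """Hypothetical CSC container that records the base of its index arrays."""
    data: List[float]
    indices: List[int]   # row index of each stored value, in `base` convention
    indptr: List[int]    # column start offsets, in `base` convention
    base: int            # 0 for Python-style arrays, 1 for Julia-style arrays

    def column(self, j: int) -> List[Tuple[int, float]]:
        """Return (row, value) pairs of column j, with j and rows 0-based."""
        # The base is subtracted unconditionally, so no branch is needed.
        start = self.indptr[j] - self.base
        stop = self.indptr[j + 1] - self.base
        return [(self.indices[k] - self.base, self.data[k])
                for k in range(start, stop)]


values = [10.0, 40.0, 20.0, 30.0, 50.0]
zero_based = CSCMatrix(values, [0, 2, 1, 1, 2], [0, 2, 3, 5], base=0)
one_based = CSCMatrix(values, [1, 3, 2, 2, 3], [1, 3, 4, 6], base=1)

# Both objects describe the same matrix, so their columns agree.
print(zero_based.column(2) == one_based.column(2))  # True
```

With a field like this, whichever side receives the arrays could use them in place, regardless of which language produced them.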

I'm also opening an issue in Python's scipy library asking for the same feature (https://github.com/scipy/scipy/issues/20306). However, until at least one of the libraries supports this feature, I'm forced to copy the data (and some of my sparse matrices are very large - 100s of MBs...).

rayegun commented 6 months ago

So the current design of SparseArrays.jl makes this difficult. The easiest solution is to use CIndices.jl for your index vectors. This means that when you unsafe_wrap, you should do so with element type CIndex{Int64}. Reinterpreting is expensive, so try to avoid that.

The implementation of CIndices is incomplete; it probably needs extensions added for the solvers in SparseArrays.jl, and potentially for other libraries. It might also be necessary to add this to the Python interop libraries as an optional conversion.

Performance without reinterpreting should be pretty good; with reinterpret it will be pretty bad.

orenbenkiki commented 6 months ago

Thanks for the pointer to CIndices.jl. Can you say in what sense it is incomplete? That is, which operations do you expect might fail when using it?

rayegun commented 6 months ago

I don't have any specifics now. I just assume there are some missing operations because it was built piecemeal for this purpose. It is also possible there are functions (like solvers) that convert to Int64 and Int32 that need to be overloaded somehow. KLU.jl should support it now (although I'm not sure if I've tagged the release), I will continue working on the others.