JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.4k stars 5.46k forks source link

Allow 0-based or 1-based integer fields in SparseMatrixCSC #2624

Closed dmbates closed 11 years ago

dmbates commented 11 years ago

The SparseMatrixCSC type has two integer fields, colptr and rowval that are used to determine positions in the matrix. In Julia these are 1-based. Usually in C libraries these are 0-based. It is easy to distinguish 0-based and 1-based vectors in these positions as the base is colptr[1].

There are functions decrement, decrement!, increment and increment! defined in base/linalg/umfpack.jl to switch back and forth between 1-based and 0-based.

I am not proposing changing any user-facing behavior (i.e. A[i,j], etc would always use 1-based indices), just allowing for more flexibility in the internal representation. Most of the conversion code in base/linalg/umfpack.jl and base/linalg/chomod.jl already checks a SparseMatrixCSC object to see if it is 1-based or 0-based. It would be non-trivial to change all the code in base/sparse.jl to allow for 0-based integer vector fields but it could be done.

mlubin commented 11 years ago

For me, colptr is user-facing. What use case do you see for this? There's nothing wrong with using zero-based indices internally (i.e. inside a module), but I think that any matrices given to a user should follow a single convention.

lindahua commented 11 years ago

I understand the performance issue here. A lot of C libraries that work with sparse matrices use zero-based indices, and the conversion across C and Julia would incur O(nnz) overhead.

But I guess quite a lot of things have already been depending on the one-based behaviour. This change may involve revising the implementation of many functions dealing with sparse matrices (e.g. the implementation of sparse matrix addition would have to take into account adding a zero-based and a one-based matrix). For packages, MATLAB.jl and Gurobi.jl have already been depending on this (as it is documented on the Julia manual).

I don't mind making changes to my codes. Just that we should weigh the pros and cons very carefully before making a decision here.

mlubin commented 11 years ago

There are also a fair number of state-of-the-art sparse linear algebra libraries (PARDISO, HSL library from NAG) that are written in Fortran and use 1-based indices.

It would be ok with me if umfpack and cholmod accept both 0-based and 1-based indices in a well-documented manner; this way advanced users can avoid the O(nnz) cost if it's really an issue (though I doubt this is often the case, since the factorization itself is more than O(nnz) and the indices are only scanned once if you do multiple numerical factorizations.) I don't really support requiring all code that interacts with colptr to handle both 0-based and 1-based indices.

dmbates commented 11 years ago

I agree with the arguments and will close this issue.

ViralBShah commented 11 years ago

This party started and got over before I could join. I have mulled over this issue for a while when I first wrote SparseMatrixCSC, but figured that one of the cool things about julia is that people can dig into underlying data structures and write algorithms in julia itself, which is why I stuck to 1-based indexing. The cost of increment and decrement is negligible when solving a sparse system, and hence 1-based indexing felt right.