sparsemat / sprs

sparse linear algebra library for rust
Apache License 2.0
386 stars 45 forks source link

Safety implications of unsorted indices #232

Open mulimoen opened 3 years ago

mulimoen commented 3 years ago

This is a discussion issue for how to handle unsorted indices in arrays and matrices. The situation as it is today is relaxed, we try to avoid it, having unsorted indices won't cause safety issues, but might produce unexpected results/panics in some algorithms.

The primary motivation for bringing this up is to allow/disallow future optimisations to take advantage of the sorted property, and how the library should behave when these invariants are broken.

Should we formalise this behaviour and document it or should we disallow creation of unsorted structures?

Ref #231, #35, #16, #136

vbarrielle commented 3 years ago

As you pointed out, there are two separate concerns here:

Here are the possible ways forward in my opinion, including the options you mentioned:

Do you see other alternatives?

mulimoen commented 3 years ago

The usage of unsafe in this crate is minimal, which is great for memory safety. We should not have problems with this, as long as all indices are in bounds when passing to external bindings.

The sorted indices are very nice for so many algorithms, dot-product, kronecker product, arrays from a matrix slice*, more efficient use of cache.

I think option 2 would be preferable, as most algorithms that works on sorted indices will probably also produce sorted indices. We can avoid the runtime checks by allowing an unsafe constructor. This would also be the least amount of breakage and API change as you mention.

In #35 you mention the need for unsorted indices. Has this changed?

[*]: If we introduce a new slicetype with an offset field

vbarrielle commented 3 years ago

The sorted indices are very nice for so many algorithms, dot-product, kronecker product, arrays from a matrix slice*, more efficient use of cache. I think option 2 would be preferable, as most algorithms that works on sorted indices will probably also produce sorted indices. We can avoid the runtime checks by allowing an unsafe constructor. This would also be the least amount of breakage and API change as you mention.

Yes I think I agree, the sorted indices is a very convenient property in many algorithms. The unsafe constructor is the way to go.

In #35 you mention the need for unsorted indices. Has this changed?

There are some decompositions that can only produce unsorted indices as far as I know. But the case is quite rare, and they don't need that many operations, so when these decompositions get implemented we can define a limited sparse matrix type with unsorted indices to help in these decompositions.

maboesanman commented 3 years ago

It seems to me like some algorithms requiring sorted indices and some not could be handled with a private is_sorted Field, which is assumed to be correctly set. When creating a matrix from raw, there could be three options:

Additionally, a method to sort if needed would be useful. Possibly sorted_view And sort Methods?

vbarrielle commented 3 years ago

Just for reference the C++ lib eigen has similar questions: https://gitlab.com/libeigen/eigen/-/issues/694#note_254554694