sparsemat / sprs

sparse linear algebra library for rust
Apache License 2.0
381 stars 45 forks source link

Relax copy bound #276

Closed vbarrielle closed 3 years ago

vbarrielle commented 3 years ago

This pull request removes the Copy bound from most functions and methods, fixing #121.

While removing the bound, I tried to ensure there were not too many unnecessary calls to Clone, with the idea that this should enable storing more complex data as our scalar type (such as big ints, or heap-allocated matrices to have sparse by blocks matrices) while remaining performant.

The main way the unnecessary clones have been removed is through the introduction of the MulAcc trait, which represents the fundamental operation for matrix products. Since it takes its arguments by reference, it should be possible to implement this trait without unnecessary allocations for heab-based scalar types. For convenience, this trait is automatically implemented for all Copy types that also implement num_traits::MulAdd. This ensures all common scalar types work, and should also help the compiler using the fused-multiply-add instruction when available.

Another pattern that was useful to avoid unnecessary clones was adding bounds such as for <'r> &'r N: std::ops::Add<&'r N, Output=N>. These bounds exist for all classic scalar types. However, it can be hard to use them in traits, as evidenced by https://github.com/rust-lang/rust/issues/82779. Because of this issue, implementations of std::ops::{Add,Sub} for CsVecBase were forced to use one Clone.

I have not relaxed the Copy bound in sprs-ldl yet, I think I would need to add a trait such as MulSubAcc similar to MulAcc that could be used in solvers. Otherwise I would need to sprinkle the code with calls to clone() everywhere.

@mulimoen as usual I'd be very interested in your feedback.

vbarrielle commented 3 years ago

Thanks for the feedback @mulimoen ! I've added a Kronecker trait, and included your sparse matrices per block as a test. With a single change from your implementation: I inlined the product implementation in the MulAcc impl, to demonstrate that it can be used to remove temporaries.

vbarrielle commented 3 years ago

Thanks for the review @mulimoen!