Closed vbarrielle closed 3 years ago
Thanks for the feedback @mulimoen ! I've added a Kronecker
trait, and included your sparse matrices per block as a test. With a single change from your implementation: I inlined the product implementation in the MulAcc
impl, to demonstrate that it can be used to remove temporaries.
Thanks for the review @mulimoen!
This pull request removes the
Copy
bound from most functions and methods, fixing #121.While removing the bound, I tried to ensure there were not too many unnecessary calls to
Clone
, with the idea that this should enable storing more complex data as our scalar type (such as big ints, or heap-allocated matrices to have sparse by blocks matrices) while remaining performant.The main way the unnecessary clones have been removed is through the introduction of the
MulAcc
trait, which represents the fundamental operation for matrix products. Since it takes its arguments by reference, it should be possible to implement this trait without unnecessary allocations for heab-based scalar types. For convenience, this trait is automatically implemented for allCopy
types that also implementnum_traits::MulAdd
. This ensures all common scalar types work, and should also help the compiler using the fused-multiply-add instruction when available.Another pattern that was useful to avoid unnecessary clones was adding bounds such as
for <'r> &'r N: std::ops::Add<&'r N, Output=N>
. These bounds exist for all classic scalar types. However, it can be hard to use them in traits, as evidenced by https://github.com/rust-lang/rust/issues/82779. Because of this issue, implementations ofstd::ops::{Add,Sub}
forCsVecBase
were forced to use oneClone
.I have not relaxed the
Copy
bound insprs-ldl
yet, I think I would need to add a trait such asMulSubAcc
similar toMulAcc
that could be used in solvers. Otherwise I would need to sprinkle the code with calls toclone()
everywhere.@mulimoen as usual I'd be very interested in your feedback.