sparsemat / sprs

sparse linear algebra library for rust
Apache License 2.0
422 stars 47 forks source link

allow fully parameterized seperate types for matrix elements,vector elements,their products and accumulators #283

Open experiment9123 opened 3 years ago

experiment9123 commented 3 years ago

Apologies if this is already supported (or considered and rejected) for AI mixed precision matrices are increasingly common - for example 8bit weights, relying on 32bit accumulation

one way to support this might be this kind of signature:

fn matrix_mul(  Matrix<A>,  Vector<B>)-> Vector<Accumulator>
  where   A:Mul<B, Output=Product>
              Product:Add<Output=Accumulator>
             Accumulator:AddAssign<Product> + Default

it may however be even more intuitive to rely on the 'accumulator' being reverse infered , and rely on it supporing "+=" for products. this would support all eventualities, e.g. 16bit vector values, 8bit matrix weights -> 24 bit products(in i32), but so many components that 64bit accumulators are needed.

I wasn't sure if the library already supports this but I see the trait MulAcc {}which I presume is used for the evaluation, which seems to imply the same type must be used for all steps of the process.

Perhaps the type relationships here could be retrofitted to an extension of this trait?

This would also allow for dimension checked types, e.g. physical units for the A,B, Outputs being seperate. However merely considering that will miss the potential for product / accumulator precision

Furthermore putting non numeric types into these would allow using the sparse matrix ops as a way of performing graph operations (for example the "Accumulator" could be a collection that takes information from edge links described by a sparse matrix)

in this toy example i'm building i'm doing this sort of thing - here 'conways game of life' is implemented on a 'graph type' built from a vector & SparseMatrix holding the edges. (the edge matrix 'value' is empty (), the 'product' & accumulators are u32, the vector type is a generic 'Cell', the output of the matrix multiply is a "number of neighbours" and another function is used to update the cells using that) https://github.com/experiment9123/graphengine/blob/master/rustgraph/examples/gol.rs https://github.com/experiment9123/graphengine/blob/master/rustgraph/src/lib.rs

the eventual intention here is to use this framework for spiking neural net experiments

mulimoen commented 3 years ago

The MulAcc trait was added to avoid a Copy, and not for accumulating into another type, but this would be a very interesting extension. As far as I can tell, there is no reason why this has not been tried apart from a lack of time and interest.

If you wish to implement this, the easiest way is to copy mul_acc_mat_vec_csr in src/spase/prod.rs, and tweak the bound on N to instead rely on a combination of the num_traits traits. I see uom is using num_traits, so the physical units should be covered by such an approach.

experiment9123 commented 3 years ago

https://docs.rs/sprs/0.10.0/sprs/trait.MulAcc.html

now examining more closely I see that you really do rely on Fused multiply add. it might take a bit more thought for me how this will work. I'm aware that fused multiply add has different precision and hardware support in many machines, so is important for people to be able to express explicitely

I think it might be possible even in my experiment and still allowing everything as generic as I have in mind.. but I should confirm this before trying anything here .. i'll end up with something like this MatElem:MulAdd<VecElem,Acc,Output=Acc>

I'll mention that I've also been using references in the computation steps (wheras in most graphics-style vec/matrix libs i've used copy types) - again this opens the door for slotting in non trivial types (bignums ?..)

mulimoen commented 3 years ago

T: MulAdd<VecElem,Acc,Output=Acc>

This trait bound looks right, although care must be taken so it does not involve a Copy. At the current state, it should be possible to use bignums, we have an example in tests/block_matrix.rs to illustrate this using a weird inner type (a matrix).

experiment9123 commented 3 years ago

thinking out loud , what if the library was split into some layers ..is it possible to achieve code re-use with easy experimentation (without making changes to a larger library with little overlap in my rather tangential requirements).. without creating dependancy hell with many smaller libraries. It may be the case in the end I should just create the thing on the right as a completely seperate experiment, and just refer to your crate alot to make it fit the same patterns so it's familiar to users.. but it would be great if eventually this could all go into one all powerful rust sparse matrix lib

I've already benefited from reviewing your code here.. (discoverd the established CSR/CCR terms for things I was re-inventing, and this MulAdd trait does actually streamline some of the generic impls)

Besides all that (grand plan) .. I may be curious to submit a PR fully genericising MulAdd

IMG_3831

Back in C++ i'm just experimenting with Eigen ( a very comprehensive library) .. it can get as far as 3 types (a Matrix element, a Vector element , and a seperate Output type computed from MatrixElemVecElem->ResultElem) .. but can't quite seem to do the full 4 types* I have in mind. Retrofitting changes to Eigen would be downright impossible as it's huge and more internally complex, and has way more users riding on it

experiment9123 commented 3 years ago

If you wish to implement this, the easiest way is to copy mul_acc_mat_vec_csr in src/spase/prod.rs, and tweak the bound

1st small experiment in this direction:

pub trait MulAcc<Src1=Self,Src2=Self> {
    /// Multiply and accumulate in this variable, formally `*self += a * b`.
    fn mul_acc(&mut self, a: &Src1, b: &Src2);
}

https://github.com/experiment9123/sprs/blob/master/src/mul_acc.rs

so,this defaults to Src1,Src2=Self, and the "default impl" just does impl<N> MulAcc<N,N> for N , allowing f32/f64 to work unchanged. I can still run the examples with this change.

I think we could gradually work through the functions and make each function using a MulAcc work generically although my 1st attempt (pub fn csvec_dot_by_binary_search<N, I,A,B>(vec1<A>,vec2<B>) -> N ) shows me the internal changes needed are non-trivial (this impl relies on swapping vars from refs in either side, I have to think how to do that cleanly without a nasty cut-paste)

I've changed this function to handle it, see the example .. this is a taste of the kind of refactoring that might crop up to retrofit this change. the good new is (i)it seems to be possible incrementally(ii)The MulAcc trait itself really helps this refactor. the bad news is "non trivial changes may be needed", hence needing review for mathematical correctness

It will have to be a rolling refactor so you will need to approve every step of the process rather than waiting for one big PR that enables this support


pub fn csvec_dot_by_binary_search<N, I,A,B>(
    vec1: CsVecViewI<A, I>,
    vec2: CsVecViewI<B, I>,
) -> N
where
    I: SpIndex,
    N: crate::MulAcc<A,B>+crate::MulAcc<B,A> + num_traits::Zero,
{
    //Check vec1.nnz<vec2.nnz ; call self with args flipped if not
    // relies on dot-product (and elem ops) being commutative
    // can't work with the original method with seperate A,B types
    // because the val1 / val2 would be swapped from &[A]<->&[B]
    if vec1.nnz() < vec2.nnz(){
        return csvec_dot_by_binary_search(vec2,vec1);
    } 
    // vec1.nnz is larger
    let (mut idx1, mut val1, mut idx2, mut val2) =
        (vec2.indices(), vec2.data(), vec1.indices(), vec1.data());

further investigation -

"mat(A) * mat(B) -> vec(C)"

mulimoen commented 3 years ago

Really great to see this can be done incrementally, that makes reviews a lot easier!

further investigation - "mat(A) * mat(B) -> vec(C)" I'm afraid this can not be done without knowing the two matrices are compatible on the type level.