rust-ndarray / ndarray

ndarray: an N-dimensional array with array views, multidimensional slicing, and efficient operations
https://docs.rs/ndarray/
Apache License 2.0
3.43k stars, 295 forks

Add support for BLAS Syrk #1358

Open tzachar opened 5 months ago

tzachar commented 5 months ago

Hi.

Thank you guys for this awesome library.

In my application, I need to compute a Gram matrix (basically, x.t().dot(&x)). Using gemm for this is wasteful: the result is symmetric, so one triangle is redundant. syrk computes only one triangle and does roughly half the arithmetic, so in this case it can be up to about twice as fast as gemm.
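To make the saving concrete, here is a minimal sketch in plain Rust (no ndarray, no BLAS) of what a syrk-style Gram computation does compared with the gemm-style one. The function names `gram_full` and `gram_upper` are made up for illustration, and the matrix is assumed to be a dense row-major `&[f64]` slice; a real BLAS syrk is blocked and vectorised and normally leaves the other triangle untouched rather than mirroring it.

```rust
/// gemm-style reference: computes every entry of C = Aᵀ·A.
/// `a` is assumed to be a dense row-major `rows x cols` matrix.
fn gram_full(a: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert_eq!(a.len(), rows * cols);
    let mut c = vec![0.0; cols * cols];
    for i in 0..cols {
        for j in 0..cols {
            let mut sum = 0.0;
            for k in 0..rows {
                sum += a[k * cols + i] * a[k * cols + j];
            }
            c[i * cols + j] = sum;
        }
    }
    c
}

/// syrk-style reference: computes only the entries with j >= i
/// (the upper triangle) and mirrors them, so the inner dot product
/// runs cols*(cols+1)/2 times instead of cols*cols times.
fn gram_upper(a: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert_eq!(a.len(), rows * cols);
    let mut c = vec![0.0; cols * cols];
    for i in 0..cols {
        for j in i..cols {
            let mut sum = 0.0;
            for k in 0..rows {
                sum += a[k * cols + i] * a[k * cols + j];
            }
            c[i * cols + j] = sum;
            c[j * cols + i] = sum; // mirror into the lower triangle
        }
    }
    c
}
```

For a 3x2 input with rows [1, 2], [3, 4], [5, 6], both functions return the same symmetric 2x2 result; the second one just gets there with about half the multiply-adds once `cols` is large.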

On a general note, it would be really useful if the library made it easy to call arbitrary BLAS / LAPACK functions, with idiomatic ways to obtain what those functions need (memory layout, strides, etc.)
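As a rough illustration of the kind of helper meant here, the sketch below maps a 2-D shape and its element strides to the layout flag and leading dimension that a CBLAS-style routine expects. Everything in it is hypothetical: `Layout` and `blas_layout` are not part of ndarray. It assumes the inputs are what a 2-D ndarray array reports through `dim()` and `strides()` (strides counted in elements, as `isize`), and it conservatively rejects negative strides and some degenerate single-row or single-column cases that a real implementation could accept.

```rust
/// Hypothetical layout tag, mirroring the CBLAS row-major / column-major flag.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Layout {
    RowMajor,
    ColMajor,
}

/// Returns the layout and leading dimension if a 2-D view with the given
/// shape and element strides can be handed to BLAS without copying.
/// Returns `None` when neither axis is contiguous (or a stride is negative),
/// in which case the caller would have to copy into a standard layout first.
fn blas_layout(shape: (usize, usize), strides: (isize, isize)) -> Option<(Layout, usize)> {
    let (rows, cols) = shape;
    let (row_stride, col_stride) = strides;
    if col_stride == 1 && row_stride >= cols.max(1) as isize {
        // Elements within a row are adjacent; the row stride is the leading dimension.
        Some((Layout::RowMajor, row_stride as usize))
    } else if row_stride == 1 && col_stride >= rows.max(1) as isize {
        // Elements within a column are adjacent; the column stride is the leading dimension.
        Some((Layout::ColMajor, col_stride as usize))
    } else {
        None
    }
}
```

With this, a standard 3x2 array (strides `(2, 1)`) maps to `RowMajor` with leading dimension 2, and its transpose (shape `(2, 3)`, strides `(1, 2)`) maps to `ColMajor` with leading dimension 2, so a transposed view could be passed along without a copy.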

nilgoyette commented 5 months ago

This has already been discussed in this issue.

You probably already know this, but general_mat_mul is "hidden" in the linalg module. However, it only offers gemm, which is not what you're asking for.

tzachar commented 5 months ago

I am aware of general_mat_mul. Did anyone ever end up writing an additional crate for BLAS over ndarray?

nilgoyette commented 5 months ago

Afaik, nobody did. And I don't think anyone will do it in the near future. ndarray hasn't evolved much in the last 4-5 years because of a lack of maintainers.

Pencilcaseman commented 3 months ago

I'd be interested in improving the BLAS functionality. Having spent a lot of time messing around with multidimensional array libraries myself (sole developer of LibRapid), I've got a decent understanding of how best to implement things.

As an experiment, I wrote a BLAS wrapper for Rust, which wraps a system BLAS library in a C API, which is further wrapped in a Rust API. It supports any BLAS library that has a CBLAS interface.

Implementing something like this (or using this directly??) could be interesting. Is that something worth looking into?

On an unrelated note, I'd also be interested in getting OpenCL and CUDA support added to this library -- an obvious first step would be BLAS with CLBlast and cuBLAS, but JIT-compiled kernels could be interesting, too.

nilgoyette commented 3 months ago

This is something that @bluss might be interested in.