Open Ivorforce opened 2 months ago
I implemented a reduce_dot that supports axes,
simply using a caching multiply followed by a sum.
It's not the best implementation, but it's short and adds almost nothing to the binary size (plus it's consistently a bit faster than making the two calls separately through the nd API).
With xtensor-blas, we can still accelerate when available, but this is a good start.
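For illustration, the multiply-then-sum idea can be sketched in NumPy rather than in NumDot's own code. This is only a sketch under assumptions: the function name `reduce_dot` and the `axes` parameter mirror the comment above, "caching multiply" is taken to mean that the elementwise product is stored in a temporary before it is reduced, and none of this is the actual NumDot implementation.

```python
import numpy as np

def reduce_dot(a, b, axes=None):
    """Sum of the elementwise products of a and b over the given axes.

    Illustrative sketch only; the real NumDot signature may differ.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    # Step 1: elementwise multiply (with broadcasting) into a temporary.
    product = np.multiply(a, b)
    # Step 2: reduce the temporary over the requested axes.
    return np.sum(product, axis=axes)

a = np.arange(6.0).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
b = np.ones((2, 3))

row_dots = reduce_dot(a, b, axes=-1)  # one dot product per row
total = reduce_dot(a, b)              # reduce over all axes
```

The temporary costs memory proportional to the broadcast shape, which is the trade-off a BLAS-backed path could avoid for the cases it supports.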
Xtensor has support for linear algebra functions through xtensor-blas.
Unfortunately, this requires BLAS and LAPACK binaries. Accordingly, it should probably only be part of a "wide scope" download.
Another problem is that (as they say) broadcasting is not fully supported for most of these functions yet. But I suppose that's OK: it will crash, and people can use the broadcast function (#14). Better yet, we could check the shapes and broadcast ourselves in NumDot if needed.
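The "check and broadcast ourselves" idea could look roughly like the following NumPy sketch. Both function names are hypothetical: `strict_dot_last_axis` is a stand-in for a linear algebra routine that rejects mismatched shapes (it is not an xtensor-blas function), and `broadcasting_dot` is the wrapper that would live on the NumDot side.

```python
import numpy as np

def strict_dot_last_axis(a, b):
    """Hypothetical stand-in for a routine without broadcasting support."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Dot product along the last axis for equally shaped inputs.
    return np.einsum("...i,...i->...", a, b)

def broadcasting_dot(a, b):
    """Check shapes first and broadcast only when they differ."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        # Raises ValueError if the shapes cannot be broadcast together.
        a, b = np.broadcast_arrays(a, b)
    return strict_dot_last_axis(a, b)

matrix = np.arange(6.0).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
vector = np.array([1.0, 2.0, 3.0])

result = broadcasting_dot(matrix, vector)  # vector is reused for each row
```

Checking the shapes before broadcasting keeps the common equal-shape case free of any extra work, and incompatible shapes fail with a clear error instead of a crash inside the backend.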