bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0

SNB Performance #8

Closed millardjn closed 7 years ago

millardjn commented 8 years ago

Hi Bluss,

I've fiddled with the library a bit and managed to get a ~25% performance boost on Sandy Bridge. I had to rearrange things a bit to get it to work (LLVM is truly capricious), so I thought I'd see if it works for other setups before sending a PR.

I was also thinking that the ~b packing could be combined with im2col into a single step, for reasonably fast, low-memory convolutions. I might give it a go sometime soon, along with rayon multithreading. Any thoughts on the intended scope for the library?
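To make the packing idea concrete, here is a rough sketch of what I mean, under a lot of assumptions: a single-channel image, stride 1, no padding, and a packed layout of column panels `nr` wide stored row by row. The names `im2col_at` and `pack_b_im2col` are made up for illustration and are not part of matrixmultiply, and the crate's real packing routine may lay things out differently.

```rust
/// Element (row, col) of the implicit im2col matrix for a single-channel,
/// row-major image (stride 1, no padding). Rows index kernel taps,
/// columns index output pixels.
fn im2col_at(image: &[f32], w: usize, kw: usize, ow: usize, row: usize, col: usize) -> f32 {
    let (ky, kx) = (row / kw, row % kw);
    let (oy, ox) = (col / ow, col % ow);
    image[(oy + ky) * w + (ox + kx)]
}

/// Pack the implicit im2col matrix straight into column panels of width `nr`,
/// without materializing the full k x n matrix first. Each panel is stored
/// row by row, and the last panel is zero-padded on the right.
/// Hypothetical layout: an assumption for this sketch only.
fn pack_b_im2col(image: &[f32], h: usize, w: usize, kh: usize, kw: usize, nr: usize) -> Vec<f32> {
    assert_eq!(image.len(), h * w);
    assert!(kh >= 1 && kw >= 1 && kh <= h && kw <= w && nr >= 1);
    let (oh, ow) = (h - kh + 1, w - kw + 1);
    let (k, n) = (kh * kw, oh * ow);
    let panels = (n + nr - 1) / nr;
    let mut packed = vec![0.0; panels * k * nr];
    for p in 0..panels {
        for row in 0..k {
            for j in 0..nr {
                let col = p * nr + j;
                if col < n {
                    packed[(p * k + row) * nr + j] = im2col_at(image, w, kw, ow, row, col);
                }
            }
        }
    }
    packed
}
```

The only extra memory is the packed buffer itself; the full im2col matrix never exists. For a 3x3 image and a 2x2 kernel the implicit matrix is 4x4, so `pack_b_im2col(&image, 3, 3, 2, 2, 4)` fills a single 16-element panel.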

bluss commented 8 years ago

Hi, I'm very interested in seeing your code either way. I use SNB too, so I can't offer much testing on other hardware.

Threading is in scope for this library, but I'm unsure if rayon is a good fit. We're in a good situation to simply set up a fixed number of threads.
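To illustrate the fixed-thread idea, here is a minimal sketch using only the standard library (`std::thread::scope`, Rust 1.63+). It is an assumption-laden toy, not how this crate does or would do it: the function name `matmul_fixed_threads` is hypothetical, the matrices are plain row-major slices, and the inner loop is a naive triple loop rather than a packed kernel. The rows of C are split into contiguous blocks, one per thread.

```rust
use std::thread;

/// C = A * B for row-major matrices, with the rows of C split across a
/// fixed number of threads. A is m x k, B is k x n, C is m x n.
/// Hypothetical helper for illustration only.
fn matmul_fixed_threads(
    a: &[f32],
    b: &[f32],
    c: &mut [f32],
    m: usize,
    k: usize,
    n: usize,
    nthreads: usize,
) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    if m == 0 || n == 0 {
        return;
    }
    let nthreads = nthreads.max(1);
    // Ceiling division, so at most `nthreads` row blocks are created.
    let rows_per_thread = (m + nthreads - 1) / nthreads;
    thread::scope(|s| {
        // Each thread gets a disjoint block of rows of C, so no locking is needed.
        for (t, c_block) in c.chunks_mut(rows_per_thread * n).enumerate() {
            let row0 = t * rows_per_thread;
            s.spawn(move || {
                for (i, c_row) in c_block.chunks_mut(n).enumerate() {
                    let a_row = &a[(row0 + i) * k..(row0 + i + 1) * k];
                    for (j, out) in c_row.iter_mut().enumerate() {
                        *out = (0..k).map(|p| a_row[p] * b[p * n + j]).sum();
                    }
                }
            });
        }
    });
}
```

Because the work is divided statically up front, there is no work-stealing scheduler involved, which is the main difference from a rayon-based version.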

bluss commented 7 years ago

Closing since this is inactive (an inactivity that was started by me, I'm sorry!)