bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0

Use slice in packing function for noalias optimization #74

Closed bluss closed 1 year ago

bluss commented 1 year ago

Using a reference type (such as a slice) for either pack or a in the packing function makes rustc emit a noalias annotation for that pointer, and that helps the optimizer in some cases.

What we want is for the compiler to see that the pointers pack and a, and any pointers derived from them, can never alias; it then has more freedom to rewrite the operations in the packing loops. The pack buffer is contiguous, so it is the only one of the two arguments that can be passed as a slice.
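As a rough illustration of the idea (not the crate's actual packing code), the sketch below takes the destination as a `&mut [f32]` and leaves the strided source as a raw pointer. The function name `pack_sketch` and the parameters `rows`, `cols`, `rsa` and `csa` are hypothetical, chosen only for this example. Because `pack` is a mutable reference, rustc can mark its data pointer `noalias`; whether that changes the generated code depends on the target and the loop.

```rust
/// Hypothetical packing routine, for illustration only.
///
/// `pack` is a slice, so its data pointer is a candidate for a `noalias`
/// annotation. `a` stays a raw pointer because the source matrix is
/// addressed through general row/column strides (`rsa`, `csa`).
///
/// Safety: the caller must guarantee that `a.offset(i * rsa + j * csa)`
/// is valid to read for every `i < rows` and `j < cols`, and that the
/// source does not overlap `pack`.
pub unsafe fn pack_sketch(
    pack: &mut [f32],
    a: *const f32,
    rows: usize,
    cols: usize,
    rsa: isize,
    csa: isize,
) {
    // The destination is contiguous, so one length check covers all writes.
    assert!(pack.len() >= rows * cols);
    let mut p = 0;
    for j in 0..cols {
        for i in 0..rows {
            // Strided read from the source, contiguous write to the buffer.
            let offset = i as isize * rsa + j as isize * csa;
            pack[p] = unsafe { *a.offset(offset) };
            p += 1;
        }
    }
}

/// Example use: pack a 2x3 row-major matrix column by column.
pub fn pack_row_major_example() -> [f32; 6] {
    let a = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    let mut pack = [0.0f32; 6];
    // Row stride 3, column stride 1: every offset stays inside `a`.
    unsafe { pack_sketch(&mut pack, a.as_ptr(), 2, 3, 3, 1) };
    pack
}
```

The source `a` cannot take the same route here, since a matrix with arbitrary strides is not in general a single contiguous slice.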

This is shown to slightly speed up the layout benchmarks for sgemm, but not dgemm, on M1. No effect was noticed on x86-64.

A way to get the same effect without a slice, such as something like a 'restrict' keyword, would be good for this crate.