bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0

Updated comment in function kernel_x86_avx #68

Closed Tastaturtaste closed 2 years ago

Tastaturtaste commented 2 years ago

While looking through the code, I noticed that the comment explaining why the permutations are done a certain way with intrinsics did not match what the code actually does. Specifically, the comment listed the permutation set {0123, 1032, 2301, 3012}, while the code actually uses {0123, 1032, 3210, 2301}.
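To illustrate why the two sets are not interchangeable, here is a minimal sketch in plain Rust (no intrinsics). It assumes that the kernel multiplies lane `i` of one vector by lane `p[i]` of each permuted copy of the other, so a usable set must select every source lane exactly once per lane position. That reading of the kernel, and the helper name `covers_all_pairs`, are assumptions made for illustration and are not taken from the crate.

```rust
/// Returns true if, for every lane position, the four permutations in `set`
/// together select each of the four source lanes (so none is missing).
/// NOTE: hypothetical helper for illustration, not part of matrixmultiply.
fn covers_all_pairs(set: &[[usize; 4]; 4]) -> bool {
    (0..4).all(|lane| {
        let mut seen = [false; 4];
        for perm in set {
            seen[perm[lane]] = true;
        }
        seen.iter().all(|&s| s)
    })
}

fn main() {
    // Set listed in the old comment: {0123, 1032, 2301, 3012}.
    let old_comment = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 0, 1, 2]];
    // Set the code actually uses: {0123, 1032, 3210, 2301}.
    let actual = [[0, 1, 2, 3], [1, 0, 3, 2], [3, 2, 1, 0], [2, 3, 0, 1]];

    println!("old comment set covers all pairs: {}", covers_all_pairs(&old_comment));
    println!("actual set covers all pairs:      {}", covers_all_pairs(&actual));
}
```

Under this assumption, the set from the old comment fails the check because lane position 1 selects source lane 0 twice (from 1032 and 3012) and never selects lane 2, whereas the set the code uses passes.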

In addition, the comment mentioned alternative selections, some of which were incorrect with respect to what the code actually does. I updated them and verified their correctness against the Intel docs as well as with tests on Godbolt.
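A check of this kind can also be approximated with a scalar model. The sketch below assumes the permutations are built from two primitive moves on a four-lane vector: swapping neighbouring lanes inside each 128-bit half, and swapping the two 128-bit halves. Which intrinsics the kernel uses for these moves is not stated here, so the function names are hypothetical and the model only mimics lane movement.

```rust
/// Swap neighbouring lanes inside each 128-bit half: 0123 -> 1032.
/// NOTE: hypothetical scalar stand-in for an in-lane shuffle.
fn swap_within_halves(v: [usize; 4]) -> [usize; 4] {
    [v[1], v[0], v[3], v[2]]
}

/// Swap the two 128-bit halves: 0123 -> 2301.
/// NOTE: hypothetical scalar stand-in for a cross-lane permute.
fn swap_halves(v: [usize; 4]) -> [usize; 4] {
    [v[2], v[3], v[0], v[1]]
}

fn main() {
    let identity = [0, 1, 2, 3];
    let in_lane = swap_within_halves(identity);
    let cross_lane = swap_halves(identity);
    let both = swap_halves(in_lane);

    // Print each permutation as a compact digit string such as "1032".
    for perm in [identity, in_lane, both, cross_lane] {
        let s: String = perm.iter().map(|d| d.to_string()).collect();
        println!("{s}");
    }
}
```

The two moves and their composition yield exactly 0123, 1032, 3210 and 2301. A rotation such as 3012 cannot be reached with these two moves alone, which is consistent with it not appearing in the set the code uses.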

bluss commented 2 years ago

thanks!