bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0

Use CARGO_CFG_TARGET_FEATURE to pick sgemm 8x8 if avx exists #17

Closed by bluss 7 years ago

bluss commented 7 years ago

In sgemm (f32), use the 8x8 kernel when AVX is enabled. (This leaves room for more platform-specific tuning.)
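As a rough illustration of the approach named in the title, a build script can read `CARGO_CFG_TARGET_FEATURE`, which Cargo sets for build scripts to the comma-separated list of enabled target features, and emit a cfg flag for the crate to test. This is a minimal sketch and not the code from this PR: the helper names `has_target_feature` and `sgemm_kernel_label` and the cfg name `sgemm_avx_8x8` are hypothetical.

```rust
// build.rs (illustrative sketch; names below are assumptions, not the PR's code)
use std::env;

/// Returns true if `name` is one of the entries in the comma-separated
/// feature list. Matching is exact, so "avx2" does not count as "avx".
fn has_target_feature(features: &str, name: &str) -> bool {
    features.split(',').any(|f| f.trim() == name)
}

/// Hypothetical helper: label of the f32 kernel that would be chosen.
fn sgemm_kernel_label(features: &str) -> &'static str {
    if has_target_feature(features, "avx") {
        "8x8"
    } else {
        "4x8"
    }
}

fn main() {
    // Cargo sets this variable when running build scripts; fall back to an
    // empty list if it is missing (e.g. when run outside Cargo).
    let features = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();

    if sgemm_kernel_label(&features) == "8x8" {
        // Hypothetical cfg name; the crate could then gate the kernel with
        // #[cfg(sgemm_avx_8x8)].
        println!("cargo:rustc-cfg=sgemm_avx_8x8");
    }
    println!("cargo:rerun-if-changed=build.rs");
}
```

With a setup like this, building with `RUSTFLAGS="-C target-cpu=native"` on an AVX-capable machine, or with `-C target-feature=+avx`, would select the 8x8 path at compile time. The choice is static, so a binary built without AVX enabled keeps the 4x8 kernel even on AVX hardware.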

Improvement from the 4x8 to the 8x8 sgemm kernel with AVX enabled:

 name                     old-f32 ns/iter  new-f32 ns/iter  diff ns/iter   diff % 
 mat_mul_f32::m004        110              103                        -7   -6.36% 
 mat_mul_f32::m005        162              117                       -45  -27.78% 
 mat_mul_f32::m006        170              136                       -34  -20.00% 
 mat_mul_f32::m007        191              148                       -43  -22.51% 
 mat_mul_f32::m008        211              163                       -48  -22.75% 
 mat_mul_f32::m009        371              346                       -25   -6.74% 
 mat_mul_f32::m012        484              461                       -23   -4.75% 
 mat_mul_f32::m016        702              605                       -97  -13.82% 
 mat_mul_f32::m032        3,513            3,013                    -500  -14.23% 
 mat_mul_f32::m064        20,804           18,757                 -2,047   -9.84% 
 mat_mul_f32::m127        143,522          124,790               -18,732  -13.05% 
 mat_mul_f32::m256        1,029,880        904,001              -125,879  -12.22%