philipc closed 2 years ago
Also includes a fix for the C++ inlining, which reduces the difference from the Rust version:
align_i32x8 time: [32.851 ms 32.887 ms 32.928 ms]
change: [-13.771% -13.214% -12.835%] (p = 0.00 < 0.05)
Performance has improved.
align_i32x16 time: [27.418 ms 27.450 ms 27.483 ms]
change: [-4.7023% -4.5465% -4.3773%] (p = 0.00 < 0.05)
Performance has improved.
Notably, this includes a CPU feature fix that significantly improves the AVX-512 implementation.
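The description does not say what the CPU feature fix actually was, so the following is only a generic illustration of why feature selection matters for SIMD code: a function compiled without the right target feature enabled cannot use the wider registers, however the source is written. This is a minimal sketch, assuming runtime detection in Rust. The names `sum`, `sum_avx2`, and `sum_scalar` are hypothetical and do not come from this PR, and AVX2 stands in for AVX-512 to keep the example small.

```rust
/// Portable fallback, compiled with the crate's baseline target features.
fn sum_scalar(xs: &[i32]) -> i32 {
    let mut acc = 0i32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Same loop, but compiled with AVX2 enabled for this one function, so the
/// compiler is allowed (not guaranteed) to vectorize it with 256-bit registers.
/// Hypothetical helper for illustration only.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[i32]) -> i32 {
    let mut acc = 0i32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Picks the AVX2 build when the running CPU supports it, else the fallback.
pub fn sum(xs: &[i32]) -> i32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_scalar(xs)
}
```

Calling `sum(&[1, 2, 3])` gives the same result on either path; only the generated instructions differ. If the feature check or the `target_feature` attribute names the wrong feature, the wide path is either never taken or compiled without the intended instructions, which is one way a feature mix-up can hold back a SIMD implementation.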