Closed philipc closed 2 years ago
Use structs like AvxF32x8 as a token to signify that target features have been checked.
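The token idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the module layout, the `new` constructor, the `add` method, `add_impl`, and `add_f32x8` are hypothetical names, and only the struct name `AvxF32x8` comes from the description above. The assumption is that the token has a private field, so the only way to obtain one is through a constructor that performs the runtime feature check.

```rust
// Hypothetical sketch: everything except the name `AvxF32x8` is an assumption.
#[cfg(target_arch = "x86_64")]
mod avx {
    use std::arch::x86_64::*;

    /// Zero-sized token. Holding a value of this type is evidence that
    /// AVX support was detected at runtime.
    #[derive(Clone, Copy)]
    pub struct AvxF32x8(());

    impl AvxF32x8 {
        /// The only way to construct the token; returns `None` without AVX.
        pub fn new() -> Option<Self> {
            if is_x86_feature_detected!("avx") {
                Some(AvxF32x8(()))
            } else {
                None
            }
        }

        /// Safe wrapper: the token's existence justifies the unsafe call.
        #[inline]
        pub fn add(self, a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
            // SAFETY: `self` can only exist if the AVX check succeeded.
            unsafe { add_impl(a, b) }
        }
    }

    /// Compiled with AVX enabled; must only run on CPUs that support it.
    #[target_feature(enable = "avx")]
    unsafe fn add_impl(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        let mut out = [0.0f32; 8];
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
        out
    }
}

/// Adds two 8-lane vectors, using the AVX path when a token can be obtained
/// and a scalar fallback otherwise.
pub fn add_f32x8(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if let Some(token) = avx::AvxF32x8::new() {
            return token.add(a, b);
        }
    }
    // Scalar fallback for other architectures or CPUs without AVX.
    std::array::from_fn(|i| a[i] + b[i])
}
```

With this shape the feature check happens once, when the token is created, and code that receives the token can call the `target_feature` functions through safe wrappers without repeating the check.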
Three of the four benchmarks improve by roughly 2.5% to 7%, possibly due to changes in inlining.
Note: this causes a 16-17% performance regression in forward_f64x4, which will be tracked as a separate issue.
forward_f32x8   time:   [30.348 ms 30.359 ms 30.370 ms]
                change: [-7.2267% -7.1551% -7.0925%] (p = 0.00 < 0.05)
                Performance has improved.
forward_f32x16  time:   [15.857 ms 15.863 ms 15.870 ms]
                change: [-2.6080% -2.5394% -2.4751%] (p = 0.00 < 0.05)
                Performance has improved.
forward_f64x4   time:   [58.340 ms 58.404 ms 58.485 ms]
                change: [+16.294% +16.865% +17.226%] (p = 0.00 < 0.05)
                Performance has regressed.
forward_f64x8   time:   [25.140 ms 25.145 ms 25.149 ms]
                change: [-4.0999% -3.8240% -3.5566%] (p = 0.00 < 0.05)
                Performance has improved.