Closed lucab closed 5 months ago
On my Linux laptop with an AMD Ryzen 7, this shows a small but reliable micro-benchmark improvement (roughly 7% less time per iteration):
        Finished release [optimized] target(s) in 16.03s
         Running benches/unmask.rs (target/release/deps/unmask-c8256fdf566f5dbf)
    unmask2/unmask 64 << 20 time:   [2.3568 ms 2.3596 ms 2.3641 ms]
                            thrpt:  [26.437 GiB/s 26.488 GiB/s 26.519 GiB/s]
                     change:
                            time:   [-8.4860% -7.2746% -6.0839%] (p = 0.00 < 0.05)
                            thrpt:  [+6.4780% +7.8453% +9.2729%]
                            Performance has improved.
    Found 6 outliers among 100 measurements (6.00%)
      1 (1.00%) high mild
      5 (5.00%) high severe
This tweaks the `unmask_easy` logic to use an enumerating iterator directly. That lets the compiler generate better machine code by performing internal iteration directly on the input slice.
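
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of change described above. It is not the code from this PR: the function names, the `[u8; 4]` key parameter, and the WebSocket-style XOR masking are assumptions made for illustration. It contrasts an index-based loop with an enumerating iterator over the slice.

```rust
/// Hypothetical "before" shape: walk an index range and index into the slice.
fn unmask_indexed(buf: &mut [u8], key: [u8; 4]) {
    for i in 0..buf.len() {
        buf[i] ^= key[i % 4];
    }
}

/// Hypothetical "after" shape: enumerate the slice's own iterator, so each
/// step yields the position together with a mutable reference to the byte.
fn unmask_enumerate(buf: &mut [u8], key: [u8; 4]) {
    for (i, byte) in buf.iter_mut().enumerate() {
        *byte ^= key[i % 4];
    }
}

fn main() {
    // Masking key and payload taken from the masked "Hello" example in RFC 6455.
    let key = [0x37, 0xfa, 0x21, 0x3d];

    let mut a = b"Hello".to_vec();
    let mut b = a.clone();
    unmask_indexed(&mut a, key);
    unmask_enumerate(&mut b, key);

    // Both variants must produce identical bytes.
    assert_eq!(a, b);
    assert_eq!(b, [0x7f, 0x9f, 0x4d, 0x51, 0x58]);

    // XOR masking is its own inverse: applying it again restores the payload.
    unmask_enumerate(&mut b, key);
    assert_eq!(b, b"Hello");
}
```

Both functions compute the same result. Whether the iterator form is actually faster depends on the compiler version and target, so a change like this should be confirmed with a benchmark, as done above.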