bheisler / criterion.rs

Statistics-driven benchmarking library for Rust
Apache License 2.0
4.52k stars, 301 forks

Normalizing results across different hardware? #684

Open Galgamins opened 1 year ago

Galgamins commented 1 year ago

Wanted to ask if this was viable.

We ship one simple reference benchmark with criterion itself.

Whenever someone runs criterion on their own Rust repo, the included benchmark runs first. The ratio between its result on the current machine and its result on the machine that produced the stored results is then used to scale the results of the benchmarks the user added, so that hardware differences are normalized away.

E.g.

1) I add a bench1 to my repo. Criterion comes with a bench0.
2) I run my benchmarks on Hardware A. bench0 takes 1ms on average, bench1 takes 10ms on average. I check these results into my repo.
3) Another person on my team pulls my changes, adds some of their own, and runs the benchmarks on Hardware B. bench0 takes 1.5ms on average, bench1 takes 25ms on average.
4) Since we know bench0 took on average 1ms on Hardware A and 1.5ms on Hardware B, we normalize the result of bench1 on Hardware A to 15ms. Then we compare it to the result of bench1 on Hardware B, which is 25ms, and say the benchmark has regressed by 10ms.
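
To make the arithmetic in the example concrete, here is a rough sketch of the scaling step. The function names `normalize_ms` and `regression_ms` are made up for illustration and are not part of criterion's API. It also assumes a single linear scale factor taken from bench0 is enough to describe the hardware difference, which is the premise of the idea and not something that has been verified.

```rust
/// Hypothetical helper, not a criterion function.
/// Scales a timing recorded on the baseline machine (Hardware A) to what we
/// would expect on the current machine (Hardware B), using the reference
/// benchmark (bench0) timings from both machines as the scale factor.
fn normalize_ms(baseline_ms: f64, ref_on_baseline_ms: f64, ref_on_current_ms: f64) -> f64 {
    baseline_ms * (ref_on_current_ms / ref_on_baseline_ms)
}

/// Hypothetical helper: difference between the timing actually measured on
/// the current machine and the normalized baseline. Positive means slower.
fn regression_ms(current_ms: f64, normalized_baseline_ms: f64) -> f64 {
    current_ms - normalized_baseline_ms
}

fn main() {
    // Numbers from the example: bench0 is 1ms on A and 1.5ms on B,
    // bench1 is 10ms on A and 25ms on B.
    let expected_on_b = normalize_ms(10.0, 1.0, 1.5);
    let delta = regression_ms(25.0, expected_on_b);
    println!("bench1 expected on B: {expected_on_b} ms, regression: {delta} ms");
}
```

With the numbers above this gives the 15ms normalized baseline and the 10ms regression from step 4.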