Pick the best of multiple runs

BurntSushi / cargo-benchcmp

A small utility to compare Rust micro-benchmarks.

The Unlicense


Pick the best of multiple runs #25

Open bluss opened 8 years ago

bluss commented 8 years ago

One relatively simple way to paper over benchmark variability (for example, CPU warm-up effects) is to pick the best time from multiple runs. The tool could accept multiple input files for both the before and the after side.

Apanatshka commented 8 years ago

Ideally cargo bench would output info of all the runs that it does, then we can do this as well as use more rigorous statistics (#4). Sadly I don't currently have the bandwidth to contribute that option to cargo bench, nor do I quite understand the stabilisation story around the feature (which makes me hesitant to spend time on it at all).

But that's a general comment. More particular to your request: can you elaborate a little on when you'd want to paper over variability?

bluss commented 8 years ago

In a way it's just to run the benchmarks more times, to have more attempts at getting a stable timing.

I don't know if it helps you, but stable releases of Rust can run "cargo bench" if you configure the bench target not to use the default harness (harness = false in Cargo.toml) and supply a replacement benchmark framework. This is how the crate matrixmultiply produces cargo-benchcmp-compatible output using "cargo bench" with a stable Rust release.
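As a rough illustration of the replacement-harness idea (a minimal sketch, not the code matrixmultiply actually uses): with harness = false set for a [[bench]] target in Cargo.toml, "cargo bench" simply runs the file's own main function, which can print lines shaped like the default bench output. The names measure, format_line and group_thousands, the benchmark name sum_1000, and the sample counts are all made up for this example, and the exact spacing that cargo-benchcmp accepts is an assumption.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Times `f` over `samples` batches of `iters` calls each and returns
/// (median, max - min) of the per-iteration time in nanoseconds.
/// Assumes `samples > 0` and `iters > 0`.
fn measure<F: FnMut()>(mut f: F, samples: usize, iters: u32) -> (u64, u64) {
    let mut times: Vec<u64> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            for _ in 0..iters {
                f();
            }
            (start.elapsed().as_nanos() / iters as u128) as u64
        })
        .collect();
    times.sort_unstable();
    let median = times[times.len() / 2];
    let spread = times[times.len() - 1] - times[0];
    (median, spread)
}

/// Inserts thousands separators, e.g. 1234567 becomes "1,234,567".
fn group_thousands(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, c) in digits.chars().enumerate() {
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(c);
    }
    out
}

/// Formats one result line in the style of the default bench output.
/// The column width is an assumption made for this sketch.
fn format_line(name: &str, ns: u64, spread: u64) -> String {
    format!(
        "test {} ... bench: {:>11} ns/iter (+/- {})",
        name,
        group_thousands(ns),
        group_thousands(spread)
    )
}

fn main() {
    // Hypothetical workload: sum the first 1000 integers.
    let (ns, spread) = measure(
        || {
            black_box((0..black_box(1000u64)).sum::<u64>());
        },
        25,
        1_000,
    );
    println!("{}", format_line("sum_1000", ns, spread));
}
```

Redirecting the output of such a run to a file gives something that can be fed to cargo-benchcmp like any other saved bench output, provided the line format matches what the tool expects.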

Apanatshka commented 8 years ago

That is very interesting. I didn't know that was possible. I want to look into that then, but not sure when I'll have time. I'll try to at least look into this (and report on it) this year. Unless someone else beats me to it of course ;)

bluss commented 7 years ago

Here's how I solve this problem so far: to pick the best of multiple runs, I use a simple script that merges two or more benchmark files, keeping the best result for each benchmark.

Defeat noise and cpu freq scaling by running multiple times
I deal with deterministic, reproducible benchmarks. The best time, not the average, is interesting.

https://gist.github.com/bluss/d8d65ecb093fa324de77eb145e83cee8
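For readers who want the merging idea without opening the gist, here is a minimal sketch in Rust. It is not the linked script: parse_line and best_of are hypothetical names, the input is assumed to be saved "cargo bench" output in the default line format, and "best" is taken to mean the lowest ns/iter for each benchmark name.

```rust
use std::collections::BTreeMap;

/// Parses one bench result line and returns (name, ns per iteration).
/// Lines that are not bench results yield `None`.
fn parse_line(line: &str) -> Option<(String, u64)> {
    let rest = line.trim().strip_prefix("test ")?;
    let (name, tail) = rest.split_once(" ... bench:")?;
    let (ns_text, _) = tail.split_once("ns/iter")?;
    let ns = ns_text.trim().replace(',', "").parse::<u64>().ok()?;
    Some((name.trim().to_string(), ns))
}

/// For every benchmark name, keeps the whole line with the lowest ns/iter
/// across all runs. The result is sorted by benchmark name.
fn best_of<'a>(runs: &[&'a str]) -> Vec<&'a str> {
    let mut best: BTreeMap<String, (u64, &'a str)> = BTreeMap::new();
    for &run in runs {
        for line in run.lines() {
            if let Some((name, ns)) = parse_line(line) {
                let entry = best.entry(name).or_insert((ns, line));
                if ns < entry.0 {
                    *entry = (ns, line);
                }
            }
        }
    }
    best.into_values().map(|(_, line)| line).collect()
}

fn main() {
    // Two made-up runs of the same benchmarks.
    let run1 = "test foo ... bench:       1,200 ns/iter (+/- 30)\n\
                test bar ... bench:         500 ns/iter (+/- 5)";
    let run2 = "test foo ... bench:       1,100 ns/iter (+/- 45)\n\
                test bar ... bench:         510 ns/iter (+/- 7)";
    for line in best_of(&[run1, run2]) {
        println!("{}", line);
    }
}
```

The merged lines are emitted unchanged, so the result stays in the same format as the inputs and can be saved as a single "before" or "after" file for comparison.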