bheisler / iai

Experimental one-shot benchmarking/profiling harness for Rust
Apache License 2.0

Use hardware performance counters instead of cachegrind #11

Open asomers opened 3 years ago

asomers commented 3 years ago

Iai is very exciting! I love the idea of benchmarks that are fast and deterministic. But relying on Cachegrind has some drawbacks:

Modern CPUs contain hardware performance counters that can be used for nearly zero-cost profiling. Using those instead of Cachegrind would have several benefits:

On FreeBSD, pmc(3) provides access to the counters, and there is already a nascent Rust crate for them: pmc-rs. On Linux, I think the perfcnt and perf crates provide the same functionality.

shepmaster commented 3 years ago

I think that https://github.com/jbreitbart/criterion-perf-events is an attempt to do that.

asomers commented 3 years ago

Cool! Thanks for the tip.

bheisler commented 3 years ago

Yes, if that's what you want I would recommend using the criterion-perf-events plugin.

Cachegrind is used specifically for its emulation of the memory hierarchy. Because we can control the parameters of that emulation, Iai can take measurements under Cachegrind that should be far more repeatable and consistent between machines than is possible even with performance counters. Hardware performance counter readings will naturally differ from one piece of hardware to another.
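To illustrate what "controlling the parameters of that emulation" can look like, here is a minimal sketch that pins Cachegrind's simulated cache geometry through its `--I1`, `--D1` and `--LL` options, so every machine simulates the same hierarchy. The `CacheSpec` type, both helper functions, and the sizes used in the usage note are hypothetical and chosen for illustration; this is not taken from Iai's source.

```rust
use std::process::Command;

/// Hypothetical description of one simulated cache level.
#[derive(Clone, Copy)]
struct CacheSpec {
    size_bytes: u64,
    associativity: u32,
    line_bytes: u32,
}

impl CacheSpec {
    /// Formats the spec as "<size>,<associativity>,<line size>",
    /// the form Cachegrind's --I1/--D1/--LL options expect.
    fn to_arg(self) -> String {
        format!("{},{},{}", self.size_bytes, self.associativity, self.line_bytes)
    }
}

/// Builds Cachegrind options that fix the simulated memory hierarchy,
/// instead of letting Valgrind derive it from the host CPU.
fn cachegrind_args(l1: CacheSpec, ll: CacheSpec, out_file: &str) -> Vec<String> {
    vec![
        "--tool=cachegrind".to_string(),
        // Request cache simulation explicitly; newer Valgrind releases
        // do not enable it by default.
        "--cache-sim=yes".to_string(),
        format!("--I1={}", l1.to_arg()),
        format!("--D1={}", l1.to_arg()),
        format!("--LL={}", ll.to_arg()),
        format!("--cachegrind-out-file={}", out_file),
    ]
}

/// Assembles (but does not run) the `valgrind` invocation for a benchmark binary.
fn cachegrind_command(benchmark_exe: &str, args: &[String]) -> Command {
    let mut cmd = Command::new("valgrind");
    cmd.args(args).arg(benchmark_exe);
    cmd
}
```

With, say, a 32 KiB 8-way L1 and an 8 MiB 16-way last-level cache (both with 64-byte lines), `cachegrind_args` yields `--I1=32768,8,64` and `--LL=8388608,16,64`. Calling `.status()` on the resulting `Command` would run the benchmark under that fixed simulation, assuming `valgrind` is installed and on the `PATH`.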

In addition, access to the underlying hardware's performance counters is commonly disabled under virtualization, so that approach is not without drawbacks either. I know this is the case because the VM I do my work in at my day job has its performance counters disabled for mysterious IT-department reasons.