sharkdp / hyperfine

A command-line benchmarking tool
Apache License 2.0
20.69k stars 333 forks source link

Added option to manually specify a reference to compare the results to. #744

Closed JanEricNitschke closed 15 hours ago

JanEricNitschke commented 2 months ago

I was interested in the functionality discussed in https://github.com/sharkdp/hyperfine/pull/579 and https://github.com/sharkdp/hyperfine/issues/577.

Sadly the former PR has been stale for over a year.

So i have tried to effectively take the work from there, make it fit with the changes that have happend since then and also address the comments in the PR.

There are currently still two points that i am myself unsure about.

I have used the standard std::cmp::Ordering for the ordering compared to the reference. However, i feel that the naming is not actually clear at a first glance. Namely that Ordering::Less corresponds to a faster time (the runtime is less). It might be worth to create a distinct enum that maps to this (converts from) as suggested in the original PR review.

I am also not sure if the statements faster than and slower than are the best. Particularly the latter feels a bit unintuitive to me. Maybe something like 0.2 times as fast would be better than 5.0 times slower?

sharkdp commented 1 week ago

Thank you very much for your contribution. Sorry for the delayed review.

This looks very good from a first glance. How does this new --reference option interplay with parametrized benchmarks?

Also, please consider creating (integration) tests for this new option.

JanEricNitschke commented 1 week ago

Thanks for having a look.

The reference is specified separately from all other options, so also separately from a scan and has to be specified fully.

$ hyperfine --reference "sleep 0.4" --parameter-scan delay 0.3 0.7 -D 0.2 'sleep {delay}'
Benchmark 1: sleep 0.4
  Time (mean ± σ):     430.1 ms ±   1.3 ms    [User: 0.0 ms, System: 19.1 ms]
  Range (min … max):   426.9 ms … 431.3 ms    10 runs

Benchmark 2: sleep 0.3
  Time (mean ± σ):     321.3 ms ±   1.0 ms    [User: 2.9 ms, System: 9.0 ms]
  Range (min … max):   319.8 ms … 322.9 ms    10 runs

Benchmark 3: sleep 0.5
  Time (mean ± σ):     523.5 ms ±   1.2 ms    [User: 2.9 ms, System: 10.5 ms]
  Range (min … max):   521.6 ms … 525.1 ms    10 runs

Benchmark 4: sleep 0.7
  Time (mean ± σ):     726.4 ms ±   1.7 ms    [User: 0.0 ms, System: 13.1 ms]
  Range (min … max):   722.8 ms … 728.3 ms    10 runs

Summary
  sleep 0.4 ran
    1.34 ± 0.01 times slower than sleep 0.3
    1.22 ± 0.00 times faster than sleep 0.5
    1.69 ± 0.01 times faster than sleep 0.7

Will have a look at addings tests. Thanks again!

JanEricNitschke commented 1 week ago

Currently working on the tests. Was thinking about how --prepare, --conclude and --command-name should work.

Currently i have modified it so that prepare and conclude affect the reference and the number needed is (n_commands + 1/0).

For command name the easiest would be to not affect the reference, but instead add a separate option --reference-name.