christianhujer opened this issue 2 years ago (status: Open)
Thank you for your request.
I think the implementation of such a feature would require a LOT of special cases downstream to properly handle the absence of values.
Have you considered using something like `timeout` to limit the time (in combination with hyperfine's `--ignore-failure` option to ignore the non-zero exit code)? That would not show something like "aborted" or ∞, but it would run into the time limit and show that:
```
▶ hyperfine --ignore-failure -L time 1,5 'timeout 2 sleep {time}'
Benchmark 1: timeout 2 sleep 1
  Time (mean ± σ):      1.002 s ±  0.000 s    [User: 0.001 s, System: 0.002 s]
  Range (min … max):    1.001 s …  1.002 s    10 runs

Benchmark 2: timeout 2 sleep 5
  Time (mean ± σ):      2.001 s ±  0.000 s    [User: 0.002 s, System: 0.001 s]
  Range (min … max):    2.001 s …  2.002 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  'timeout 2 sleep 1' ran
    2.00 ± 0.00 times faster than 'timeout 2 sleep 5'
```
In general: why would you be interested in including benchmarks that would potentially run into a time limit?
see also: #106
To answer the question "why would I be interested in including benchmarks that would potentially run into a time limit?":

I am benchmarking a matrix of languages and programs automatically. From my `Makefile`:
```make
ALL:=$(patsubst %/,%,$(filter-out \
	asm-m68k-amiga-gasm/ \
	asm-m68k-amiga-masm/ \
	asm-m68k-amiga2-masm/ \
	Carbon/ \
	Concurnas/ \
	Logo/ \
	, $(wildcard */)))

.PHONY: hyperfine-roundtrip
hyperfine-roundtrip: hyperfine-roundtrip.csv

hyperfine-roundtrip.csv:
	hyperfine --export-csv hyperfine-roundtrip.csv -L variant $(shell echo $(ALL) | sed -e 's/ /,/g') -p 'make -C {variant} clean' 'make -sC {variant}'
```
I think you can see how well `hyperfine` works for this case. ❤️

Before `hyperfine`, my `Makefile` looked like this:
```make
# One shell per recipe, so the multi-line loop and the heredoc below work
# as written; the C-style for loop additionally requires bash.
.ONESHELL:
SHELL := /bin/bash

.PHONY: time-%
time-%:
	@for ((i = 0; i < 10; i++)); do
		$(MAKE) -s -C $* clean 2>&1
		start=$$(date -u +'%s%N')
		$(MAKE) -s -C $* >/dev/null 2>&1
		end=$$(date -u +'%s%N')
		echo '$*,'$$(($$end - $$start))
	done

time.csv:
	echo 'Language,time (ns)' >$@
	$(MAKE) -s time >>$@  # 'time' (not shown here) presumably runs time-% for each language

clean::
	$(RM) time.csv

time-processed.csv: time.csv
	sqlite3 >>$@ <<END
	.mode csv
	.import time.csv times
	-- trimmed mean per language: drop the fastest and slowest run, average the rest, convert ns to ms
	select "Language", ((1.0 * sum("time (ns)") - max("time (ns)") - min("time (ns)")) / (count("time (ns)") - 2.0)) / 1000000 as "Time (ms)" from times group by "Language" order by "Time (ms)";
	END
```
Using `timeout` as a wrapper will work from a functional perspective. But the measurement would no longer cover just the target program; it would cover `timeout` plus the target program. One would therefore have to benchmark `timeout` itself as well and subtract that value: first run a benchmark on `true`, then run a benchmark on `timeout true`, and subtract the former from the latter. That's why having this feature in `hyperfine` itself would be great.
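A rough sketch of that subtraction approach (the commands and the 10-second limit are illustrative, not a recommended procedure):

```sh
# Benchmark the bare no-op, then the same no-op behind timeout;
# the difference between the two means approximates timeout's own overhead.
hyperfine 'true'
hyperfine 'timeout 10 true'
```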
For a lot of purposes, `timeout` will work fine; this feature is not essential. It only matters where some of the results are so low/fast that the time it takes to run `timeout` (I guess 3-6 ms) makes a significant difference.

(I'm measuring roundtrip times of programming languages, and they can range from a few ms in Perl or Assembler to many seconds in Flix, and it also heavily depends on the problem statement.)
> For a lot of purposes, `timeout` will work fine; this feature is not essential. It only matters where some of the results are so low/fast that the time it takes to run `timeout` (I guess 3-6 ms) makes a significant difference.
Don't guess - measure :smile:
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `fd` | 12.2 ± 0.9 | 10.5 | 14.7 | 1.00 |
| `timeout 2s fd` | 12.9 ± 0.8 | 11.1 | 15.7 | 1.06 ± 0.10 |
The overhead seems to be below 1 ms.
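A table like this can be produced with an invocation along these lines (the exact flags used here are a guess; `--warmup` and `--export-markdown` are standard hyperfine options):

```sh
hyperfine --warmup 3 --export-markdown overhead.md 'fd' 'timeout 2s fd'
```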
For benchmarking, we already have `-P`, `--parameter-scan`, and its more flexible counterpart `-L`, `--parameter-list`, and that's great. As the example shows, we can use this to run `hyperfine` like this:
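A minimal sketch of such a parameterized invocation (the variant names and the build command are illustrative, not taken from the actual project):

```sh
hyperfine -L variant c,bash 'make -sC {variant}'
```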
I've now found a use case where a possibility to have `hyperfine` limit the benchmarking runs based on elapsed time could be useful.

For benchmarks where performance varies greatly, like between C and bash, it could occasionally be useful to present results as "aborted (took too long)" by having a `--max-time-per-run <TIME>` argument, for example `--max-time-per-run 2s`, that will automatically terminate a run and its associated benchmark when its runs take longer than `--max-time-per-run`. The values could be output as `∞`.
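A sketch of how the proposed option might be combined with a parameter list (`--max-time-per-run` does not exist in hyperfine today; this is only the syntax suggested above):

```sh
hyperfine --max-time-per-run 2s -L variant c,bash 'make -sC {variant}'
```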