Closed ivoanjo closed 2 years ago
Huh! Out of curiosity, benchmark-ips also isn't very happy if the test time is too short:
require 'bundler/inline'
gemfile do
source 'https://rubygems.org'
gem 'benchmark-ips'
end
Benchmark.ips do |x|
x.config(:warmup => 1, :time => 0.000000001)
x.report("addition") { 1 + 2 }
x.compare!
end
result:
$ ruby bench-ips-testcase.rb
Warming up --------------------------------------
addition 2.741M i/100ms
Calculating -------------------------------------
additionTraceback (most recent call last):
11: from bench-ips-testcase.rb:9:in `<main>'
10: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips.rb:53:in `ips'
9: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:254:in `run'
8: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:254:in `times'
7: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:255:in `block in run'
6: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:312:in `run_benchmark'
5: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:312:in `each'
4: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:351:in `block in run_benchmark'
3: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:367:in `create_stats'
2: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/job.rb:367:in `new'
1: from /Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/ips/stats/sd.rb:11:in `initialize'
/Users/ivo.anjo/.rvm/gems/ruby-2.7.5/gems/benchmark-ips-2.9.2/lib/benchmark/timing.rb:12:in `mean': undefined method `/' for nil:NilClass (NoMethodError)
To add context on why anyone would want to run a "benchmark" with low warmup (and low time), it's because we run benchmarks separately from our usual test suite, and it happened in the past that we broke our benchmarks but CI was still green.
So now we run the benchmarks for a really short time in CI just to validate that they're working, and get faster feedback when we broke them.
That warmup is 0.1ns and the time 1ns, it's basically impossible to measure such small times, i.e., clock_gettime actually takes longer to run. Do you still get the error if you use something more reasonable like 1ms?
Ah from the linked PR you actually used 1ms. Then it might be more realistic to fix this, although I don't remember the details. Not sure how it's possible we get 0 samples, should be at least one.
Yup, we were using 1ms, I'm guessing because of noisy neighbors (container inside VM inside possibly-shared-CI-machine) we got unlucky from time to time.
I reduced it a lot in the example to trigger it with certainty.
I suspect the issue may come from https://github.com/evanphx/benchmark-ips/blob/ecc3a7d3b50576684073f55ac3f7e8efdd51e204/lib/benchmark/ips/job.rb#L275-L281 -- if target
is small enough, then by the time it gets to the check, time's over, and the loop never runs.
Makes sense to fix. Agreed that "we should get at least one sample" is a common-sense fix here.
That looks like a bug in the code I added there, whoops. I can try to make a PR but I'm not sure when I'll have time. If anyone else want to fix it I'm happy to review.
Also for your use case and with that fix, x.config(warmup: 0, time: 0)
is probably the best because that runs a single iteration, a non-zero warmup means at least 2 iterations total.
Also for your use case and with that fix, x.config(warmup: 0, time: 0) is probably the best because that runs a single iteration, a non-zero warmup means at least 2 iterations total.
👍 Thanks for the tip and thanks for the fix :)
Hello there, and as always, thanks for the amazing job on benchmark-ips!
I noticed this paper cut on one of the tests we do on Datadog's ddtrace gem: if the warm up time is "small enough", we can trigger a division by
0.0
.Here's an extreme example:
And the result:
I ran into this on a test where we didn't need a lot of warmup, but did not disable it either -- only set it low-ish enough that it ended up working most of the time, but of course, caused a flaky test in CI.
The correct solution in our case is to disable the warmup (which I'll do) but I thought I'd report it anyway :)