ioquatix closed this issue 4 years ago
Another interesting idea would be to run the benchmark with times=0
to compute the overhead and subtract that from subsequent test runs.
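A rough sketch of that idea, outside of benchmark-ips: time an empty loop to estimate the harness overhead, then subtract it from a real run. The `time_loop` helper and the `Math.sqrt` workload are stand-ins invented for illustration, not library API.

```ruby
require "benchmark"

REPEATS = 100_000

# Time `repeats` iterations of the given block with a bare while loop.
def time_loop(repeats)
  Benchmark.realtime do
    index = 0
    while index < repeats
      yield
      index += 1
    end
  end
end

overhead = time_loop(REPEATS) { }              # loop machinery only ("times=0" work)
measured = time_loop(REPEATS) { Math.sqrt(2) } # loop plus the real work
net      = measured - overhead                 # cost attributable to the work itself
puts format("overhead: %.6fs, net: %.6fs", overhead, net)
```

In practice the subtraction is noisy for very cheap workloads, which is part of why benchmark-ips does its own calibration instead.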
@ioquatix I tend to set up the connection outside my benchmarks, so there is as little setup inside my loop as possible.
Is it possible for you to do something like that?
@kbrock thanks for your reply.
This was my ultimate conclusion - trying to do setup work throws the system off too much. I think we can close this issue.
That being said, for further discussion:
Setup is still part of the overall cost.
It also makes the comparisons less isolated because setup is done outside of the benchmark. In some ways, it makes sense if you just want to compare the operational overhead, but if you want to compare the full cost of two different implementations, it's tricky.
I did try using plain Benchmark with fixed repetitions, and I think maybe for my use case that makes more sense, w.r.t. testing setup as well as overhead. But it lacks the iterations/second readout, which is pretty useful, and the formatting is less pretty, there's no warmup phase, etc.
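For reference, the fixed-repetition approach with the stdlib `Benchmark` module could look roughly like this. The workloads are placeholders; the point is that every variant runs the same number of repeats, so setup cost stays proportionally comparable.

```ruby
require "benchmark"

REPEATS = 10_000

results = Benchmark.bm(12) do |bm|
  bm.report("fast setup") do
    state = Array.new(10) { |i| i }       # stand-in for cheap setup (~N)
    REPEATS.times { state.sum }
  end
  bm.report("slow setup") do
    state = Array.new(10_000) { |i| i }   # stand-in for expensive setup (~2N)
    REPEATS.times { state.first }
  end
end
```

`Benchmark.bm` returns an array of `Benchmark::Tms` timings, one per report, but it gives you raw seconds rather than iterations/second.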
@ioquatix yea. I've seen some nice benchmarks that show the before warmup numbers and the after warmup numbers. So you can get an idea if a cache is used, or connection pooling. Very nice stuff.
This does provide the warmup phase, but isn't set up to record/graph the warmup.
One way you could do it is inject into the benchmark some "timer" object, e.g.
```ruby
x.benchmark do |repeats, measure|
  client = connect_to_server
  measure.setup! # measure the time it took to set up
  index = 0
  while index < repeats
    client.get(...)
    measure.warmup
    index += 1
  end
end
```
There are other ways to do this, maybe with a better interface. But that's roughly how it could work.
Another proposal/idea:
```ruby
x.benchmark(timer: true) do |timer|
  client = connect_to_server
  timer.capture do |repeats|
    index = 0
    while index < repeats
      client.get(...)
      index += 1
    end
  end
end
```
You could capture multiple blocks, e.g.
```ruby
x.benchmark(timer: true) do |timer|
  client = connect_to_server
  timer.capture("get") do |repeats|
    index = 0
    while index < repeats
      client.get(...)
      index += 1
    end
  end
  timer.capture("post") do |repeats|
    index = 0
    while index < repeats
      client.post(...)
      index += 1
    end
  end
end
```
but maybe that's getting too complicated.
```ruby
x.benchmark(timer: true) do |timer|
  client = connect_to_server
  timer.capture do |repeats|
    index = 0
    while index < repeats
      client.get(...)
      index += 1
    end
  end
end
```
is basically
```ruby
client = connect_to_server
x.benchmark(timer: true) do |timer|
  timer.capture do |repeats|
    index = 0
    while index < repeats
      client.get(...)
      index += 1
    end
  end
end
```
and while that is not capturing the setup (`connect_to_server`), it is basically what I tend to do.
And yes, it does not capture the amount of time spent on warmup.
But this is called "iterations per second", so it does lose the nuance of warming up and getting up to speed.
I agree with you, and I realise my proposals are adding more complexity without a huge gain in functionality. The main point is to keep all the benchmark code within one block, to avoid leaking state. It gets a bit ugly when you have a ton of setup code, especially if different benchmarks share similar concepts, because it's not clear where one benchmark ends and the next one starts. It all gets mixed up, and variable names are used to untangle it at the top-level scope.
@ioquatix Maybe you misunderstood. The `timer` block just provides a reference to the `capture` method. Moving a method into that block or keeping it out is fine / does the same thing. The code in that block is there to set up the benchmark. Any code you put in there will be run before the benchmarks actually begin.
I get it. It's just a shared namespace for all benchmarks, so if I have two very similar things but different implementations I have to call them e.g. `client_implementation_x` and `client_implementation_y` so they don't clash, and it just gets a bit messy.
@eregon this is why the warmup cycles need to be more than one.
I think it's good enough to take the setup outside of the `report` block here.
And so I'd suggest closing this issue; moving the setup outside the `report` block is the recommended approach.
If you want to isolate local variables, you can always use an extra lambda as a scope:
```ruby
Benchmark.ips do |x|
  -> {
    client = connect_to_server
    x.report "mybench" do
      client.get(...)
    end
  }.call
end
```
I'm interested in comparing two (or more) implementations of the same thing.
There is some "one off" setup cost (establishing the connection) and then the repeated `times` cost, which is what I'm actually interested in. For one benchmark, the setup cost might be N and for the other, 2N. The iteration cost is largely the same. However, what ends up happening is that for the benchmark with N setup overhead, the `times` repeat count is much larger than for the benchmark with 2N overhead. This makes the effect of the setup even more pronounced, because benchmark-ips will set `times` to, say, 80 for the case of 2N setup cost and 500 for the case of N setup cost. Ultimately, it makes the second case look much better even though the difference is mostly in the setup overhead.
Is there some way to take this bias into account? My initial thoughts were to use the upper bound for `times` (or perhaps the average) so that each benchmark would be running with largely the same proportion of setup overhead to number of `times`.
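A minimal sketch of that idea, assuming one shared repeat count is picked up front so both implementations amortise their setup over the same number of iterations. The `measure` helper and the `sleep`-based workloads are hypothetical stand-ins, not benchmark-ips API.

```ruby
require "benchmark"

SHARED_REPEATS = 50_000 # one shared upper bound instead of per-benchmark autodetection

# Run the one-off setup inside the measured block, then a fixed repeat
# count, and report iterations/second including the setup cost.
def measure(name, repeats)
  elapsed = Benchmark.realtime do
    state = yield             # one-off setup (stand-in for connect_to_server)
    repeats.times { state }   # repeated cost (stand-in for client.get)
  end
  ips = repeats / elapsed
  puts format("%-8s %12.0f i/s", name, ips)
  ips
end

a = measure("impl A", SHARED_REPEATS) { sleep 0.001; :client_a } # setup cost ~N
b = measure("impl B", SHARED_REPEATS) { sleep 0.002; :client_b } # setup cost ~2N
```

Because both runs use the same `repeats`, the setup contributes the same absolute share to each total, so the comparison is no longer skewed by benchmark-ips picking a different `times` per benchmark.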