Add seed parameter - Githubissues

gustaphe commented 9 months ago

Partial solution for #201 . Since it needs to be serializable, I opted for just a seed number. seed! is not super well documented, so Int may not be the perfect type, but it's pretty handy. Negative seeds appear to be invalid, so I could use that as a stand-in for nothing, as Union{Int, Nothing} ruined serialization.

[x] PR
[x] Complete tests
[x] Documentation

gustaphe commented 9 months ago

[x] Don't rely on 1.7 kwarg syntax

(I thought about this and forgot to fix it)

gustaphe commented 9 months ago

There is a problem in that this only works if the benchmarks take the same number of samples. I don't know if this should be enforced or marked as a caveat: "The same seed guarantees the same sequence of random numbers, but the benchmarks may still pick an unequal number of random numbers".

In practice, consistency increases even if it's not guaranteed to be perfect.

gdalle commented 9 months ago

Probably a dumb question, but why not give a seed to each benchmarkable instead?

gustaphe commented 9 months ago

I guess one could. But the seed needs to be set before an entire benchmark run, not for each sample. And if it's not for a group you can just run seed! before running the benchmark.

An alternative is to add a hook to run, like run(group; setup=(seed!(1234))) - I don't have a strong opinion.

gdalle commented 9 months ago

Given this recent Discourse thread, I think a seed for each benchmarkable would also make sense:

https://discourse.julialang.org/t/making-benchmark-outputs-statistically-meaningful-and-actionable/104256/

What do you think @gustaphe ?

gustaphe commented 9 months ago

I didn't think that would work, but it took me like a second to convince myself it would. If nobody's made a PR like that by Saturday I will

gdalle commented 9 months ago

Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

gustaphe commented 9 months ago

Wait, isn't it already possible to set the seed in the setup code of each benchmarkable?

No. If you set the seed in the setup, it gets reset for every sample, not just at the start of the benchmark run. Consider

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1234))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  1.531 μs … 48.249 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.739 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.836 μs ±  1.147 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▃▆██▅▄▃▁
  ▂▂▂▃▃▅▇█████████▇▆▅▄▄▃▃▃▃▃▃▃▃▂▃▂▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▂▁▂▁▁▁▂ ▃
  1.53 μs        Histogram: frequency by time        2.57 μs <

 Memory estimate: 192 bytes, allocs estimate: 5.

vs

julia> b = @benchmarkable sleep(rand([0, 0.5])) setup=(Random.seed!(1235))
Benchmark(evals=1, seconds=5.0, samples=10000)

julia> r = run(b)
BenchmarkTools.Trial: 10 samples with 1 evaluation.
 Range (min … max):  501.700 ms … 502.117 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     501.726 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   501.802 ms ± 156.542 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █           ▃
  █▇▇▁▇▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▇ ▁
  502 ms           Histogram: frequency by time          502 ms <

 Memory estimate: 192 bytes, allocs estimate: 5.

each of these runs only ever use a single result from rand. What you want is a series of random numbers, but in a repeatable sequence.

gustaphe commented 9 months ago

I would guess that the most useful use of this parameter is supplying it as run(::BenchmarkGroup; seed=something), but this really looks like the best implementation.

rfourquet commented 8 months ago

I'm not clear on the details here, but instead of asking for a seed, why not copy the state of the global RNG at the beginning rng_copy = copy(Random.default_rng()), and then before each benchmark run, restore this RNG state (copy!(Random.default_rng(), rng_copy). This is way faster than seeding the RNG, and could even be done by default. If the use then wants to try a specific seed, then seed!(seed) can be done manually at the beginning.

EDIT: this is what @testset does BTW.

Seelengrab commented 8 months ago

I'd go one step further and let the user provide the entire RNG object they want to use, defaulting to a state-resetting copy of the default RNG.

gdalle commented 8 months ago

@gustaphe what do you think of this proposal?

gustaphe commented 8 months ago

I've been waiting for a good time to read up on this/try it out but haven't found it. The obstacles I see are:

Serialization - as is, benchmark objects are serialized, by some method I don't quite understand, so using an integer worked out-of-the-box but the other methods (possibly even the suggestion of using nothing as a sentinel) don't.
Is there a good way to supply an RNG object and have it reset every time? Because it's no use if this is just an RNG that is set once and then evolves throughout the experiment.

Like I said, I don't know if these are reasonable objections, I just haven't had time to think carefully about them.

Seelengrab commented 8 months ago

Serializing an e.g. Xoshiro should be perfectly possible, it just needs to serialize the internal state & then reconstruct the RNG object from that state. I'm kind of surprised this isn't already implemented in Base. Resetting it every time means just copying the existing RNG object before the benchmark, and using the copy for the benchmark. I'd definitely make that toggleable though.

JuliaCI / BenchmarkTools.jl

Add seed parameter #335