I want to benchmark a search algorithm implementation. As a result, I'd like to generate random needles for each iteration of the bench. However, I don't want to also measure the PRNG. Is there a way to do this currently?
It's a bit unclear to me what design you have in mind. How would your benchmark look if you agree to measure the PRNG overhead?
It would be something like:
bench "Foo" . nfAppIO (\h -> doTheSearch <$> genRandomNeedle <*> pure h) $ haystack
However, this will end up benchmarking both genRandomNeedle and the search for it, whereas I only want to benchmark the search. Furthermore, I only need IO for the random generation, not for the search. The idea is that, since the search is run many times during the benchmarking process, randomizing the needle gives me a good idea of 'average' performance.
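For concreteness, that shape could be fleshed out roughly as in the sketch below. Everything here is a placeholder assumption: doTheSearch stands in for the real search (here just Data.ByteString.isInfixOf), haystack is dummy input, the library is assumed to be tasty-bench (criterion exposes the same names), and the needle generator uses System.Random.Stateful with a fixed seed. The point it illustrates is that every measured invocation runs the PRNG as well as the search.

```haskell
import Control.Monad (replicateM)
import qualified Data.ByteString as BS
import Data.Word (Word8)
import System.Random.Stateful (mkStdGen, newIOGenM, uniformRM)
import Test.Tasty.Bench (bench, defaultMain, nfAppIO)

-- Placeholders standing in for the real code under test.
doTheSearch :: BS.ByteString -> BS.ByteString -> Bool
doTheSearch = BS.isInfixOf

haystack :: BS.ByteString
haystack = BS.replicate 1000000 0x61

main :: IO ()
main = do
  gen <- newIOGenM (mkStdGen 2021)
  let genRandomNeedle :: IO BS.ByteString
      genRandomNeedle =
        BS.pack <$> replicateM 16 (uniformRM (0 :: Word8, 255) gen)
  defaultMain
    [ -- Each measured invocation generates a fresh needle *and* runs the
      -- search, so the reported time includes the PRNG overhead.
      bench "Foo" . nfAppIO (\h -> doTheSearch <$> genRandomNeedle <*> pure h) $ haystack
    ]
```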
Unfortunately, I don't think there is a reasonable workaround for this particular scenario. Both the individual generation of a needle and the search itself are far below the granularity of the system timer, so we cannot start and stop it around each one.
FWIW, I think that genRandomNeedle implemented via http://hackage.haskell.org/package/random-1.2.0/docs/System-Random-Stateful.html#g:11 should be blazingly fast and thus would not affect the overall measurements in a significant way.
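A sketch of what such a generator might look like; the needle type (ByteString), its length, and the choice of a mutable IOGenM seeded from a fixed StdGen are all assumptions for illustration, not something prescribed by the linked docs:

```haskell
import Control.Monad (replicateM)
import qualified Data.ByteString as BS
import Data.Word (Word8)
import System.Random.Stateful (IOGenM, StdGen, mkStdGen, newIOGenM, uniformRM)

-- Draw a short ByteString needle of the given length from a mutable generator.
genRandomNeedle :: IOGenM StdGen -> Int -> IO BS.ByteString
genRandomNeedle gen len =
  BS.pack <$> replicateM len (uniformRM (0 :: Word8, 255) gen)

-- A fixed seed keeps runs reproducible while still varying the needles.
newNeedleGen :: IO (IOGenM StdGen)
newNeedleGen = newIOGenM (mkStdGen 2021)
```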
In general I'm inclined to advise against such a design: to achieve reliable results with a predictable deviation, it's crucial to benchmark exactly the same data over and over again. Maybe generate several thousand needles at the top level and iterate over them all in a single benchmark?
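That suggestion might look roughly like the sketch below, again assuming tasty-bench and the same placeholder doTheSearch and haystack as before. env builds the needle list once, outside the measured code, and the single measurement covers all of the needles.

```haskell
import Control.Monad (replicateM)
import qualified Data.ByteString as BS
import Data.Word (Word8)
import System.Random.Stateful (mkStdGen, newIOGenM, uniformRM)
import Test.Tasty.Bench (bench, defaultMain, env, nf)

-- Placeholders for the real search and input.
doTheSearch :: BS.ByteString -> BS.ByteString -> Bool
doTheSearch = BS.isInfixOf

haystack :: BS.ByteString
haystack = BS.replicate 1000000 0x61

main :: IO ()
main = defaultMain
  [ env mkNeedles $ \needles ->
      -- One measurement over all 1000 needles; divide the reported time
      -- by 1000 to approximate the per-needle average.
      bench "search, 1000 random needles" $
        nf (map (`doTheSearch` haystack)) needles
  ]
  where
    -- Generated once, with a fixed seed, before measurements start.
    mkNeedles :: IO [BS.ByteString]
    mkNeedles = do
      gen <- newIOGenM (mkStdGen 2021)
      replicateM 1000 $
        BS.pack <$> replicateM 16 (uniformRM (0 :: Word8, 255) gen)
```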
Yeah, I think you're right on all counts. However, if I generate all the needles at the top level, then do them all in one bench, I'll get a result for a thousand needles, not the average for one needle. Unless I'm missing something?
Well, you can divide by a thousand to approximate the average. I agree that it's not that convenient, but absolute numbers are of limited value anyway.
Fair enough. Thanks for the advice!
A different data point: what if I want to test an external system that needs restarts, preparation, or warmup/cache clearing before benchmarking? This is not something one can do outside of the testing loop. It is also something that, for example, hyperfine provides.
Does env run for each benchmark or for each timing run?
env runs once, before measurements of a particular routine take place.
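A minimal sketch illustrating that, assuming tasty-bench: the putStrLn fires once when this group is run, not once per timing iteration.

```haskell
import Test.Tasty.Bench (bench, bgroup, defaultMain, env, nf)

main :: IO ()
main = defaultMain
  [ -- The IO action passed to env runs once, before these benchmarks are
    -- measured; it is not repeated for every timing iteration.
    env (putStrLn "acquiring input" >> pure [1 .. 100000 :: Int]) $ \xs ->
      bgroup "shared input"
        [ bench "sum"     $ nf sum xs
        , bench "product" $ nf product xs
        ]
  ]
```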