Bodigrim / tasty-bench

Featherlight benchmark framework, drop-in replacement for criterion and gauge.
https://hackage.haskell.org/package/tasty-bench
MIT License

Allow running benchmarks a given number of times. #43

Closed AndreasPK closed 1 year ago

AndreasPK commented 1 year ago

With criterion-based benchmarks I often found the -n option helpful for running them a fixed number of times without doing any analysis or the like.

This can be useful to produce a profile of a given benchmark, get a clearer idea of the cost of the benchmark vs the overhead from analyzing the results or just for heating your room.

So it would be nice if tasty-bench would also support this feature.

Bodigrim commented 1 year ago

If you do not need any analysis, you can just invoke measureCpuTime in a loop.
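A minimal sketch of that loop, assuming the `measureCpuTime :: Timeout -> RelStDev -> Benchmarkable -> IO Double` helper exported by `Test.Tasty.Bench`. The `fibo` workload and the `measureTimes` name are hypothetical, and the use of `RelStDev (1/0)` to get a single iteration per call is an assumption worth checking against the version in use:

```haskell
import Control.Monad (replicateM)
import Test.Tasty (mkTimeout)
import Test.Tasty.Bench (RelStDev (..), measureCpuTime, nf)

-- Hypothetical workload standing in for the code being profiled.
fibo :: Int -> Integer
fibo n = if n < 2 then toInteger n else fibo (n - 1) + fibo (n - 2)

-- Call measureCpuTime a fixed number of times and collect the reported
-- CPU times in seconds.
-- Assumption: RelStDev (1 / 0), i.e. an infinite target deviation, makes
-- each call stop after a single iteration; with a finite target each call
-- may run the benchmark several times.
measureTimes :: Int -> IO [Double]
measureTimes calls =
  replicateM calls $
    measureCpuTime (mkTimeout 100000000) (RelStDev (1 / 0)) (nf fibo 20)

main :: IO ()
main = measureTimes 5 >>= mapM_ print
```

Note that this requires editing the benchmark source, and the count passed to `measureTimes` is the number of `measureCpuTime` calls, which equals the number of runs only if the single-iteration assumption holds.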

Alternatively, beyond --stdev and --timeout, there is a debug build flag to dump raw measurements.

AndreasPK commented 1 year ago

When I want the "number of runs" feature, it's usually to produce comparable runs for third-party profiling tools, ticky, or even just cost-center profiling. The ability to control the exact number of runs executed has been quite helpful for me personally in the past. With tasty-bench becoming more popular, it would be great to still have this tool around.

As I understand it, both --stdev and --timeout allow only rough control over the number of runs, which is very hit-and-miss for the use cases I'm thinking of, and debug simply gives more information about the runs that were performed. That seems useful in other scenarios, but not for the kind of cases where I used to use criterion's -n option.

Looking at the docs measureCpuTime also performs potentially multiple runs. Although I imagine setting the timeout to zero might result in one run per loop? Either way I'm not thinking about cases where I want to make anything but the most trivial changes to the code I'm looking at myself. So anything that involves modifying the code itself doesn't replace the -n feature very well.

Bodigrim commented 1 year ago

> Looking at the docs measureCpuTime also performs potentially multiple runs. Although I imagine setting the timeout to zero might result in one run per loop?

https://github.com/Bodigrim/tasty-bench/blob/fd30b627a6d6c12f98f071e8372315b81dd4b57e/src/Test/Tasty/Bench.hs#L1073

How does -n work in criterion? How do terminal / html / csv backends behave?

AndreasPK commented 1 year ago

> How does -n work in criterion?

It runs the selected benchmarks n times and nothing more.

> How do terminal / html / csv backends behave?

From memory, in -n mode criterion won't produce any output, and --csv or similar options are incompatible with -n.

Bodigrim commented 1 year ago

> This can be useful to produce a profile of a given benchmark, get a clearer idea of the cost of the benchmark vs the overhead from analyzing the results or just for heating your room.

In contrast to criterion, the statistical analysis in tasty-bench is very lightweight: it's just a handful of Word64 operations, so there is no significant overhead. The same applies to the performance profile: just run the benchmark as usual with an appropriate timeout / stdev; there are very few extra things happening.

I doubt -n can fit well into the architecture of tasty-bench. I find criterion's design in this area very confusing: simply ignoring all other options in -n mode strikes me as unreasonable.

AndreasPK commented 1 year ago

I would still be happy with a -n option that produces output and does analysis if it allows me to fix the number of iterations.

Bodigrim commented 1 year ago

I appreciate your input, but unfortunately I don't feel like this feature belongs in the core. I'd like tasty-bench to be smaller than it is, not larger.

A third-party plugin can easily wrap Benchmarkable into a newtype and define an IsTest instance with a -n option and an appropriate runner; then everything else will work out automatically.
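A rough sketch of what such a wrapper might look like, not a tested plugin. It assumes that `Test.Tasty.Bench` exports the `Benchmarkable` constructor with an `unBenchmarkable :: Word64 -> IO ()` field that runs the action the given number of times. The names `Runs`, `FixedRuns` and `fixedBench` are hypothetical, and the option is exposed as a long `--runs` flag rather than `-n` to keep the example free of a custom command-line parser:

```haskell
import Data.Proxy (Proxy (..))
import Data.Word (Word64)
import Test.Tasty (TestTree, defaultMain, testGroup)
import Test.Tasty.Bench (Benchmarkable (..), nf)
import Test.Tasty.Options (IsOption (..), OptionDescription (..), lookupOption, safeRead)
import Test.Tasty.Providers (IsTest (..), singleTest, testPassed)

-- Hypothetical option: how many times to run each wrapped benchmark.
newtype Runs = Runs Word64

instance IsOption Runs where
  defaultValue = Runs 1
  parseValue = fmap Runs . safeRead
  optionName = pure "runs"
  optionHelp = pure "Run each benchmark exactly this many times"

-- Hypothetical wrapper around a tasty-bench Benchmarkable.
newtype FixedRuns = FixedRuns Benchmarkable

instance IsTest FixedRuns where
  testOptions = pure [Option (Proxy :: Proxy Runs)]
  run opts (FixedRuns b) _progress = do
    let Runs n = lookupOption opts
    -- Assumption: unBenchmarkable b n performs the action n times.
    unBenchmarkable b n
    pure (testPassed ("ran " ++ show n ++ " times"))

-- Counterpart of bench for the wrapper.
fixedBench :: String -> Benchmarkable -> TestTree
fixedBench name = singleTest name . FixedRuns

-- Hypothetical workload.
fibo :: Int -> Integer
fibo n = if n < 2 then toInteger n else fibo (n - 1) + fibo (n - 2)

main :: IO ()
main = defaultMain (testGroup "fixed" [fixedBench "fibo 20" (nf fibo 20)])
```

With this in place, passing `--runs 100` on the command line should execute each wrapped benchmark 100 times with no timing or analysis. The sketch uses the plain `defaultMain` from `Test.Tasty`; how it interacts with the tasty-bench reporters has not been checked here.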

Bodigrim commented 1 year ago

I'm happy to guide anyone willing to implement this feature as a plugin or wrapper, but I don't feel it belongs in tasty-bench itself. Thus closing.