Benchmark manual samples and iterations count

catchorg / Catch2

A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)

https://discord.gg/4CWS9zD

Boost Software License 1.0


Benchmark manual samples and iterations count #1711

Open RT2Code opened 5 years ago

RT2Code commented 5 years ago

Currently, Catch2's benchmarking support automatically computes the iteration count required for a benchmark to be accurate, and the sample count seems to be fixed at 100.

This is great. But in some cases, it would be very useful to be able to set this manually, especially the sample count. This is possible in Celero, and it would be great to have it too in Catch2.

sfranzen commented 5 years ago

It looks like the sample count is already adjustable through the --benchmark-samples option, at least since Catch 2.9.0, I suppose. However, you're right that there is no such option for the iterations per sample; that could still be added.

RT2Code commented 5 years ago

I didn't know that, thanks.

VioletGiraffe commented 4 years ago

There should be a way to adjust the sample / iteration count automatically based on the estimated runtime. When I'm benchmarking insertion of 1000 items into my container, I do want a large number of samples. When I'm inserting 10 million items and each sample takes 10 seconds, I don't want 100 samples.
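To make the idea concrete, here is a minimal sketch of how a sample count could be derived from an estimated per-sample runtime and a total time budget. This is not Catch2 API: `pick_sample_count`, its parameters, and the default bounds of 5 and 100 are all assumptions made up for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of Catch2): pick how many samples fit
// into `budget` when one sample is estimated to take `estimated_per_sample`.
// The result is kept within [min_samples, max_samples].
std::size_t pick_sample_count(std::chrono::duration<double> estimated_per_sample,
                              std::chrono::duration<double> budget,
                              std::size_t min_samples = 5,
                              std::size_t max_samples = 100) {
    // Without a usable estimate, fall back to the upper bound.
    if (estimated_per_sample.count() <= 0.0) {
        return max_samples;
    }
    const double fit = budget.count() / estimated_per_sample.count();
    // Check the upper bound in floating point first so the cast below
    // never sees a value that is too large for std::size_t.
    if (fit >= static_cast<double>(max_samples)) {
        return max_samples;
    }
    const auto whole_samples = static_cast<std::size_t>(fit);
    return std::max(min_samples, std::min(whole_samples, max_samples));
}
```

With a 60 second budget, a 10 second sample would get 6 samples, while a 1 millisecond sample would still get the full 100. The lower bound trades budget overrun for a minimum of statistical usefulness; whether that is the right trade-off depends on the benchmark.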

ruben-arts commented 4 years ago

Is it possible to define the benchmark samples in some form other than --benchmark-samples? I would like to be able to define the samples per benchmark. But I don't know how.

ivan236634452 commented 3 years ago

Just to add a use case for consideration: we use Catch2 benchmarks to optimize launch parameters for CUDA kernels. Roughly speaking, we enumerate valid kernel launch configurations and, for each of them, run a Catch2 session for all CUDA-tagged benchmarks and measure the running time of every individual kernel that happens to execute (using microsecond-precise CUDA events). In this case we don't need any warm-up, but we do require that the total number of iterations and samples is the same during each run. Currently we have to modify Catch2 to achieve that.

The simplest addition to Catch2 that would resolve our use case is a command-line argument to set the maximum allowed number of iterations to execute during both the warm-up and sampling stages: setting it to 1 during parameter optimization should guarantee an identical number of iterations per run.

Disabling warm-up (i.e. --benchmark-no-warmup) and specifying the number of iterations for all benchmarks might be too niche and complicated.
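For illustration, the requirement above (no warm-up, and the same number of samples and iterations on every run) can be sketched as a hand-rolled timing loop outside Catch2's BENCHMARK machinery. `measure_fixed` is a hypothetical helper, not something Catch2 provides, and it does none of Catch2's statistical analysis.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Catch2 API): call `fn` exactly
// samples * iterations times, with no warm-up, and return the mean
// time per iteration of each sample in nanoseconds.
template <typename Fn>
std::vector<double> measure_fixed(std::size_t samples, std::size_t iterations, Fn&& fn) {
    using clock = std::chrono::steady_clock;
    std::vector<double> ns_per_iteration;
    if (iterations == 0) {
        return ns_per_iteration;  // nothing to time, avoid dividing by zero
    }
    ns_per_iteration.reserve(samples);
    for (std::size_t s = 0; s < samples; ++s) {
        const auto start = clock::now();
        for (std::size_t i = 0; i < iterations; ++i) {
            fn();  // note: a compiler may optimize away work with no side effects
        }
        const auto stop = clock::now();
        const std::chrono::duration<double, std::nano> elapsed = stop - start;
        ns_per_iteration.push_back(elapsed.count() / static_cast<double>(iterations));
    }
    return ns_per_iteration;
}
```

Calling `measure_fixed(10, 1, kernel_launch)` with some callable `kernel_launch` would run it exactly 10 times and return 10 timings. Because the counts are plain arguments, they can differ per benchmark, at the cost of losing Catch2's reporting.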

sudara commented 1 year ago

> I would like to be able to define the samples per benchmark. But I don't know how.

This would be a valuable feature. I have some benchmarks that take much longer than others (and don't need to be as accurate); for those I'd like to set the sample count to something more like 10.

jakemumu commented 1 year ago

Popping in to vote for this feature. Something like:

BENCHMARK("NAME", samples: N) would be amazing

sudara commented 1 week ago

I'm curious how people are currently working around this in Catch2. Or did people with sample count needs move to something like google/benchmark?