martinus / nanobench

Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20
https://nanobench.ankerl.com
MIT License

Is it possible to do some cleanup that is not included in measurements after each loop? #75

Closed · a-cristi closed this 2 years ago

a-cristi commented 2 years ago

There is one naïve approach that I see for doing this:

```diff
-Bench& Bench::run(Op&& op) {
+Bench& Bench::run(Op&& op, PostOp&& post) {
     // It is important that this method is kept short so the compiler can do better optimizations/ inlining of op()
     detail::IterationLogic iterationLogic(*this);
     auto& pc = detail::performanceCounters();
 
     while (auto n = iterationLogic.numIters()) {
         pc.beginMeasure();
         Clock::time_point before = Clock::now();
         while (n-- > 0) {
             op();
         }
         Clock::time_point after = Clock::now();
         pc.endMeasure();
         pc.updateResults(iterationLogic.numIters());
         iterationLogic.add(after - before, pc);
+
+        post();
     }
     iterationLogic.moveResultTo(mResults);
     return *this;
 }
```

I call this naïve because it will not execute post once after each op, and I also don't know what other implications this change has (how it interacts with the actual measurements, whether and how it affects the BigO estimation, etc.), or whether it is compatible with the philosophy of the project.

Executing post once after each op seems impossible without more drastic changes.

And since this feels like an XY problem, I'll give some context: I'm trying to measure some code that does a syscall. Due to the nature of that syscall I also need to do some cleanup, and I don't want to include the cleanup in the measurements. I can do the cleanup in batches, so the above patch is good enough for me (even if it is not a good general solution).
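For reference, with the patch applied my usage would look roughly like the sketch below; do_syscall and release_handles are just placeholders for my actual code, and run(op, post) is of course the patched signature from above, not the upstream API:

```cpp
#include <nanobench.h>
#include <vector>

// Placeholders for the real code under test.
int do_syscall();                           // returns a handle that must be released later
void release_handles(std::vector<int>&);    // the cleanup I don't want to measure

void bench_syscall() {
    std::vector<int> handles;

    ankerl::nanobench::Bench().name("syscall").run(
        // op: measured as usual
        [&] { handles.push_back(do_syscall()); },
        // post: called once per measured batch, outside the timed region
        [&] {
            release_handles(handles);
            handles.clear();
        });
}
```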

While measuring this kind of code may feel out of scope for nanobench, I like the library a lot (it is easy to use and configure, it warns me when results are unstable, and I can analyze and compare results with pyperf), and I can't find another library that checks all the boxes it does.

The patch I posted is really specific to my problem, but I'm curious whether it has any downsides that could skew the measurements (it seems to me that it does not, but I'm not familiar enough with the code base to say so with confidence), or whether someone with more benchmarking experience can recommend a better approach.

martinus commented 2 years ago

This has been requested several times already, but the problem is that it would force nanobench to start and stop the timers on every iteration. When the thing being measured is fast compared to the timer's accuracy or its start/stop overhead, the results become very unreliable.

The way to deal with this correctly is to create two benchmarks: one that measures everything, and another that tries to measure just the overhead (e.g. only the setup & cleanup), then calculate the difference.
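For example, something along these lines; do_syscall and cleanup are just placeholders for the actual code, and how representative the overhead benchmark is depends on what the cleanup really does:

```cpp
#include <nanobench.h>

// Placeholders for the code in question.
int do_syscall();          // the operation to measure
void cleanup(int handle);  // the part to exclude from the result

void measure() {
    ankerl::nanobench::Bench bench;

    // 1) measure the operation plus its cleanup
    bench.run("syscall + cleanup", [&] {
        cleanup(do_syscall());
    });

    // 2) measure an approximation of the cleanup alone; this has to be
    //    something repeatable that behaves like the real cleanup
    int handle = do_syscall();
    bench.run("cleanup only", [&] {
        cleanup(handle);
    });

    // the syscall cost is then roughly the difference of the two medians,
    // read off the printed table or taken from bench.results()
}
```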

a-cristi commented 2 years ago

I see. This makes sense. In my case the thing I'm measuring is probably slow enough for this to not be an issue, but measuring the cleanup sounds better so I'll try that. Thanks! :)

martinus commented 1 year ago

See #86