Implement "delta" measurement

travisdowns / uarch-bench

A benchmark for low-level CPU micro-architectural features

MIT License


Currently we just measure the absolute time of the code under test like so:

static int64_t time_method(size_t loop_count) {
    auto t0 = CLOCK::now();
    METHOD(loop_count);
    auto t1 = CLOCK::now();
    return t1 - t0;
}

The downside of this approach is that the measurement includes the time for one CLOCK::now() call as well as the full overhead of calling METHOD(loop_count), which is at least a call and a ret and sometimes a small amount of setup.

A better approach is to time the loop with two different loop_count values and use the difference between the two times to calculate the performance. The overheads above are the same in both measurements, so they cancel out in the difference. The test/jump overhead of the loop inside the benchmark is still present, but it is small or sometimes zero.
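A minimal sketch of the idea, not the project's actual implementation. It assumes std::chrono::steady_clock in place of CLOCK, and the names method, delta_per_iteration and time_method_delta are hypothetical, introduced only for illustration. If each measurement is roughly overhead + cost * loop_count, subtracting two measurements removes the overhead term:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the code under test (METHOD in the issue).
static void method(std::size_t loop_count) {
    volatile std::uint64_t sink = 0;  // volatile keeps the loop from being optimized away
    for (std::size_t i = 0; i < loop_count; ++i) {
        sink = sink + i;
    }
}

// Absolute measurement, as in the issue, with steady_clock assumed for CLOCK.
// The result includes the clock and call/ret overhead.
static std::int64_t time_method(std::size_t loop_count) {
    auto t0 = std::chrono::steady_clock::now();
    method(loop_count);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}

// Pure arithmetic: the fixed overhead appears in both timings and cancels,
// leaving the cost per iteration. Requires n_large > n_small.
static double delta_per_iteration(std::int64_t t_small, std::size_t n_small,
                                  std::int64_t t_large, std::size_t n_large) {
    return static_cast<double>(t_large - t_small) /
           static_cast<double>(n_large - n_small);
}

// Delta measurement: time two loop counts and divide the time difference
// by the difference in iterations. Returns nanoseconds per iteration.
static double time_method_delta(std::size_t n_small, std::size_t n_large) {
    std::int64_t t_small = time_method(n_small);
    std::int64_t t_large = time_method(n_large);
    return delta_per_iteration(t_small, n_small, t_large, n_large);
}
```

For example, time_method_delta(1000, 2000) would estimate the per-iteration cost from the extra 1000 iterations alone. A single pair of timings is noisy, so in practice one would likely repeat the measurement and take a minimum or median.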

Implement "delta" measurement #10