google / benchmark

A microbenchmark support library
Apache License 2.0

[BUG] benchmark::DoNotOptimize does not prevent a reference from being optimized away. #1773

Open HFTrader opened 3 months ago

HFTrader commented 3 months ago

Describe the bug

benchmark::DoNotOptimize() does not prevent a reference from being optimized away.

System

OS: Ubuntu 20.04 LTS, Compiler Explorer
Compiler: GCC 9.04, GCC trunk, Clang trunk

To reproduce

Compiler Explorer that works: https://godbolt.org/z/c3d5sc8s3
Compiler Explorer that does not: https://godbolt.org/z/3bMMvaecr

Alternatively, compile the following code:

#include <vector>
#include <benchmark/benchmark.h>
static void bm_traverse(benchmark::State& state) {
    const size_t N = 500;
    std::vector<int> vec;
    for (size_t j = 0; j < N; ++j) vec.push_back(j);
    for (auto _ : state) {
        for (auto& value : vec) {
            asm("nop"); // just to locate asm point faster
            benchmark::DoNotOptimize(value);
        }
        benchmark::DoNotOptimize(vec);
    }
    state.SetComplexityN(N);
}
BENCHMARK(bm_traverse);
BENCHMARK_MAIN();

Expected behavior

DoNotOptimize should have preserved the read inside the loop. I understand that a reference is basically a pointer, but users would typically expect the reference to stand for the actual lvalue.

Screenshots

With the "&" reference in the inner loop, no load from vec is generated.

.LBB0_31:                               #   Parent Loop BB0_6 Depth=1
        nop
        addq    $4, %rax
        cmpq    %rcx, %rax
        jne     .LBB0_31

If you remove the reference, the load from vec is generated.

.LBB0_31:                               #   Parent Loop BB0_6 Depth=1
        movl    (%rax), %edx   # <<<< HERE
        movl    %edx, 12(%rsp)
        nop
        addq    $4, %rax
        cmpq    %rcx, %rax
        jne     .LBB0_31
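
The two variants differ only in how the loop variable is declared. The sketch below puts them side by side. Note that `sink` is a hypothetical stand-in for benchmark::DoNotOptimize, written with an empty GCC/Clang inline-asm statement so that the snippet compiles without the library; it is an assumption for illustration and not the library's actual definition. The function names are made up as well.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for benchmark::DoNotOptimize (GCC/Clang only).
// The empty asm statement claims to read and possibly modify `value` in
// memory, so the compiler has to keep the object it is handed alive.
template <class T>
inline void sink(T& value) {
    asm volatile("" : "+m"(value) : : "memory");
}

// By reference: `value` names the element stored inside the vector, so the
// asm operand is the element's existing memory and no copy is needed.
inline std::size_t traverse_by_reference(std::vector<int>& vec) {
    std::size_t visited = 0;
    for (auto& value : vec) {
        sink(value);
        ++visited;
    }
    return visited;
}

// By value: `value` is a fresh local object, so each iteration has to read
// the element to initialize the copy before the copy is handed to sink.
inline std::size_t traverse_by_value(std::vector<int>& vec) {
    std::size_t visited = 0;
    for (auto value : vec) {
        sink(value);
        ++visited;
    }
    return visited;
}
```

Both functions visit every element; the difference only shows up in the generated code, and the exact instructions depend on the compiler and optimization flags.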

Additional context

I understand this is a matter of semantics that might have been discussed in the past. In this case, I'd just like confirmation of this behavior.

LebedevRI commented 3 months ago

IMHO in this particular case, the one labeled "Compiler Explorer that works: https://godbolt.org/z/c3d5sc8s3" is the case that actually does not work. There is zero reason for the compiler to produce that load-and-spill there.

HFTrader commented 3 months ago

From my point of view as a user, DoNotOptimize() means: do something with this lvalue that is a no-op, but that the compiler still treats as a use. So if you are using the value, then DoNotOptimize() is doing its job.

HFTrader commented 3 months ago

All that said, this is your library, so its semantics are whatever you say they are. I was surprised by the behavior. If it is known and expected, then it is what it is.

LebedevRI commented 3 months ago

Right, but still, I don't quite see it. A reference is a pointer, while an auto is the result of dereferencing that pointer. If you drop DoNotOptimize from either snippet, a lot more changes, so DoNotOptimize did prevent the argument from being optimized away.
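
To make the reference-versus-value distinction concrete, here is a minimal sketch using only the standard library. The function names are hypothetical. It checks that an `auto&` loop variable has type `int&` and shares its address with the vector element, while an `auto` loop variable has type `int` and is a separate object that has to be initialized from the element.

```cpp
#include <type_traits>
#include <vector>

// With `auto&`, the loop variable is an alias for the element itself.
inline bool reference_aliases_element(std::vector<int>& vec) {
    for (auto& value : vec) {
        static_assert(std::is_same<decltype(value), int&>::value,
                      "auto& deduces int& here");
        return &value == &vec.front();  // same object, same address
    }
    return false;  // empty vector: nothing to compare
}

// With `auto`, the loop variable is a distinct object copied from the element.
inline bool copy_is_distinct_object(std::vector<int>& vec) {
    for (auto value : vec) {
        static_assert(std::is_same<decltype(value), int>::value,
                      "auto deduces int here");
        return &value != &vec.front();  // different object, different address
    }
    return false;  // empty vector: nothing to compare
}
```

Under this reading, handing the reference to DoNotOptimize keeps the element alive without requiring a copy of it, whereas handing over the by-value variable requires the copy to exist first.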