google / benchmark

A microbenchmark support library
Apache License 2.0

[BUG] Iteration count in Report looks suspicious #1770

Closed bognar-dev closed 1 month ago

bognar-dev commented 3 months ago

Describe the bug

When I run my Google Benchmark suite, the results seem off. The iteration count for one of the test fixtures is overflowing, I think, so instead of 1 iteration it shows 4046608 for calculateAverages and only 1 for readFile. I am expecting both benchmarks to show an iteration count of 1. I am not using any recursion or multiple calls in my calculateAverages function, so I am surprised to see these results.

When I time the program manually like this:

auto start = std::chrono::high_resolution_clock::now();
auto data = readFile();
auto afterRead = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsedRead = afterRead - start;
calcAvr(data);
auto afterCalc = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsedCalc = afterCalc - afterRead;
auto end = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> elapsed = end - start;
std::cout << "Reading time: " << elapsedRead.count() << "s" << std::endl;
std::cout << "Calculation time: " << elapsedCalc.count() << "s" << std::endl;
std::cout << "Execution time: " << elapsed.count() << "s" << std::endl;

My calculation time is around 1.8 seconds, but if I run it with Google Benchmark then I get:

One_BRC/readFile          1.5273e+11 ns   1.5268e+11 ns            1
One_BRC/calculateAverages        178 ns          178 ns      4046608

This doesn't seem correct; I tried it with milliseconds but it doesn't change anything.

System Which OS, compiler, and compiler version are you using:

To reproduce Steps to reproduce the behavior:

My Benchmark Suite:


#include <benchmark/benchmark.h>
#include <chrono>
#include <map>
#include <string>
#include <vector>
#include "calculations.h"

class One_BRC : public benchmark::Fixture {
protected:
    std::map<std::string, std::vector<float>> _data;
public:
    void SetUp(::benchmark::State &state) {

    }

    void TearDown(::benchmark::State &state) {
    }
};

BENCHMARK_F(One_BRC, readFile)(benchmark::State &st) {
    for (auto _: st) {
        _data = readFile();
    }
}

BENCHMARK_F(One_BRC, calculateAverages)(benchmark::State &st) {
    for (auto _: st) {
        calcAvr(_data);
    }
}
BENCHMARK_MAIN();

Console Output:

Run on (8 X 2295.62 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 64 KiB (x4)
  L2 Unified 512 KiB (x4)
  L3 Unified 4096 KiB (x1)
Load Average: 0.22, 0.38, 0.44
File opened successfully
Data read successfully
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
One_BRC/readFile          1.5273e+11 ns   1.5268e+11 ns            1
One_BRC/calculateAverages        178 ns          178 ns      4046608

Expected behavior Both iteration counters showing 1.

Additional context Here are both of my functions; these are written poorly because I wanted a naive benchmark to start with my Billion Row Challenge.


#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

std::map<std::string, std::vector<float>> readFile() {
    //read the csv file from data folder
    std::ifstream file("../data/measurements.txt");
    //check for errors
    if (!file.is_open()) {
        std::cerr << "Error: file not found" << std::endl;
        return {};
    }
    std::cout << "File opened successfully" << std::endl;
    std::string line;

    std::map<std::string, std::vector<float>> data;

    while (std::getline(file, line)) {
        std::string station;
        std::string temperature;
        std::stringstream ss(line);
        std::getline(ss, station, ';');
        std::getline(ss, temperature, ';');
        data[station].push_back(std::stof(temperature));
    }
    std::cout << "Data read successfully" << std::endl;
    file.close();
    return data;
}

void calcAvr(std::map<std::string, std::vector<float>> data) {
    for (const auto &station: data) {
        std::string stationName = station.first;
        std::vector<float> stationData = station.second;
        float minTemp = stationData[0];
        float maxTemp = stationData[0];
        float avgTemp = 0;
        for (int i = 0; i < stationData.size(); i++) {
            if (stationData[i] < minTemp) {
                minTemp = stationData[i];
            }
            if (stationData[i] > maxTemp) {
                maxTemp = stationData[i];
            }
            avgTemp += stationData[i];
        }
        avgTemp /= stationData.size() - 1;
        std::cout << "Station: " << stationName << " Min: " << minTemp << " Max: " << maxTemp << " Avg: " << avgTemp
                  << std::endl;
    }
}
dmah42 commented 3 months ago

my guess is that _data in your first method, and the result of the method call in the second, are being optimised out as unused. you need to use DoNotOptimize to ensure the compiler doesn't drop it.

bognar-dev commented 3 months ago

The benchmarks are run one after another, correct?

If I run the program with DoNotOptimize and ClobberMemory the results do not change. Or where do I call it?

class One_BRC : public benchmark::Fixture {
protected:
    std::map<std::string, std::vector<float>> _data;
public:
    void SetUp(::benchmark::State &state) {
        benchmark::DoNotOptimize(_data);
    }

    void TearDown(::benchmark::State &state) {
    }
};

BENCHMARK_F(One_BRC, readFile)(benchmark::State &st) {
    for (auto _: st) {
        _data = readFile();
        benchmark::DoNotOptimize(_data);
        benchmark::ClobberMemory();
    }
}

BENCHMARK_F(One_BRC, calculateAverages)(benchmark::State &st) {
    for (auto _: st) {
        calcAvr(_data);
        benchmark::DoNotOptimize(_data);
        benchmark::ClobberMemory();
    }
}

BENCHMARK_MAIN();
dmah42 commented 3 months ago

the fixture is not persistent between the benchmarks. so _data that is written in the first is empty in the second.

dmah42 commented 3 months ago

https://godbolt.org/z/YKqe17s3h

bognar-dev commented 3 months ago

I don't know if it's stupid what I did, but when I defined _data outside the class it seems to work:


std::map<std::string, std::vector<float>> _data;
class One_BRC : public benchmark::Fixture {
protected:

public:
    void SetUp(::benchmark::State &state) {
    }

    void TearDown(::benchmark::State &state) {
    }
};

BENCHMARK_DEFINE_F(One_BRC, readFile)(benchmark::State &st) {
    for (auto _: st) {
        _data = readFile();
        benchmark::DoNotOptimize(_data);
    }
}

BENCHMARK_DEFINE_F(One_BRC, calculateAverages)(benchmark::State &st) {
    for (auto _: st) {
        calcAvr(_data);
    }
}

BENCHMARK_REGISTER_F(One_BRC, readFile);
BENCHMARK_REGISTER_F(One_BRC, calculateAverages);
BENCHMARK_MAIN();

Benchmark:

Run on (8 X 2295.62 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 64 KiB (x4)
  L2 Unified 512 KiB (x4)
  L3 Unified 4096 KiB (x1)
Load Average: 0.27, 0.15, 0.10
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
One_BRC/readFile             1675013 ns      1340255 ns          452
One_BRC/calculateAverages    2292709 ns       295267 ns         2323
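If a single measured pass per benchmark is really desired (mirroring the manual chrono timing in the original report), Google Benchmark's Iterations() setting can pin the count at registration time. A fragment against the final suite above:

```cpp
// Force exactly one measured iteration per benchmark instead of letting
// the library pick the count automatically.
BENCHMARK_REGISTER_F(One_BRC, readFile)->Iterations(1);
BENCHMARK_REGISTER_F(One_BRC, calculateAverages)->Iterations(1);
```

Note that with a fixed iteration count the reported time is a single sample, so it will be noisier than the library's automatic repetition.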