Open haberman opened 3 years ago
@dominichamon Any thoughts on this? (I have a similar question to @haberman, as I'm thinking about what to do with the internal BenchmarkMemoryUsage().)
this was already added some time ago to align with the internal usage. The above note mentions RegisterMemoryManager
and the idea is to use that to plug memory tracking in (because we didn't want to make assumptions about available memory management systems like tcmalloc).
outputting to json only was a matter of priority, not deliberately avoiding other outputs.
i'll leave this open to cover the addition of the memory reporting to CSV/Console reporters.
There is a section in the User Guide that briefly mentions the supported functionality for benchmarking memory usage, but the documentation gives no concrete instructions on how to use it. Please provide details and examples.
The following is an example that I created to try the MemoryManager functionality. I'm interested to know how many bytes are allocated during a single benchmark execution.
I encountered a problem and I'm not sure if what I'm doing is right as there is no documentation.
Steps:
First, create a subclass of benchmark::MemoryManager and override both benchmark::MemoryManager::Start and benchmark::MemoryManager::Stop.
Second, wrap your memory operator or function so that you can store the allocation size inside the MemoryManager object.
#include <cstdint>  // int64_t
#include <cstdlib>  // malloc
#include <memory>

#include <benchmark/benchmark.h>

// Memory manager that reports the counters filled in by custom_malloc().
class CustomMemoryManager : public benchmark::MemoryManager {
 public:
  int64_t num_allocs = 0;
  // Despite the name, this is a running total of requested bytes, not a peak.
  int64_t max_bytes_used = 0;

  void Start() BENCHMARK_OVERRIDE {
    num_allocs = 0;
    max_bytes_used = 0;
  }

  void Stop(Result* result) BENCHMARK_OVERRIDE {
    result->num_allocs = num_allocs;
    result->max_bytes_used = max_bytes_used;
  }
};

std::unique_ptr<CustomMemoryManager> mm(new CustomMemoryManager());

#ifdef MEMORY_PROFILER
// Defined before the macro below, so the malloc() called here is the real one.
void* custom_malloc(size_t size) {
  void* p = malloc(size);
  mm->num_allocs += 1;
  mm->max_bytes_used += size;
  return p;
}
#define malloc(size) custom_malloc(size)
#endif

static void BM_memory(benchmark::State& state) {
  for (auto _ : state) {
    for (int i = 0; i < 10; i++) {
      // The allocation is intentionally never freed in this experiment.
      benchmark::DoNotOptimize((int*)malloc(10 * sizeof(int*)));
    }
  }
}
BENCHMARK(BM_memory)->Unit(benchmark::kMillisecond)->Iterations(17);

// BENCHMARK_MAIN();
int main(int argc, char** argv) {
  ::benchmark::RegisterMemoryManager(mm.get());
  ::benchmark::Initialize(&argc, argv);
  ::benchmark::RunSpecifiedBenchmarks();
  ::benchmark::RegisterMemoryManager(nullptr);
}
To compile it run:
g++ src/memory_benchmark.cpp -std=c++11 -O3 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread -DMEMORY_PROFILER -o bin/benchmark_memory.exe
Then you simply run:
./benchmark_memory.exe --benchmark_format=json
It is important to wrap the custom_malloc function inside a conditional inclusion (e.g., #ifdef), as collecting the memory statistics could worsen the performance results. Therefore, if you don't define MEMORY_PROFILER, you will use the default malloc with stable results.
If I ran the benchmark with a single iteration I got expected result:
However, with 17 or more iterations, max_bytes_used is always 12800.
Can anyone tell me if what I'm doing is right? And if yes I hope this code can help other fellow developers!
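One possible explanation for the constant 12800, sketched as arithmetic: this assumes the library measures memory in a separate pass that runs at most 16 iterations regardless of ->Iterations(N). That cap is an assumption drawn from reading the runner source, not something confirmed in this thread. The helper name and the 8-byte pointer size are also assumptions for illustration. Each iteration of BM_memory makes 10 allocations of 10 * sizeof(int*) bytes, so with 8-byte pointers a capped pass would add up 16 * 10 * 80 bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: the memory-measurement pass runs at most this many iterations.
constexpr std::int64_t kAssumedMemoryPassCap = 16;

// Hypothetical helper: total bytes the custom_malloc wrapper would add up
// during the memory pass, given the benchmark's per-iteration behaviour.
std::int64_t ExpectedTrackedBytes(std::int64_t iterations,
                                  std::int64_t allocs_per_iteration,
                                  std::int64_t bytes_per_alloc) {
  assert(iterations >= 0);
  const std::int64_t measured = std::min(iterations, kAssumedMemoryPassCap);
  return measured * allocs_per_iteration * bytes_per_alloc;
}
```

Under these assumptions, ExpectedTrackedBytes(17, 10, 80) and ExpectedTrackedBytes(1000, 10, 80) both give 12800, while a single iteration gives 800. If that is what is happening, the wrapper is working as written and the value simply stops growing once the cap is reached.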
apologies: closed as the above is resolved but the broader issue of reporting in other reporters isn't.
Can I look into this? This seems a little similar to the issue we just closed. @dmah42
yes of course.
I use custom counters [docs] instead of benchmark::MemoryManager [src, docs].
This enabled more detailed stats (using the m*map and sbrk hooks), and console tabular output without having to parse JSON:
./a.out --benchmark_counters_tabular=true
--------------------------------------------------------------------------------------------------
Benchmark               Time     CPU  Iterations  #new  avg_new_B  max_new_B  min_new_B  sum_new_B
--------------------------------------------------------------------------------------------------
BM_demo/32/128/32     118 ns  117 ns     5931457     3         64        128         32        192
BM_demo/320/640/960   117 ns  117 ns     5985820     3        640        960        320      1.92k
This brief example is using tcmalloc. Adding it here in the hope it is useful to someone. It does not support running multithreaded benchmarks.
#include <algorithm>  // std::max, std::min
#include <cstddef>    // std::size_t
#include <cstdlib>    // malloc, free
#include <gperftools/malloc_hook.h>  // link tcmalloc
#include "benchmark/benchmark.h"     // link benchmark

// Counters use std::size_t so that std::max/std::min see the same type as
// the hook's size argument on every platform.
std::size_t g_num_new = 0;
std::size_t g_sum_size_new = 0;
std::size_t g_max_size_new = 0;
std::size_t g_min_size_new = static_cast<std::size_t>(-1);  // sentinel: nothing seen yet

// Called by tcmalloc for every allocation while the hook is installed.
auto new_hook = [](const void*, size_t size) {
  ++g_num_new;
  g_sum_size_new += size;
  g_max_size_new = std::max(g_max_size_new, size);
  g_min_size_new = std::min(g_min_size_new, size);
};

// Snapshot the running totals, reset min/max, and install the hook.
#define BEFORE_TEST \
  std::size_t num_new = g_num_new; \
  std::size_t sum_size_new = g_sum_size_new; \
  g_max_size_new = 0; \
  g_min_size_new = static_cast<std::size_t>(-1); \
  MallocHook::AddNewHook(new_hook);

// Remove the hook and publish per-iteration statistics as custom counters.
// The average is skipped when nothing was allocated, to avoid dividing by zero.
#define AFTER_TEST \
  MallocHook::RemoveNewHook(new_hook); \
  auto iter = static_cast<std::size_t>(state.iterations()); \
  state.counters["#new"] = (g_num_new - num_new) / iter; \
  state.counters["sum_new_B"] = (g_sum_size_new - sum_size_new) / iter; \
  if (g_num_new != num_new) { \
    state.counters["avg_new_B"] = (g_sum_size_new - sum_size_new) / (g_num_new - num_new); \
  } \
  state.counters["max_new_B"] = g_max_size_new; \
  if (static_cast<std::size_t>(-1) != g_min_size_new) { \
    state.counters["min_new_B"] = g_min_size_new; \
  }

static void BM_demo(benchmark::State& state) {
  BEFORE_TEST
  for (auto _ : state) {
    void* ret1 = malloc(state.range(0));
    void* ret2 = malloc(state.range(1));
    void* ret3 = malloc(state.range(2));
    free(ret1);
    free(ret2);
    free(ret3);
  }
  AFTER_TEST
}
BENCHMARK(BM_demo)->Args({32, 128, 32});
BENCHMARK(BM_demo)->Args({320, 640, 960});
BENCHMARK_MAIN();
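The tcmalloc example above notes that it does not support multithreaded benchmarks. One possible first step in that direction, offered only as a sketch, is to keep the statistics in std::atomic counters so that concurrent hook invocations do not race. AllocStats and Record are hypothetical names, not part of benchmark or gperftools, and the before/after snapshot logic in the macros would still need separate care per thread.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical helper: allocation statistics that can be updated from
// several threads at once. Not part of benchmark or gperftools.
struct AllocStats {
  std::atomic<std::size_t> count{0};
  std::atomic<std::size_t> sum{0};
  std::atomic<std::size_t> max{0};
  std::atomic<std::size_t> min{static_cast<std::size_t>(-1)};  // sentinel: nothing seen yet

  // Record one allocation of `size` bytes.
  void Record(std::size_t size) {
    count.fetch_add(1, std::memory_order_relaxed);
    sum.fetch_add(size, std::memory_order_relaxed);

    // Raise max with a compare-and-swap loop; `seen` is refreshed on failure.
    std::size_t seen = max.load(std::memory_order_relaxed);
    while (size > seen &&
           !max.compare_exchange_weak(seen, size, std::memory_order_relaxed)) {
    }

    // Lower min the same way.
    seen = min.load(std::memory_order_relaxed);
    while (size < seen &&
           !min.compare_exchange_weak(seen, size, std::memory_order_relaxed)) {
    }
  }
};
```

A hook could then call Record(size) on a global AllocStats instead of updating four plain globals. Relaxed ordering is used because the counters are independent tallies; whether that is sufficient depends on how and when the values are read back.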
this is great, thank you so much. the memory management API is in place largely for backwards compatibility with an older version of the library but this is a really neat way to bring more complete stats into the output.
thanks for sharing!
I am currently using google/benchmark to benchmark https://github.com/protocolbuffers/upb. I would very much like for those benchmarks to also include memory usage.
Does this project support memory benchmarking? The user guide does not mention memory usage benchmarking at all. However, I notice that the header has a RegisterMemoryManager() function, and there is a unit test verifying this functionality.
From running the test, it appears that the memory usage data is surfaced to JSON, but not CSV or console reports. Is this an oversight or intentional? Ideally this data would be shown on CSV and console reports also.