m4drat / memplusplus

C++ memory allocator with smart GC

Performance comparisons (mimalloc, malloc, jemalloc) + cpu/memory Profiling #29

m4drat closed this issue 2 years ago

m4drat commented 4 years ago

Specifically, measure cache misses using valgrind.

Implement a performance comparison with these memory allocators (as named in the issue title): mimalloc, malloc, jemalloc.

Useful links:

  1. rpmalloc-benchmark
  2. how-can-i-profile-c-code-running-on-linux
  3. easy_profiler
  4. prof
  5. VISUAL BENCHMARKING in C++ (how to measure performance visually)
  6. Google benchmark
  7. "Performance Matters" by Emery Berger
  8. Coz: Finding Code that Counts with Causal Profiling
  9. P2329-move_at_scale
  10. CppCon 2015: Chandler Carruth "Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My!"
  11. Intel v-tune
  12. github-gprof2dot
  13. How to benchmark correctly: llvm

Don't forget to use clang stabilizer!

Add a PROFILING definition and a matching build option.
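One way a PROFILING definition could be used, as a rough sketch rather than the project's actual design: timing code is compiled in only when the build passes `-DPROFILING`, and expands to nothing otherwise. The names `ScopedTimer`, `MPP_PROFILE_SCOPE`, and `churn` are hypothetical and exist only for this illustration; `malloc`/`free` stand in for whichever allocator is being profiled.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: measures the lifetime of a scope and reports it to stderr.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* label)
        : label_(label), start_(std::chrono::steady_clock::now()) {
        assert(label != nullptr);
    }

    ~ScopedTimer() {
        std::fprintf(stderr, "[profile] %s: %lld us\n", label_, elapsed_us());
    }

    // Microseconds elapsed since construction.
    long long elapsed_us() const {
        auto delta = std::chrono::steady_clock::now() - start_;
        return static_cast<long long>(
            std::chrono::duration_cast<std::chrono::microseconds>(delta).count());
    }

private:
    const char* label_;
    std::chrono::steady_clock::time_point start_;
};

// With -DPROFILING the macro creates a timer; without it the macro is a no-op.
#ifdef PROFILING
#define MPP_CONCAT_INNER(a, b) a##b
#define MPP_CONCAT(a, b) MPP_CONCAT_INNER(a, b)
#define MPP_PROFILE_SCOPE(label) ScopedTimer MPP_CONCAT(mpp_timer_, __LINE__)(label)
#else
#define MPP_PROFILE_SCOPE(label) ((void)0)
#endif

// Stand-in workload: allocate and free small blocks, return how many succeeded.
std::size_t churn(std::size_t rounds) {
    MPP_PROFILE_SCOPE("churn");
    std::size_t succeeded = 0;
    for (std::size_t i = 0; i < rounds; ++i) {
        void* p = std::malloc(64);
        if (p != nullptr) {
            ++succeeded;
            std::free(p);
        }
    }
    return succeeded;
}
```

Building with `g++ -DPROFILING ...` enables the report; the build option mentioned above would only need to add that definition.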

#!/bin/bash

# build the program (no special flags are needed)
g++ -std=c++11 cpuload.cpp -o cpuload

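# run the program with callgrind; --callgrind-out-file names the output profile.callgrind
# (the default name would be callgrind.out.<pid>), which can be viewed with kcachegrind
valgrind --tool=callgrind --callgrind-out-file=profile.callgrind ./cpuload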

# open profile.callgrind with kcachegrind
kcachegrind profile.callgrind

#!/bin/bash

# build the program; for our demo program, -DWITHGPERFTOOLS enables the gperftools-specific #ifdefs
# (-lprofiler is placed after the source file so the linker can resolve its symbols)
g++ -std=c++11 -DWITHGPERFTOOLS -g ../cpuload.cpp -o cpuload -lprofiler

# run the program; generates the profiling data file (profile.log in our example)
./cpuload

# convert profile.log to callgrind compatible format
pprof --callgrind ./cpuload profile.log > profile.callgrind

# open profile.callgrind with kcachegrind
kcachegrind profile.callgrind

m4drat commented 2 years ago

What we should test:

  1. A project that makes heavy use of data structures and allows plugging in a custom allocator interface
  2. Allocation speed, deallocation speed, and GC speed (if a GC is present)
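The second item could be approached with a small timing harness along these lines. This is a minimal sketch under stated assumptions, not a rigorous benchmark: `AllocatorApi`, `BenchResult`, `bench_alloc_free`, and `system_malloc_api` are hypothetical names invented here, and plain `malloc`/`free` stand in for any allocator under test. A GC pass could be timed with the same clock around whatever call triggers it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical allocator interface: any allocator exposing these two calls fits.
struct AllocatorApi {
    const char* name;
    void* (*allocate)(std::size_t);
    void (*deallocate)(void*);
};

struct BenchResult {
    double alloc_ns_per_op;  // average time of one allocation, in nanoseconds
    double free_ns_per_op;   // average time of one deallocation, in nanoseconds
    std::size_t failed;      // allocations that returned nullptr
};

// Allocates `count` blocks of `size` bytes, then frees them, timing each phase.
BenchResult bench_alloc_free(const AllocatorApi& api, std::size_t count, std::size_t size) {
    using clock = std::chrono::steady_clock;
    using ns = std::chrono::duration<double, std::nano>;

    BenchResult result{0.0, 0.0, 0};
    if (count == 0 || size == 0) {
        return result;
    }
    std::vector<void*> blocks(count, nullptr);

    auto t0 = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        blocks[i] = api.allocate(size);
    }
    auto t1 = clock::now();

    // Touch every block (outside the timed regions) so the memory is really used.
    for (void* p : blocks) {
        if (p == nullptr) {
            ++result.failed;
        } else {
            *static_cast<volatile char*>(p) = 1;
        }
    }

    auto t2 = clock::now();
    for (void* p : blocks) {
        if (p != nullptr) {
            api.deallocate(p);
        }
    }
    auto t3 = clock::now();

    result.alloc_ns_per_op = ns(t1 - t0).count() / static_cast<double>(count);
    result.free_ns_per_op = ns(t3 - t2).count() / static_cast<double>(count);
    return result;
}

// Baseline: the system allocator, wrapped in non-capturing lambdas.
inline AllocatorApi system_malloc_api() {
    return AllocatorApi{
        "malloc",
        [](std::size_t n) -> void* { return std::malloc(n); },
        [](void* p) { std::free(p); }};
}
```

A single run like `bench_alloc_free(system_malloc_api(), 100000, 64)` is noisy; repeating it across several block sizes and comparing medians, or handing the loop to a framework such as Google Benchmark from the links above, would give steadier numbers.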