sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Profiling encoding & decoding #35

Open sunchao opened 6 years ago

sunchao commented 6 years ago

It may be useful to profile encoding & decoding. We can use the existing benchmarks for this. Some useful scripts:

For CPU:

perf record -g -F 1000 <bench-name> --bench && perf script | stackcollapse-perf | flamegraph > flamegraph.svg

The stackcollapse-perf and flamegraph scripts come from the FlameGraph project (stackcollapse-perf.pl and flamegraph.pl).

For cache performance (from this thread):

valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes --simulate-cache=yes <bench-name>

I also found that we may need to add the following section in Cargo.toml:

[profile.bench]
debug = true
opt-level = 1

If opt-level is too high, rustc will inline most of the functions, which makes it hard to see which function is the bottleneck. Not sure if there's a better way to do this.
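One possible alternative to lowering opt-level, offered only as a sketch: mark the functions of interest with `#[inline(never)]`, so they should keep their own frames in the perf output even under heavier optimization. The function `decode_plain_i32` below is a hypothetical stand-in for a decoder hot loop, not an actual parquet-rs function. The attribute is a hint to rustc, so it's worth checking the resulting flamegraph.

```rust
/// Hypothetical stand-in for a decoder hot loop (NOT a parquet-rs API).
/// `#[inline(never)]` asks rustc not to inline this function, so it should
/// show up as a separate frame in perf/callgrind output at higher opt-levels.
#[inline(never)]
pub fn decode_plain_i32(bytes: &[u8]) -> Vec<i32> {
    bytes
        // Walk the buffer 4 bytes at a time; a trailing partial chunk is ignored.
        .chunks_exact(4)
        // Interpret each 4-byte chunk as a little-endian i32.
        .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Builds a small little-endian input buffer to feed the function above,
/// e.g. from a benchmark body.
pub fn sample_input(n: i32) -> Vec<u8> {
    (0..n).flat_map(|v| v.to_le_bytes()).collect()
}
```

Calling `decode_plain_i32(&sample_input(1024))` from a bench would exercise the annotated function. With `debug = true` still set in `[profile.bench]`, this might let the default opt-level be kept while the annotated functions remain distinguishable, at the cost of slightly distorting the numbers for those functions.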