brendangregg / FlameGraph

Stack trace visualizer
http://www.brendangregg.com/flamegraphs.html

Reduce memory usage #58

Open vincele opened 9 years ago

vincele commented 9 years ago

I'm trying to use FG for the first time, and it gets OOM-killed... I've got 4GB of RAM + 2GB of swap, and this is apparently not sufficient :-/

$ dmesg
[...]
[25783.912839] Out of memory: Kill process 26662 (stackcollapse-p) score 788 or sacrifice child
[25783.912844] Killed process 26072 (stackcollapse-p) total-vm:4768696kB, anon-rss:3372340kB, file-rss:0kB
$ free
              total        used        free      shared  buff/cache   available
Mem:        4035936      634612     3239928       39348      161396     3312444
Swap:       2000088      672080     1328008
$ ls -lh perf.data
-rw------- 1 vince vince 107M Jun 27 13:27 perf.data

Is that really too much perf data to process, or is something leaking memory in FG?

vincele commented 9 years ago

I managed to get past the OOM kill by reducing the amount of data: filtering on the only process that interests me...

perf script -c ${PROCESS_NAME} | FlameGraph/stackcollapse-perf.pl > out.perf-folded

But still, processing 100MB of traces into 300MB of folded data should consume less than 4GB, if possible...

brendangregg commented 8 years ago

Yes, stackcollapse-perf.pl is inefficient. Fortunately, that step should disappear in the future, as Linux perf will be able to emit the folded output directly, with a much smaller memory footprint. This is currently being discussed on LKML: "[PATCHSET 0/4] perf report: Support folded callchain output (v4)"
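
For readers unfamiliar with the folding step discussed in this thread, here is a minimal Python sketch of the idea. It is not how stackcollapse-perf.pl is implemented, and it is not the proposed perf feature. It assumes the default `perf script` text layout (a header line per sample, indented frame lines printed leaf first, and a blank line between samples); the `HEADER` and `FRAME` patterns and the `fold` function are hypothetical names made up for this illustration. The sketch reads one line at a time and keeps only one counter per distinct stack, so its memory use grows with the number of unique stacks rather than with the size of the input.

```python
import re
from collections import Counter

# Sample header, e.g. "myapp 1234 [000] 25783.9: cycles:".
# Assumed layout; real headers vary with perf version and -F options.
HEADER = re.compile(r"^(\S.*?)\s+\d+(?:/\d+)?\s")
# Stack frame, e.g. "    7f3a1c2b4d10 do_work+0x10 (/usr/bin/myapp)".
FRAME = re.compile(r"^\s+[0-9a-fA-F]+\s+(.+?)\s+\(.*\)$")
# Trailing "+0x..." offset appended to a symbol name.
OFFSET = re.compile(r"\+0x[0-9a-fA-F]+$")


def fold(lines):
    """Count identical stacks from an iterable of perf-script-like lines."""
    counts = Counter()
    comm, frames = None, []

    def flush():
        # Emit the pending sample as "comm;root;...;leaf" and reset state.
        nonlocal comm, frames
        if comm is not None and frames:
            counts[";".join([comm] + frames[::-1])] += 1
        comm, frames = None, []

    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            flush()  # a blank line ends the current sample
            continue
        frame = FRAME.match(line)
        if frame:
            if comm is not None:
                frames.append(OFFSET.sub("", frame.group(1)))
            continue
        header = HEADER.match(line)
        if header:
            flush()  # tolerate a missing blank line between samples
            comm = header.group(1)
    flush()  # the last sample may not be followed by a blank line
    return counts
```

To try it on real data, one could pipe `perf script` into a small driver that calls `fold(sys.stdin)` and prints each key followed by a space and its count, which is the folded format that flamegraph.pl reads.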