tmm1 / stackprof

a sampling call-stack profiler for ruby 2.2+
MIT License

Shrink flamegraph #97

Open der-flo opened 6 years ago

der-flo commented 6 years ago

I'm using Stackprof as part of the test-prof stack to examine our feature tests. Sampling the raw data for one RSpec example results in a 5 MB dump file, and creating the flamegraph file takes about 20 minutes. So far so good. But the flamegraph viewer then shows an enormous graph that I can't realistically examine.

Is it possible to reduce the entries? For example, I'm not really interested in samples from ActionView, ActiveSupport, Bundler, and so on. RubyProf offers some filtering mechanisms which are quite handy.

jlfwong commented 6 years ago

(Shameless self-plug) You should be able to use speedscope to examine stackprof profiles in an interactive way that should be more pleasant for massive profiles (https://github.com/jlfwong/speedscope#stackprof-ruby-profiler)

der-flo commented 6 years ago

@jlfwong I did a quick test. Selecting my 18 MB file gives me "Failed to load format. See console for details". The JS console gives me a quite interesting undefined in speedscope.86b357fe.js:144 😉.

@all: More solutions/ideas are very welcome.

jlfwong commented 6 years ago

@der-flo Interesting! Is it a file you'd be willing to share, or do you have a way of reproducing something that fails similarly? It should be able to handle 18 MB files without too much trouble.

Here's an example ruby file that I used to generate a test case that can import successfully into speedscope: https://github.com/jlfwong/speedscope/blob/master/sample/programs/ruby/simple.rb

der-flo commented 6 years ago

@jlfwong Um, my file contains quite sensitive information, so I can't hand it out easily. As for your Ruby example: to be honest, that's a hello world script. I'm profiling a Rails app, which is an enormous difference. Your test obviously exports JSON, while test-prof exports a binary dump file for Stackprof profiles.

jlfwong commented 6 years ago

@der-flo You're absolutely right! Based on a read of the test-prof sources, it seems like the file speedscope is failing to import is Marshal encoded. Aside from that though, I suspect it will import just fine.

To convert the Marshal-encoded dump to JSON for import, this might work:

ruby -e "require 'json'; puts JSON.generate(Marshal.load(File.binread('/path/to/your/stackprof.dump')))" > profile-for-speedscope.json

Then drop the resulting profile-for-speedscope.json into speedscope.
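
The same conversion as a small script, in case the one-liner is awkward to adapt. This is only a sketch: `stackprof_dump_to_json` is a hypothetical helper name, and the check for a `:raw` key rests on the assumption that flamegraph-style viewers need the raw samples. Marshal.load should only be run on files you trust.

```ruby
require 'json'

# Hypothetical helper (not part of stackprof or speedscope).
# Takes the bytes of a Marshal-encoded stackprof dump and returns JSON text.
def stackprof_dump_to_json(bytes)
  # Marshal.load can instantiate arbitrary objects; only use it on trusted dumps.
  profile = Marshal.load(bytes)

  # Assumption: flamegraph-style viewers need the raw samples.
  unless profile.is_a?(Hash) && profile.key?(:raw)
    raise ArgumentError, 'dump has no :raw samples; was it recorded in raw mode?'
  end

  # Symbol and Integer hash keys become strings in the JSON output.
  JSON.generate(profile)
end

# Usage (path is a placeholder):
#   json = stackprof_dump_to_json(File.binread('/path/to/your/stackprof.dump'))
#   File.write('profile-for-speedscope.json', json)
```

Failing early on a missing `:raw` key gives a clearer error than a bare "undefined" in the browser console.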

That's obviously not a great workflow. To support this better, I could either contribute a change to stackprof to make this integration tighter, or automatically convert Marshal data to JSON in speedscope.

der-flo commented 6 years ago

@jlfwong I converted two of the dump files and tried them:

Importing as stackprof profile
undefined    speedscope.57cb68ca.js:150

jlfwong commented 6 years ago

Ah, sorry. I'm not sure what's going on now.

Just to make sure, when running test-prof, are you using TEST_STACK_PROF=raw? (from https://test-prof.evilmartians.io/#/stack_prof)

I'm not sure what's wrong, and I'm not sure how to proceed at this point without having a file that fails to import. I'd really love to get this working since the whole point of speedscope is to be able to interactively examine really large flamegraphs. Thanks for sticking with me this far -- if you have other ideas of things to try to reproduce what you're running into, I'd love to fix this.

der-flo commented 6 years ago

@jlfwong My mistake! I tried to analyse non-raw dumps. With raw dumps it now works. The UI is indeed quite nice!

My core problem unfortunately remains. I have an enormous number of stack frames that I'm not interested in. They dominate the view and prevent me from identifying hot spots and problems.

@all RubyProf offers "method exclusion" while profiling and "method elimination" after profiling. With these tools I can get rid of all the ActiveSupport, Bundler, Rack, ActionView, ... stuff that a typical Rails app pollutes the graph with.

Does anybody see a possibility to do such a reduction with Stackprof, speedscope, or an intermediate tool?

NickLaMuro commented 6 years ago

> RubyProf offers "method exclusion" while profiling and "method elimination" after profiling.

@der-flo Not part of the core team on this, but stackprof doesn't currently support this when generating flamegraphs. That said, there are options in place for '"method elimination" after profiling' when viewing via plain text:

https://github.com/tmm1/stackprof/blob/1af222023e21b546b43df6f465672c53ad6504dd/bin/stackprof#L26-L29

Again, that's not part of the flamegraph generation, and what is done for the plain text output doesn't translate to it, since flamegraph generation uses the :raw part of the profile data and not the frame data (which is what those options filter).

Those differ quite a bit, so off the top of my head I couldn't tell you if filtering the raw data would even make sense. I would have to look into how :raw is implemented.
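
For illustration, filtering the raw data in post might look roughly like the sketch below. It rests on assumptions that should be checked against the stackprof sources: that `profile[:raw]` is a flat array of records of the form stack length, then that many frame ids, then a sample count, and that `profile[:frames]` maps each frame id to a hash with a `:name`. `eliminate_frames` is a hypothetical name, not a stackprof API.

```ruby
# Hypothetical helper, not part of stackprof.
# Assumed layout of profile[:raw] (verify against your stackprof version):
#   [stack_length, frame_id, ..., sample_count, stack_length, frame_id, ..., sample_count, ...]
# Assumed layout of profile[:frames]: { frame_id => { name: "Foo#bar", ... } }
def eliminate_frames(profile, exclude)
  frames = profile[:frames]
  raw = profile[:raw]
  filtered = []
  i = 0

  while i < raw.length
    len = raw[i]                 # number of frame ids in this record
    stack = raw[i + 1, len]      # the frame ids themselves
    count = raw[i + 1 + len]     # how many samples share this stack

    # Drop every frame whose name matches the exclusion pattern.
    kept = stack.reject { |id| exclude.match?(frames[id][:name].to_s) }

    # Records left with no frames are dropped entirely. Any per-sample side
    # data stored elsewhere in the profile is NOT adjusted by this sketch.
    filtered.push(kept.length, *kept, count) unless kept.empty?

    i += len + 2
  end

  # Return a copy with the filtered samples; the input hash is not mutated.
  profile.merge(raw: filtered)
end

# Usage (pattern is only an example):
#   slim = eliminate_frames(Marshal.load(File.binread(path)), /\A(?:ActiveSupport|ActionView|Rack)\b/)
```

Removing frames from the middle of a stack makes callers appear to call their grandchildren directly, so the resulting graph is easier to read but no longer a literal call tree.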


@der-flo Alternatively, you might get better use out of the perl version of the "flamegraph", versus the web UI one. You can run it with:

$ stackprof --stackcollapse [DATA.stackprof] > data.stackcollapse
$ stackprof-flamegraph.pl data.stackcollapse > flamegraph.svg

You can also just use the original script from Brendan Gregg's FlameGraph repo, since the one packaged here is very old. Just use the flamegraph.pl file from that repo (it's a single perl file).
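
Since the collapsed output is plain text, one option is to drop unwanted frames between the two commands. A minimal sketch, assuming each line has the usual folded form of semicolon-separated frames followed by a space and a sample count; `filter_folded` is a hypothetical helper, not part of stackprof or flamegraph.pl.

```ruby
# Hypothetical helper, not part of stackprof or flamegraph.pl.
# Assumes folded-stack lines of the form "frame1;frame2;frame3 COUNT".
def filter_folded(lines, exclude)
  totals = Hash.new(0)

  lines.each do |line|
    # Frame names may contain spaces ("block in Foo#bar"), so split on the LAST space.
    stack, _sep, weight = line.chomp.rpartition(' ')
    next if stack.empty?

    kept = stack.split(';').reject { |frame| exclude.match?(frame) }
    next if kept.empty?

    # Stacks that become identical after filtering are merged and their counts summed.
    totals[kept.join(';')] += weight.to_i
  end

  totals.map { |stack, weight| "#{stack} #{weight}" }
end

# Usage (file names and pattern are placeholders):
#   exclude = /\b(?:ActiveSupport|ActionView|Bundler|Rack)\b/
#   File.write('filtered.stackcollapse',
#              filter_folded(File.foreach('data.stackcollapse'), exclude).join("\n") + "\n")
```

The filtered file can then be passed to flamegraph.pl in place of data.stackcollapse.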

I don't want to make too much of a :book: to explain the difference, but the perl version that is packaged with stackprof doesn't display information chronologically. It groups it by stack location instead, which might be more useful when the hot spots are hit non-sequentially. This is "technically" a flamegraph, versus a "flamechart" (as I think Brendan Gregg describes it), which is mostly what speedscope and the current --flamegraph-viewer are.
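
A toy model of that difference, not how either tool is actually implemented: a flamechart keeps samples in time order and merges only neighbouring identical stacks, while a flamegraph merges all identical stacks regardless of when they occurred. Both function names below are made up for illustration.

```ruby
# Toy model only; the names are hypothetical and neither tool works exactly like this.
# `samples` is a list of root-first stacks in the order they were taken.

# Flamechart-style: preserve time order, merge only consecutive identical stacks.
def flame_chart_rows(samples)
  samples.chunk_while { |a, b| a == b }.map { |run| [run.first, run.length] }
end

# Flamegraph-style: merge all identical stacks, then order by stack, not by time.
def flame_graph_rows(samples)
  counts = Hash.new(0)
  samples.each { |stack| counts[stack] += 1 }
  counts.sort_by { |stack, _count| stack }
end

samples = [%w[main a], %w[main b], %w[main a]]
p flame_chart_rows(samples) # three rows: "main a" appears twice, separated by "main b"
p flame_graph_rows(samples) # two rows: both "main a" samples are merged
```

With hot spots that are hit repeatedly but not back to back, the second form collapses them into one wide bar, which is why it can be easier to read.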

Brendan has a good presentation on the subject over on his website, which does a better job of explaining it than I can if you are interested.

Hope this helps.

der-flo commented 6 years ago

@NickLaMuro Thanks for the great amount of information. I'll have a look into it asap.

der-flo commented 6 years ago

@NickLaMuro I tested your suggestion using the original flamegraph script. The output is a bit more readable. But the core issue remains. Most frames are not interesting and pollute the UI.

NickLaMuro commented 6 years ago

@der-flo Sorry to hear that it wasn't much more help... 😞

I might take a look and see what would be involved with method elimination in post for flamegraphs, as I think that does seem like the preferred way of handling it on this project (adding more processing while taking the samples seems like it is against the original goals of the project). No promises though.