brendangregg / FlameGraph

Stack trace visualizer
http://www.brendangregg.com/flamegraphs.html

The delta percentage numbers in the differential flame graphs look confusing. #44

Open agentzh opened 9 years ago

agentzh commented 9 years ago

Hi Brendan,

I'm very excited about the new differential flame graphs, especially the negated ones for assessing optimization results. Thank you for doing it! But I find the delta percentage numbers rather confusing; they are not what I'd expect.

For example, if a specific tower in the "before" flame graph disappears completely in the "after" graph, I'd expect the function frames in the negated graph to show deltas like "-100%", but I'm getting ridiculously small numbers like "-0.8%" because of the way this delta percentage is calculated (which IMHO does not make much sense).

Please consider the following minimal example:

main;foo 1 2
main;foo;bar 0 4

The resulting negated diff flame graph is shown here: http://agentzh.org/misc/flamegraph/tiny-negated-diff.svg

For these diff folded backtraces, the total reduction in foo()'s time should be (1-(4+2))/(4+2) = -0.83 (or -83%), but in the SVG, the "foo" frame shows -16.67%. On the other hand, the bar function should show (0-4)/4 = -1 (or -100%), but the graph actually shows -66.67% at the bar frame. These numbers are quite far from my intuition.
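The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the flamegraph.pl code: the column order ("after" first, "before" second) is an assumption inferred from the formulas in this comment, and the helper names `parse` and `inclusive` are hypothetical. The first reading (a frame's own delta divided by the total "before" samples) happens to reproduce the numbers shown in the SVG; the second is the per-frame figure I would expect.

```python
from collections import defaultdict

# Diff-folded lines from the example. Column order "stack after before"
# is an assumption inferred from the arithmetic in the comment.
FOLDED = """\
main;foo 1 2
main;foo;bar 0 4
"""

def parse(text):
    """Return {stack tuple: (after, before)} for each folded line."""
    rows = {}
    for line in text.splitlines():
        stack, after, before = line.rsplit(" ", 2)
        rows[tuple(stack.split(";"))] = (int(after), int(before))
    return rows

def inclusive(rows):
    """Sum counts over every stack prefix, so a frame includes its children."""
    totals = defaultdict(lambda: [0, 0])
    for stack, (after, before) in rows.items():
        for depth in range(1, len(stack) + 1):
            totals[stack[:depth]][0] += after
            totals[stack[:depth]][1] += before
    return totals

rows = parse(FOLDED)
total_before = sum(before for _, before in rows.values())

# Reading 1: each line's own delta over the whole "before" sample space.
whole_space = {s: 100.0 * (a - b) / total_before for s, (a, b) in rows.items()}

# Reading 2: inclusive delta relative to the frame's own "before" count.
per_frame = {s: 100.0 * (a - b) / b for s, (a, b) in inclusive(rows).items() if b}

for stack in rows:
    print(";".join(stack), round(whole_space[stack], 2), round(per_frame[stack], 2))
```

Under these assumptions the loop reports -16.67 and -83.33 for main;foo, and -66.67 and -100.0 for main;foo;bar.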

I understand that the current calculation gives the delta as a percentage of the whole sample space, but for optimizations I only care about how radically a particular tower changes, even if its share of the total is small. And even under the whole-sample-space approach, the current results still do not look right ;)

What's your opinion on this?

agentzh commented 9 years ago

Hmm, sorry, it seems that the most intuitive calculation should be more complicated than what I previously proposed. For that tiny example, maybe we should also "scale" the "after" result because it has way fewer samples in total and direct comparison between the original sample counts is like comparing oranges to apples :)

I think we can first scale

main;foo 1 2
main;foo;bar 0 4

into

main;foo 6 2
main;foo;bar 0 4

such that the "before" and "after" results now have an equal number of total samples, and then perform the calculations:

foo: (6+0-(2+4))/(2+4) == 0 (0%)
bar: ((0-4)/4) == -1 (-100%)

These numbers look much more reasonable to me :)
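A minimal Python sketch of this scale-then-compare idea follows. It is a proposal illustration only, not existing FlameGraph behavior: `scale_after` and `inclusive_delta` are hypothetical names, and the column order ("after" first, "before" second) is assumed from the formulas above.

```python
def scale_after(rows):
    """Scale the 'after' counts so both columns have the same total.

    rows maps a folded stack string to (after, before) counts.
    """
    total_after = sum(a for a, _ in rows.values())
    total_before = sum(b for _, b in rows.values())
    if total_after == 0:
        raise ValueError("cannot scale: no 'after' samples")
    factor = total_before / total_after
    return {stack: (a * factor, b) for stack, (a, b) in rows.items()}

def inclusive_delta(rows, frame_path):
    """Relative change of a frame, counting every stack under frame_path."""
    prefix = frame_path + ";"
    after = before = 0
    for stack, (a, b) in rows.items():
        if stack == frame_path or stack.startswith(prefix):
            after += a
            before += b
    return (after - before) / before

# The tiny example: (after, before) per folded stack.
rows = {"main;foo": (1, 2), "main;foo;bar": (0, 4)}
scaled = scale_after(rows)

print(scaled)
print(inclusive_delta(scaled, "main;foo"))      # foo, children included
print(inclusive_delta(scaled, "main;foo;bar"))  # bar
```

With the example input, the scaled counts become 6 and 0 in the "after" column, giving 0.0 for foo and -1.0 for bar, which matches the hand calculation.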

feng-tao commented 8 years ago

Is there any update on @agentzh's request? His interpretation sounds more reasonable to me.