jlfwong / speedscope

🔬 A fast, interactive web-based viewer for performance profiles.
https://www.speedscope.app
MIT License
5.52k stars 241 forks source link

Aggregation in default "time-order" view #195

Closed aparamon closed 5 years ago

aparamon commented 5 years ago

Currently, the default "time-order" view performs aggregation differently compared to "left-heavy" view: resulting "time-order" display is similar to flame charts as opposed to original flame graphs. It would be useful to have an additional view where aggregation is performed the same as for "left-heavy", just sub-items on a plateau are sorted in the order of first occurrence.

jlfwong commented 5 years ago

Hi @aparamon,

I'm not really sure what you're looking for. Can you provide an example?

What I think you're asking for is something like left-heavy, but instead of sorting the heaviest to the left, sort the first-occurring to the left. Is that right?

If that is right, can you explain your use case where that would be really handy when neither the time order or left heavy views are meeting your needs?

aparamon commented 5 years ago

@jlfwong

What I think you're asking for is something like left-heavy, but instead of sorting the heaviest to the left, sort the first-occurring to the left. Is that right?

Correct. Sorted this way, the sequence of items resembles the flow of the program more closely, usually. For example, if I invoke "put" then "get" inside a subroutine, I expect to "put" to come before "get" in the flame graph even if takes less time to execute.

When trying to create a demo example, I discovered that FlameGraph's stackcollapse*.pl scripts seem to already ruin the order of recorded stacks. I'll try to investigate this in more details in the following weeks.

Thank you for your patience!

jlfwong commented 5 years ago

When I look at profiles, if I'm interested in understanding control flow, I look at the time ordered view. If I'm interested in understanding where the majority of the time is going over many calls, I look at left heavy view. In what circumstances would you want to look at an only partially time-ordered view with aggregation?

aparamon commented 5 years ago

@jlfwong When the same code path is executed multiple times in a loop. One example is intensive calculation algorithm, and another is server processing many similar requests.

Aggregation is crucial in both cases, because a single iteration/request is likely to be very short/non-representative. Aggregation method preserving the recording order of stack items has the benefit of keeping the time order inside an iteration. The method seems easy to implement, given the recording order survived stackcollapse*.pl step.

jlfwong commented 5 years ago

Gotcha, thanks for explaining the use case!

It indeed wouldn't be hard to implement, but I'm not convinced that the added utility is worth adding UI or hidden shortcuts to support the use case. When debugging performance scenarios like you describe, I would first look at the left-heavy view to see where the time is going, then look at the time-ordered view to understand the control flow. It's a bit more work to do the cross referencing, but I also don't find myself needing to understand both order and weight simultaneously very frequently. Once I identify what's slow, I typically just go read the code to see what the ordering is.

Even if we preserved the time-ordering in aggregation, there would still be cases where it would be misleading. I suspect you've already thought of these, but for the sake of discussion, let's talk about a few common cases where this might happen:

Let's say you have the following code:

for (let i = 0; i < 100000; i++) {
  if (I > 0) alpha();
  beta();
  gamma();
}

In this situation, assuming perfect sampling, the time aggregated view would display beta, then gamma, then alpha. even though in the vast majority of loops, beta and gamma follow alpha.

Also, if you have the same function called multiple times, it will be misleading too:

for (let i = 0; i < 100000; i++) {
  alpha();
  beta();
  alpha();
}

So given that it won't always be accurate in its aggregations, I think the number of scenarios in which I'd find this useful is pretty small.

If you're interested in doing this in your own fork, then please feel free! The relevant code, I believe, would be here: https://github.com/jlfwong/speedscope/blob/master/src/lib/profile.ts#L184. It might even do what you want it to if you just delete that line.

aparamon commented 5 years ago

@jlfwong Thank you for your support! Indeed, I can imagine a lot of cases where this "natural" order will be misleading. How big this problem is in practice is yet to be seen. Thank you for the pointer to the relevant code line! I'll do some experiments and report the results in 2019. Happy New Year ;-)

jlfwong commented 5 years ago

@aparamon If you do stumble upon use cases that this reveals really helpful insights that weren't otherwise evident please feel free to post here!

I'm going to close this for now since I don't intend to take action based on the information we have so far. I'll reopen if some new evidence suggests otherwise 🙂

aparamon commented 5 years ago

@jlfwong Please take a look: https://github.com/benfred/py-spy/issues/84

jlfwong commented 5 years ago

@aparamon Sweet! Does this mean there's now an import path for importing python profiles into speedscope? Do you have repro steps for me to try this out?

aparamon commented 5 years ago

Yes, with https://github.com/benfred/py-spy/pull/74 and --stacks=file.txt command-line argument. Hopefully my PR gets merged soon!