jlfwong / speedscope

🔬 A fast, interactive web-based viewer for performance profiles.
https://www.speedscope.app
MIT License

File format: unit=bytes makes startValue/endValue meaningless #406

Open vasi-stripe opened 2 years ago

vasi-stripe commented 2 years ago

When unit=milliseconds, it's clear that it applies to both timestamps (like startValue) and to sample weights.

But when unit=bytes, there's no way to know what startValue/endValue are supposed to mean.

We should consider separate timeUnit and weightUnit fields.
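As a rough sketch of what that split might look like (everything here is hypothetical: `timeUnit`, `weightUnit`, `SplitUnitProfile`, and `durationMs` are names made up for illustration and are not part of the current file format), the profile could carry one unit for `startValue`/`endValue` and another for the sample weights:

```typescript
// Hypothetical shape only: the current format has a single `unit` field.
type TimeUnit = 'nanoseconds' | 'microseconds' | 'milliseconds' | 'seconds'
type WeightUnit = 'none' | 'bytes' | TimeUnit

interface SplitUnitProfile {
  type: 'sampled'
  name: string
  timeUnit: TimeUnit     // applies to startValue / endValue
  weightUnit: WeightUnit // applies to weights
  startValue: number
  endValue: number
  samples: number[][]    // each sample is a stack of frame indices
  weights: number[]      // one weight per sample
}

// Conversion factors from each time unit to milliseconds.
const MS_PER_UNIT: Record<TimeUnit, number> = {
  nanoseconds: 1e-6,
  microseconds: 1e-3,
  milliseconds: 1,
  seconds: 1000,
}

// With a dedicated time unit, the profile's duration is well defined
// even when the weights are measured in bytes.
function durationMs(p: SplitUnitProfile): number {
  return (p.endValue - p.startValue) * MS_PER_UNIT[p.timeUnit]
}

const heapProfile: SplitUnitProfile = {
  type: 'sampled',
  name: 'heap (illustrative data)',
  timeUnit: 'seconds',
  weightUnit: 'bytes',
  startValue: 10,
  endValue: 12.5,
  samples: [[0], [0, 1]],
  weights: [4096, 1024],
}

console.log(durationMs(heapProfile))
```

With `unit=bytes` alone, there would be no way to compute that duration from `startValue` and `endValue`.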

jlfwong commented 2 years ago

Hi @vasi-stripe!

I agree -- the startValue and endValue fields are weird to interpret in the case of unit=bytes. However, in that case, even if there were separate timeUnit and weightUnit fields, the startValue/endValue would still be meaningless (or at least of no value for display in speedscope as far as I can tell).

Am I understanding you correctly, or are you envisioning situations where you'd specify both timeUnit and weightUnit and they'd both be relevant for correctly interpreting a profile?

vasi-stripe commented 2 years ago

Good questions! For display in Speedscope as it currently exists, I agree there's not a real benefit to having meaningful startValue/endValue.

But for use of the Speedscope format as a generalized interchange format, and/or for future extensions to Speedscope, it makes sense to think about time and weight differently. A few things one could do with this data:

  1. I'm currently profiling our services continuously, and tools like Pyroscope let me select a profile from a timeline. That only really works if profiles have timestamps!

  2. Some tools allow aggregating/post-processing profiles to answer questions like "is method M doing more allocations now than it used to?" To do that, the tool would need a startTime to identify when the profile happened, and an endTime to be able to properly weight allocations-per-time-spent-profiling.

  3. For many types of profile, wall-time and weight are not one-to-one. For example, if we're taking a CPU profile, sometimes we're using a CPU, and other times we're just waiting on I/O. The Speedscope format doesn't really allow representing this well, and currently Speedscope would visualize this by just ignoring the non-CPU time. But you could imagine displaying weights over a time axis instead, what Brendan Gregg calls a "Flame chart", and that's only possible if we have time units separately from weights.
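To make item 2 a bit more concrete, here is a minimal sketch of the kind of normalization a post-processing tool could do if time and weight had separate units. The `TimedWeights` shape, the `timeUnit` field, and `weightPerSecond` are assumptions for illustration, not anything Speedscope or its format provides today:

```typescript
// Hypothetical: assumes the profile says which unit startValue/endValue use,
// independently of the unit of the weights (for example bytes).
type ElapsedUnit = 'nanoseconds' | 'microseconds' | 'milliseconds' | 'seconds'

interface TimedWeights {
  timeUnit: ElapsedUnit
  startValue: number
  endValue: number
  weights: number[] // e.g. bytes allocated per sample
}

// Conversion factors from each time unit to seconds.
const SECONDS_PER_UNIT: Record<ElapsedUnit, number> = {
  nanoseconds: 1e-9,
  microseconds: 1e-6,
  milliseconds: 1e-3,
  seconds: 1,
}

// Total weight divided by time spent profiling, so profiles of
// different lengths can be compared fairly.
function weightPerSecond(p: TimedWeights): number {
  const elapsedSeconds = (p.endValue - p.startValue) * SECONDS_PER_UNIT[p.timeUnit]
  if (!(elapsedSeconds > 0)) {
    throw new Error('endValue must be greater than startValue')
  }
  const totalWeight = p.weights.reduce((sum, w) => sum + w, 0)
  return totalWeight / elapsedSeconds
}

// Illustrative data: 4096 bytes allocated over 2 seconds of profiling.
const lastWeek: TimedWeights = {
  timeUnit: 'seconds',
  startValue: 0,
  endValue: 2,
  weights: [1024, 2048, 1024],
}

console.log(weightPerSecond(lastWeek))
```

Comparing this rate across two profiles is one way to ask whether a method is allocating more than it used to, and it only works when `startValue`/`endValue` are known to be times.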

Anyhow I'm not sure what the overall takeaway is here, just that this is something we can keep in mind as we think about the future of this file format!