grafana / pyroscope

Continuous Profiling Platform. Debug performance issues down to a single line of code
https://grafana.com/oss/pyroscope/
GNU Affero General Public License v3.0

Profile entries #3016

Open kolesnikovae opened 6 months ago

kolesnikovae commented 6 months ago

I propose introducing a feature that allows users to access individual profiles rather than relying solely on their aggregated representation.

Several users, including myself, have encountered difficulties obtaining a single profile for a specific profiling target at a particular time, especially in scenarios like OOM debugging.

To address this:

This feature is intended to accommodate users who are accustomed to working with "individual" profiles rather than aggregated data.

This feature is expected to complement the heatmap and time range selector proposals, providing users with a more efficient way to pinpoint and analyze specific events within the profiling data.

martijnthe commented 5 months ago

@kolesnikovae does this also imply that the x-axis in the flamegraph could/would be "time" instead of "samples"? If so, I'd love to see this feature.

Our use case: alongside the boons of distributed profiling, it's very useful to "drill down" into a single profile to get a better understanding of a performance issue. This matters most in cases where the same stack traces can appear multiple times and the ordering of calls is key to understanding and debugging an issue.

Right now the ordering within the flamegraph seems somewhat arbitrary. Is there documentation somewhere on how the aggregation (and visualization thereof) works?

kolesnikovae commented 5 months ago

> @kolesnikovae does this also imply that the x-axis in the flamegraph could/would be "time" instead of "samples"? If so, I'd love to see this feature.

No, not quite. Currently, Pyroscope is most suited for statistical profiling. In such a profile, we do not have any information on the order of calls or when exactly a function was called. Instead, it shows aggregated measurements for sampled stack traces: for a CPU profile, this typically involves counting the number of times the stack trace was observed executing on the CPU. This allows us to visualize this information as a flame graph – I'll quote the author of the idea:

> The x-axis shows the stack profile population, sorted alphabetically (it is not the passage of time), and the y-axis shows stack depth, counting from zero at the bottom. Each rectangle represents a stack frame. The wider a frame is, the more often it was present in the stacks. The top edge shows what is on-CPU, and beneath it is its ancestry. Original flame graphs use random colors to help visually differentiate adjacent frames. Variations include inverting the y-axis (an "icicle graph"), changing the hue to indicate code type, and using a color spectrum to convey an additional dimension.
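As a rough illustration of the aggregation described above, here is a minimal Python sketch. It is not Pyroscope's actual implementation: the sample data and the helper names (`aggregate`, `folded`, `frame_widths`) are hypothetical. It counts how often each sampled stack was observed, sorts the stacks alphabetically as on the x-axis of a flame graph, and derives each frame's width from those counts.

```python
from collections import Counter

# Hypothetical sampled stacks (root frame first), in the order they were captured.
samples = [
    ("main", "handle", "parse"),
    ("main", "handle", "render"),
    ("main", "handle", "parse"),
    ("main", "gc"),
    ("main", "handle", "parse"),
]


def aggregate(stacks):
    """Count how many times each distinct stack trace was observed."""
    return Counter(stacks)


def folded(profile):
    """Render 'frame;frame;frame count' lines, sorted alphabetically by stack."""
    return [f"{';'.join(stack)} {count}" for stack, count in sorted(profile.items())]


def frame_widths(profile):
    """Width of a frame = total samples of every stack passing through it."""
    widths = Counter()
    for stack, count in profile.items():
        for depth in range(1, len(stack) + 1):
            widths[stack[:depth]] += count
    return widths


profile = aggregate(samples)
lines = folded(profile)
widths = frame_widths(profile)

for line in lines:
    print(line)
```

In this sketch the capture order of `samples` has no effect on the result: only the per-stack counts survive, which is why the horizontal position of a frame says nothing about when it ran.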

What you're describing is another kind of profile – a so-called event-based profile or an execution trace. This is a vastly different type of signal, with a completely different data model and analysis methods.
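To make the difference concrete, the following sketch (with made-up timestamps and stack names, and a hypothetical helper `to_statistical_profile`) shows two execution traces with different call orderings collapsing into the same statistical profile once timestamps are dropped and identical stacks are counted.

```python
from collections import Counter

# Hypothetical execution traces: (timestamp_ms, stack) events ordered in time.
trace_a = [
    (0, "main;read"),
    (10, "main;compute"),
    (20, "main;read"),
    (30, "main;write"),
]
trace_b = [
    (0, "main;read"),
    (10, "main;read"),
    (20, "main;write"),
    (30, "main;compute"),
]


def to_statistical_profile(trace):
    """Drop timestamps and ordering; keep only how often each stack was seen."""
    return Counter(stack for _, stack in trace)


profile_a = to_statistical_profile(trace_a)
profile_b = to_statistical_profile(trace_b)

# The traces differ, but the aggregated profiles are indistinguishable.
print(trace_a != trace_b, profile_a == profile_b)
```

An aggregated profile of this shape cannot be turned back into a timeline, so answering "what ran before what" needs a trace-like signal rather than a sampled one.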

As of now, Pyroscope does not support profiles of this type. The main reason is that it's currently impractical to continuously profile programs in this fashion: the amount of data this produces and the footprint on the profiled program are the biggest problems. Therefore, profiling of this type is typically performed on an ad-hoc basis.

martijnthe commented 5 months ago

@kolesnikovae thanks so much for your detailed answer!