ROCm / omnitrace

Omnitrace: Application Profiling, Tracing, and Analysis
https://rocm.docs.amd.com/projects/omnitrace/en/latest/
MIT License

Percentiles and other statistics besides mean, min, max for flat profiles #278

Closed pelahi closed 1 week ago

pelahi commented 1 year ago

Hello, the flat profile for omnitrace, like those of all other profilers, is quite useful but can produce misleading statistics if the underlying distribution of what is being sampled is not a simple Gaussian. An example would be a heavily skewed distribution, where the mean and standard deviation say little about typical values. A better summary would use percentiles, say min, 1%, 16%, 50%, 86%, 99%, max. Could such summary statistics be implemented? Even better, would it be possible to enable percentiles through an omnitrace configuration option and let the user specify which percentiles to report?

This feature would be very useful given that many processes have log-normal or otherwise skewed distributions.
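To illustrate the concern, here is a minimal sketch using NumPy on synthetic data. The `summarize` helper and the log-normal "durations" are assumptions made up for this example; they are not part of omnitrace or timemory.

```python
import numpy as np

def summarize(samples, percentiles=(1, 16, 50, 86, 99)):
    """Return min/max/mean/stddev plus the requested percentiles (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    summary = {
        "min": samples.min(),
        "max": samples.max(),
        "mean": samples.mean(),
        "stddev": samples.std(ddof=1),  # sample standard deviation
    }
    # np.percentile expects percentages in the range [0, 100]
    for p, value in zip(percentiles, np.percentile(samples, percentiles)):
        summary[f"p{p}"] = value
    return summary

rng = np.random.default_rng(seed=0)
# Synthetic call durations drawn from a log-normal distribution (assumed data)
durations = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

stats = summarize(durations)
for name, value in stats.items():
    print(f"{name:>6}: {value:.3f}")
```

For a distribution like this, the median sits well below the mean, and `mean - stddev` falls below the smallest observed value, which is the kind of misleading summary described above.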

jrmadsen commented 1 year ago

Hi @pelahi, currently the flat profiles are handled by timemory, and the min/max/stddev are stored in a constant-size data structure that does not require dynamic allocations: statistics.cpp.

In order to provide the requested functionality, it would have to effectively store every measurement, because we wouldn't be able to bin the values until the min/max was established. Put another way, we would need to collect the full trace and condense it into a flat profile during post-processing, instead of condensing it into a flat profile on the fly. That would significantly bloat the memory usage, which is a non-starter since tracing at scale on supercomputers generates far too much data. So unfortunately, we will not be able to accommodate this request directly.
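As a rough illustration of the difference, min/max/mean/stddev can be updated one sample at a time with a fixed amount of state. The sketch below uses Welford's online algorithm; the `RunningStats` class is hypothetical and is not timemory's actual implementation.

```python
import math

class RunningStats:
    """Constant-size accumulator (hypothetical; not timemory's actual code)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.min = math.inf
        self.max = -math.inf

    def update(self, value):
        # Welford's online update: memory use does not grow with sample count
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        self.min = min(self.min, value)
        self.max = max(self.max, value)

    @property
    def stddev(self):
        # Sample standard deviation; reported as 0.0 for fewer than two samples
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)

print(stats.min, stats.max, stats.mean, stats.stddev)
```

Exact percentiles have no equivalent fixed-size update: the value at a given rank can change with every new sample, so the samples themselves (or a binned approximation with known bounds) have to be kept.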

However, I've just finished work on integrating support for the perfetto trace format into a Python package called hatchet in https://github.com/hatchet/hatchet/pull/501. This package will effectively translate the perfetto traces into a pandas dataframe and allow you to manipulate it into a flat profile and bin the measurements into however many percentiles you desire. Ideally, this capability could even be part of an omnitrace-analysis Python package, installable via PyPI, with a console script omnitrace-analyze that exposes this functionality on the command line.

Side note: hatchet already supports the JSON profiles generated by timemory.

jrmadsen commented 1 year ago

As you can see in the hatchet docs for generating a flat profile, it is very straightforward to convert the trace to a flat profile, and pandas has built-in capabilities for calculating the percentiles/quantiles (tutorial).
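The pandas side of this can be sketched as follows. The dataframe here is built by hand with assumed column names (`name`, `duration`); it stands in for whatever the trace-to-dataframe step produces and does not use the hatchet API.

```python
import pandas as pd

# Hypothetical trace records: one row per function call with its duration
trace = pd.DataFrame({
    "name": ["solve", "solve", "solve", "solve", "io", "io", "io", "io"],
    "duration": [1.0, 2.0, 3.0, 10.0, 0.5, 0.5, 1.5, 2.5],
})

quantiles = [0.01, 0.16, 0.50, 0.86, 0.99]
per_function = trace.groupby("name")["duration"]

# One row per function, one column per requested quantile
flat = per_function.quantile(quantiles).unstack()
flat.columns = [f"p{round(q * 100)}" for q in flat.columns]

# Add min and max alongside the percentiles
flat.insert(0, "min", per_function.min())
flat["max"] = per_function.max()

print(flat)
```

The list of quantiles is ordinary data here, so it could come from a configuration file or a command-line argument in a post-processing script.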