ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Request for more succinct L2-per-channel mode #104

Closed: skyreflectedinmirrors closed 1 year ago

skyreflectedinmirrors commented 1 year ago

It's currently a bit of a pain to analyze this. It would be far better (IMO) to have the default channel info be something like:

of each metric, over all 32 channels. The primary goal of these views is to spot imbalances, and that's easier with aggregated stats than with a list of every value for all M metrics and N channels. Of course, keeping the current view available as a "--detailed" mode or the like would be nice as well.

feizheng10 commented 1 year ago

Alternatively, we might design it like this: (1) a single metric for speed of light: use the std. dev. on top of the per-channel average to indicate how imbalanced the channels are overall (is there a better choice?). (2) a box plot per channel that clearly shows min/max/mean/median.

skyreflectedinmirrors commented 1 year ago

I think std. dev. (as a percent of the average) would be a good choice in the SOL (speed of light) section.
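
For reference, "std. dev. as a percent of the average" can be sketched as below. This is only an illustration, not the tool's implementation: the function name `relative_std_dev` and the sample per-channel values are hypothetical, and using the population standard deviation (rather than the sample one) is an assumption.

```python
import statistics


def relative_std_dev(values):
    """Return the population std. dev. as a percentage of the mean.

    Returns 0.0 when the mean is zero to avoid dividing by zero.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return 100.0 * statistics.pstdev(values) / mean


# Hypothetical per-channel request counts for 32 L2 channels.
balanced = [100.0] * 32
skewed = [100.0] * 31 + [400.0]  # one "hot" channel

print(f"balanced: {relative_std_dev(balanced):.1f}%")
print(f"skewed:   {relative_std_dev(skewed):.1f}%")
```

A value near 0% means the channels carry roughly equal load. Because the result is normalized by the mean, it can be compared across metrics with different units.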

coleramos425 commented 1 year ago

Added an "Aggregate Stats" table to the top of the L2 Cache (per-channel) section. The table contains mean, std dev, min, and max. [image]
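
As a rough sketch of what such a table computes (not the actual code in the dev branch), the per-channel values of each metric can be collapsed into one row of mean, std dev, min, and max. The function names, the metric names, and the input layout (a dict mapping each metric to a list of per-channel values) are all hypothetical.

```python
import statistics


def aggregate_channel_stats(per_channel):
    """Collapse per-channel values into mean / std dev / min / max per metric.

    per_channel: dict mapping a metric name to a list with one value per channel.
    """
    table = {}
    for metric, values in per_channel.items():
        table[metric] = {
            "mean": statistics.fmean(values),
            "std dev": statistics.pstdev(values),  # population std dev (assumption)
            "min": min(values),
            "max": max(values),
        }
    return table


def format_table(table):
    """Render the aggregated stats as a plain-text table, one metric per row."""
    columns = ["mean", "std dev", "min", "max"]
    lines = [f"{'metric':<16}" + "".join(f"{c:>10}" for c in columns)]
    for metric, row in table.items():
        lines.append(f"{metric:<16}" + "".join(f"{row[c]:>10.2f}" for c in columns))
    return "\n".join(lines)


# Hypothetical data: two metrics measured on four channels.
sample = {
    "read_requests": [10.0, 12.0, 11.0, 30.0],
    "hit_rate_pct": [80.0, 82.0, 81.0, 79.0],
}
print(format_table(aggregate_channel_stats(sample)))
```

With this layout, an imbalance shows up as a large std dev or a wide min/max gap in a single row, without scanning every channel's value.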

coleramos425 commented 1 year ago

Feature complete. Feel free to try things out in our dev branch ahead of the next release.