ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Omniperf roofline does not display FP64 / General Roofline Datatype Support #207

Open ausellis0 opened 11 months ago

ausellis0 commented 11 months ago

Is your feature request related to a problem? Please describe.

The Omniperf roofline plots appear to combine FP32 and FP64 data, but the labels read as FP32 rates only, which leads to user confusion.

Describe the solution you'd like

It would help if these were separated for immediate interpretability, or if users at least had the option to separate them. Further, it would be nice to have a --roofline-datatype=${DTYPE} option that isolates the ops of a specific datatype in its own roofline plot (e.g. FP64, FP32, INT32, FP16, BF16, INT8, INT4, etc.). The lower-precision datatypes would be useful for profiling future machine learning inference cases.
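
To make the request concrete, here is a minimal sketch of how such an option might be parsed and how a separate roofline ceiling per datatype could be evaluated. This is not Omniperf's implementation: the helper names (`build_parser`, `selected_dtypes`, `attainable_rate`) and the datatype list are assumptions for illustration, and the peak rates and bandwidth are left as caller-supplied inputs rather than real hardware numbers. The ceiling uses the standard roofline relation, attainable rate = min(peak rate, arithmetic intensity × peak bandwidth).

```python
import argparse

# Datatypes named in this request; illustrative only, not Omniperf's actual list.
SUPPORTED_DTYPES = ("FP64", "FP32", "FP16", "BF16", "INT32", "INT8", "INT4")


def build_parser():
    """Build a hypothetical CLI parser exposing the proposed option."""
    parser = argparse.ArgumentParser(description="Hypothetical roofline front end")
    parser.add_argument(
        "--roofline-datatype",
        dest="roofline_datatype",
        type=str.upper,            # accept fp64, Fp64, FP64, ...
        choices=SUPPORTED_DTYPES,  # reject datatypes we do not know about
        action="append",           # the flag may be repeated, one plot each
        help="datatype to isolate in its own roofline plot (repeatable)",
    )
    return parser


def selected_dtypes(argv):
    """Return the requested datatypes, or all of them if none was given."""
    args = build_parser().parse_args(argv)
    return args.roofline_datatype or list(SUPPORTED_DTYPES)


def attainable_rate(arithmetic_intensity, peak_rate, peak_bandwidth):
    """Standard roofline ceiling for one datatype.

    arithmetic_intensity: ops per byte moved
    peak_rate:            peak ops/s for this datatype (caller-supplied)
    peak_bandwidth:       peak bytes/s of the memory level (caller-supplied)
    """
    return min(peak_rate, arithmetic_intensity * peak_bandwidth)
```

With something like this, `selected_dtypes(["--roofline-datatype=fp64"])` would select only FP64, and each selected datatype would get its own plot with a ceiling computed from that datatype's peak rate, instead of FP32 and FP64 sharing one set of labels.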