ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License
135 stars 49 forks

Roofline for INT32? #100

Open mrowan137 opened 1 year ago

mrowan137 commented 1 year ago

I used the Flask-based GUI to view roofline data for a kernel that is heavy on INT32 VALU arithmetic instructions but has no FP16, FP32, FP64, or INT8 instructions. As a result, the plot marker for this kernel does not appear on the displayed roofline plots. This can be disorienting: the user is left wondering why the kernel is missing, and the reason may only become clear after looking at the instruction mix further down. Is there any possibility or plan to extend the roofline plots to show the performance of kernels heavy on INT32 arithmetic? Testing info:

coleramos425 commented 1 year ago

Thanks for reaching out @mrowan137. The reasoning behind our two Empirical Roofline plots is that each targets a different audience:

  1. (FP32/FP64) for HPC applications
  2. (FP16/INT8) for ML applications

My understanding is that these data types cover the majority of the arithmetic for these two groups. To justify adding INT32 to our model, could you tell me a little more about your application and which group it would fit in?
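To make the gap under discussion concrete, here is a minimal sketch of how a kernel with only INT32 work could end up with no marker on either plot. It assumes a roofline point is computed per data type as (operations / bytes moved, operations / second) and that data types with zero counted operations are skipped because a zero value cannot be placed on log-log axes. The `PLOT_DTYPES` grouping, the `roofline_points` function, and all numbers are hypothetical illustrations, not rocprofiler-compute's actual API or data.

```python
# Hypothetical sketch: the names and grouping below are illustrative only,
# not rocprofiler-compute's actual implementation.
PLOT_DTYPES = {
    "HPC": ("FP32", "FP64"),
    "ML": ("FP16", "INT8"),
}


def roofline_points(op_counts, bytes_moved, seconds):
    """Return {plot: {dtype: (arithmetic_intensity, ops_per_second)}}.

    Data types with zero counted operations are skipped (assumption:
    a point at zero cannot be drawn on log-log axes).
    """
    points = {}
    for plot, dtypes in PLOT_DTYPES.items():
        points[plot] = {}
        for dtype in dtypes:
            ops = op_counts.get(dtype, 0)
            if ops > 0:
                points[plot][dtype] = (ops / bytes_moved, ops / seconds)
    return points


# A kernel that only issues INT32 VALU instructions (made-up numbers).
int32_kernel = {"INT32": 4_000_000}
print(roofline_points(int32_kernel, bytes_moved=1_000_000, seconds=0.001))
```

Under these assumptions both plots come back empty for the INT32-only kernel, which matches the missing marker described above. Adding an `"INT32"` entry to one of the groups would be one way to surface such kernels.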

mrowan137 commented 1 year ago

Hi @coleramos425, the data I collected are from an application called ALE3D, which we are supporting at LLNL as part of ELCAP bring-up. This would fall in the HPC category.