ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

Refactor Specs + Mi300 Enhancements #273

Closed: coleramos425 closed this 8 months ago

coleramos425 commented 8 months ago

This PR refactors the specs.py module by cleaning up code, reducing overhead, and improving extensibility.

For convenience, this PR also layers on #247, #251, #252, and #255 (thanks @skyreflectedinmirrors).

2/27 TODO:

coleramos425 commented 8 months ago

@skyreflectedinmirrors I made some progress on the L2 Cache per-channel panel, and the GUI filtering is now fixed. You should be good to resume verification whenever you're ready.

skyreflectedinmirrors commented 8 months ago

Opened another PR (https://github.com/AMDResearch/omniperf/pull/280) with a few touch-ups; I will have the specs description done today.

skyreflectedinmirrors commented 8 months ago

RE:

I'd like to remove this hardcoded reference to gfx90a, which adds roofline to ip_blocks in sysinfo. This is in the spirit of us moving all this SoC material to subclasses. Not required, but nice to have.

I think the easiest way is to put a flag on whether rooflines are allowed in each SOC class (if it doesn't exist already), then pass the SOC in from the profiler: https://github.com/AMDResearch/omniperf/blob/2d92bcff2e3de9e15ad2e0abf306b5d350143ae8/src/omniperf_profile/profiler_base.py#L352

We don't technically have access to the SOC from the roofline callsite (https://github.com/AMDResearch/omniperf/blob/2d92bcff2e3de9e15ad2e0abf306b5d350143ae8/src/roofline.py#L343), but I'd assume roofonly is true here?
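
For readers following along, the suggestion above could be sketched roughly as follows. This is only an illustration of the idea (a per-SoC capability flag that the caller consults instead of comparing against a hardcoded "gfx90a" string); the class names, the `roofline_supported` attribute, and the `build_ip_blocks` helper are all hypothetical and are not taken from the omniperf codebase.

```python
class SocBase:
    """Hypothetical base class; the real omniperf SoC classes may look different."""

    arch = "unknown"
    # Assumed flag name: each SoC subclass declares whether rooflines are allowed.
    roofline_supported = False


class Gfx90aSoc(SocBase):
    """Illustrative subclass for gfx90a, the arch named in the thread."""

    arch = "gfx90a"
    roofline_supported = True


class OtherSoc(SocBase):
    """Illustrative subclass that keeps the default (no roofline)."""

    arch = "other"


def build_ip_blocks(soc, base_blocks):
    """Return a new ip_blocks list, appending 'roofline' only if the SoC allows it.

    The SoC object is passed in by the caller (e.g. the profiler), so no
    architecture string needs to be hardcoded here.
    """
    blocks = list(base_blocks)  # copy so the caller's list is not mutated
    if soc.roofline_supported:
        blocks.append("roofline")
    return blocks


if __name__ == "__main__":
    print(build_ip_blocks(Gfx90aSoc(), ["SQ", "TCP"]))
    print(build_ip_blocks(OtherSoc(), ["SQ", "TCP"]))
```

With this shape, supporting rooflines on a new architecture would only mean setting the flag in that SoC's subclass, which matches the goal of moving SoC-specific material out of shared code.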

coleramos425 commented 8 months ago

I think the easiest way is to put a flag on whether rooflines are allowed in each SOC class (if it doesn't exist already), then pass the SOC in from the profiler:

Done. Thanks for the suggestion, Nick :)