ROCm / rocprofiler-compute

Advanced Profiling and Analytics for AMD Hardware
https://rocm.docs.amd.com/projects/omniperf/en/latest/
MIT License

max_mclk --showmclkrange error on MI100 #254

Closed JoseSantosAMD closed 8 months ago

JoseSantosAMD commented 8 months ago

Describe the bug: After testing the latest 2.x, I found that max_mclk is populated as None, which leads to errors later in the Omniperf profiling pipeline. This seems to come from #243.

https://github.com/AMDResearch/omniperf/blob/852cc13f2a8057710f456bdbe567643831fc701d/src/utils/specs.py#L136-L138

I tested rocm-smi --showmclkrange on several MI100 systems, and it appears that max_mclk will always be None on this hardware.
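A minimal sketch of how a None value like this can arise, assuming the spec collection applies a regular expression to the text printed by --showmclkrange. The helper name parse_max_mclk, the pattern, and both sample strings are hypothetical and are not taken from src/utils/specs.py or from real rocm-smi output.

```python
import re
from typing import Optional

# Hypothetical pattern: a clock range written as "<min>Mhz - <max>Mhz".
_MCLK_RANGE = re.compile(r"(\d+)\s*Mhz\s*-\s*(\d+)\s*Mhz", re.IGNORECASE)


def parse_max_mclk(smi_output: str) -> Optional[int]:
    """Return the upper bound of the memory clock range in MHz, or None."""
    match = _MCLK_RANGE.search(smi_output)
    # When the tool prints no range at all, the search fails and the
    # caller silently receives None instead of a number.
    return int(match.group(2)) if match else None


# Assumed sample strings, for illustration only.
WITH_RANGE = "GPU[0] : Valid mclk range: 400Mhz - 1200Mhz"
WITHOUT_RANGE = "GPU[0] : mclk range not available"

print(parse_max_mclk(WITH_RANGE))
print(parse_max_mclk(WITHOUT_RANGE))
```

If the tool reports no range on a given GPU, every run takes the second path, which would match the "always None" behavior described above.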

Development Environment:

To Reproduce: Steps to reproduce the behavior:

  1. Run omniperf profile -n vcopy -- ./tests/vcopy 1048576 256

Screenshots: [image: MicrosoftTeams-image (4)]

coleramos425 commented 8 months ago

Thanks Jose. I was able to reproduce this error as well. Adding @skyreflectedinmirrors who may know more about why we're seeing this.

skyreflectedinmirrors commented 8 months ago

Sigh, yeah -- looking into this it seems like a limitation of SMI on MI100. I'll code in a workaround.
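One possible shape for a workaround of this kind, sketched under assumptions: prefer the value parsed from SMI, fall back to a caller-supplied default, and otherwise fail with a clear message rather than letting None propagate. The function name resolve_max_mclk and its fallback parameter are hypothetical; this is not necessarily what the eventual fix does.

```python
from typing import Optional


def resolve_max_mclk(parsed: Optional[int], fallback: Optional[int] = None) -> int:
    """Pick a usable max memory clock (MHz) or fail early with a clear error."""
    if parsed is not None:
        # SMI reported a value, so trust it.
        return parsed
    if fallback is not None:
        # Hypothetical default supplied by the caller, e.g. from a
        # per-architecture table maintained alongside the tool.
        return fallback
    raise RuntimeError(
        "max_mclk could not be determined from SMI and no fallback was given"
    )
```

Failing at spec-collection time keeps the error close to its cause, instead of surfacing later in the profiling pipeline.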

skyreflectedinmirrors commented 8 months ago

@JoseSantosAMD -- can you try: https://github.com/AMDResearch/omniperf/pull/256

koomie commented 8 months ago

Resolved now in CI via #256. Thanks.