ROCm / rocprofiler

ROC profiler library. Profiling with perf-counters and derived metrics.
https://rocm.docs.amd.com/projects/rocprofiler/en/latest/
MIT License

gfx1030, metrics.xml, gfx_metrics.xml - doesn't contain descriptions. #94

Closed sysmanalex closed 1 week ago

sysmanalex commented 2 years ago

rocprof --list-derived
Derived metrics:
ERROR: rocprofiler_iterate_info(), ImportMetrics(), Bad metric 'L2CacheHit', var 'TCC_HIT[0]' is not found.

rocm-5.1.2/rocprofiler/bin/rocprof, amdgpu-install_22.10
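The error text reads as if a derived metric ('L2CacheHit') refers to a base counter ('TCC_HIT') that is not defined for the device in use. That is an interpretation of the message, not a confirmed diagnosis. A minimal sketch of that kind of consistency check follows. The counter set, the metric expressions, the function list, and the helper names are all hypothetical illustration data; nothing here is copied from a real metrics.xml or from the rocprofiler API.

```python
import re

# Hypothetical data for illustration only; not taken from a real metrics.xml.
BASIC_COUNTERS = {"SQ_WAVES", "GRBM_GUI_ACTIVE"}
DERIVED_METRICS = {
    "L2CacheHit": "100*sum(TCC_HIT,16)/(sum(TCC_HIT,16)+sum(TCC_MISS,16))",
    "Waves": "SQ_WAVES",
}

# Names treated as expression functions rather than counters (an assumption).
FUNCTIONS = {"sum", "min", "max", "avr"}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def missing_variables(expr, defined):
    """Return the identifiers in expr that are neither functions nor defined counters."""
    missing = []
    for name in IDENTIFIER.findall(expr):
        if name in FUNCTIONS or name in defined or name in missing:
            continue
        missing.append(name)
    return missing


def check_metrics(derived, defined):
    """Map each derived metric to its undefined variables, skipping metrics that resolve."""
    problems = {}
    for metric, expr in derived.items():
        missing = missing_variables(expr, defined)
        if missing:
            problems[metric] = missing
    return problems


if __name__ == "__main__":
    for metric, missing in check_metrics(DERIVED_METRICS, BASIC_COUNTERS).items():
        for var in missing:
            print(f"Bad metric '{metric}', var '{var}' is not found.")
```

With the sample data, only 'L2CacheHit' is flagged, because its expression uses counters that are absent from the hypothetical base set. Whether the real gfx1030 definitions were missing those counters at the time is not established in this thread.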

sysmanalex commented 2 years ago

Any updates here? What about rocprofiler metric definitions and support for RDNA2 / GFX10 / GFX11?

dmikushin commented 1 year ago

RX 6800 is also affected by the same error.

code-fool commented 1 year ago

The 6800, 6900, and 6950 all have the same error.

code-fool commented 1 year ago

Running make mytest && run.sh, only two tests report PASSED.

jevolk commented 1 year ago

I updated from ROCm 5.2 to 5.4 (by release-upgrading my Ubuntu 22.04 Jammy to 22.10 Kinetic) and it resolved the issue for me. Radeon V520 gfx1011 AWS G4ad.

sysmanalex commented 1 year ago

Hmm, I will check it later on AMD MI60/MI100. P.S. Without a working OpenCL profiler, this was a sad waste of time. I have already sold the 6800/6900 and am now testing comparable NVIDIA/Xilinx/Intel hardware with the same OpenCL code and optimizations. The distribution (Ubuntu 22 / Debian) was not the issue: I tested the whole range of drivers, from amdgpu-pro-20.50 through 21.30 and 22.xx+, and even compiled rocprofiler from the GitHub sources. Too much effort.

Luke20000429 commented 1 year ago

Any updates here? I hit the same issue on an RX 6800 XT (gfx1030). I am currently running ROCm 5.2 on Ubuntu 18.04; will upgrading to 5.4 resolve the issue?

sysmanalex commented 1 year ago

The AMD bug is still not fixed! Bad metric 'L2CacheHit', var 'TCC_HIT[0]' is not found.

harkgill-amd commented 1 month ago

Hi @sysmanalex, sorry for the lack of response. I was not able to reproduce this issue with the ROCm 6.2 release. I tested on a W6800 (gfx1030) machine and also an MI100 (gfx908) machine.

There are a few fixes that have been introduced for this issue since this ticket was opened. Could you please confirm whether you are still seeing this issue?

harkgill-amd commented 1 week ago

@sysmanalex, closing this ticket out for now. If you are still encountering this issue, please leave a comment and I will re-open this ticket. Thanks!