hsadasiv closed this 8 months ago
Thanks for reporting it. However, this might be a complicated question to answer: it all depends on the vbios version and the rocm-smi version. I believe omniperf only retrieves cur_sclk from rocm-smi. On gfx90a, with the latest vbios, you might not be able to set/fix the sclk. You might try: rocm-smi --setperflevel high
Alternatively, omniperf should allow a manual spec/config to be passed in.
Thank you for your response. Does cur_sclk mean current system clock? If so, should Omniperf measure the clock when it goes up to 1700 MHz? I can see rocm-smi catching it as soon as it goes up.
I kinda agree with this. I think ideally, for a lot of the PoP metrics, we should be actively sampling the clock during execution of the kernels. (I mean, really ideally, the profiler would spit this sorta info out for us, or have clock locking like ... other profilers.) But we could fairly easily spin up a background process that samples clock rates over the lifetime of an app and reports back an average once all runs are completed?
Right now it looks like we run it on a cold GPU when gathering the specs, which is bound to give interesting answers if, e.g., an app kicks the clock up significantly and then spams FLOPs.
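The background-sampling idea above could be sketched roughly as follows. This is only an illustration, not how omniperf works: it uses a thread rather than a separate process, and ClockSampler, read_clock_mhz, and parse_active_sclk_mhz are hypothetical names. The parser assumes amdgpu-style pp_dpm_sclk text, where the active level is marked with a trailing "*"; the actual source of the clock reading (sysfs, rocm-smi, or something else) is left to the caller.

```python
import re
import threading
from typing import Callable, List, Optional


def parse_active_sclk_mhz(pp_dpm_text: str) -> Optional[float]:
    """Return the active clock (line ending in '*') from pp_dpm_sclk-style text.

    Assumed input format, one level per line, e.g. "2: 1700Mhz *".
    """
    for line in pp_dpm_text.splitlines():
        if line.rstrip().endswith("*"):
            match = re.search(r"(\d+(?:\.\d+)?)\s*mhz", line, re.IGNORECASE)
            if match:
                return float(match.group(1))
    return None


class ClockSampler:
    """Poll a clock reader in a background thread and average the samples."""

    def __init__(self, read_clock_mhz: Callable[[], Optional[float]],
                 interval_s: float = 0.05) -> None:
        self._read = read_clock_mhz      # hypothetical reader supplied by caller
        self._interval = interval_s
        self._samples: List[float] = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        while not self._stop.is_set():
            value = self._read()
            if value is not None:
                self._samples.append(value)
            # Sleep for one interval, but wake immediately if stop is requested.
            self._stop.wait(self._interval)

    def __enter__(self) -> "ClockSampler":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> None:
        self._stop.set()
        self._thread.join()

    def average_mhz(self) -> Optional[float]:
        """Mean of all samples, or None if nothing was sampled."""
        if not self._samples:
            return None
        return sum(self._samples) / len(self._samples)
```

One would wrap the profiled run in "with ClockSampler(reader) as s:" and call s.average_mhz() afterwards. An average over the whole app lifetime also includes idle periods between kernels, so it would still only approximate the clock a given kernel actually ran at.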
Aha, I'm wrong: we take $sclk, i.e., not cur_sclk, which ends up being pulled from the max sclk from rocminfo.
This is ... better at least, because it's some theoretical maximum value of FLOPs and the like that you could achieve; it just doesn't take the achieved clock for your kernel into account (I think we might actually be able to do this via GRBM_GUI_ACTIVE... maybe).
This suggests that if we ever do try to change from sclk to cur_sclk, we're in for a lot of reports of "PoP exceeds 100%" tho, cc: @coleramos425
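To make the "PoP exceeds 100%" concern concrete, here is a small worked example. It assumes PoP is achieved FLOP/s divided by a peak that scales linearly with the clock used for normalization. The 800 MHz and 1700 MHz figures come from this thread; FLOPS_PER_CYCLE and the 60% utilization are made-up illustrative values, and the function names are hypothetical rather than omniperf's.

```python
def peak_flops(clock_mhz: float, flops_per_cycle: float) -> float:
    """Theoretical peak FLOP/s at a given clock (simple linear model)."""
    return clock_mhz * 1e6 * flops_per_cycle


def percent_of_peak(achieved_flops: float, clock_mhz: float,
                    flops_per_cycle: float) -> float:
    """Achieved FLOP/s as a percentage of the peak at clock_mhz."""
    return 100.0 * achieved_flops / peak_flops(clock_mhz, flops_per_cycle)


FLOPS_PER_CYCLE = 1000.0  # hypothetical device-wide FLOPs per cycle

# A kernel that runs at 1700 MHz and reaches 60% of the peak at that clock.
achieved = 0.6 * peak_flops(1700.0, FLOPS_PER_CYCLE)

# Normalized by the max clock (1700 MHz): about 60%.
pop_vs_max = percent_of_peak(achieved, 1700.0, FLOPS_PER_CYCLE)

# Normalized by a clock read while the GPU sat at 800 MHz: about 127.5%.
pop_vs_idle = percent_of_peak(achieved, 800.0, FLOPS_PER_CYCLE)

print(f"PoP vs max sclk: {pop_vs_max:.1f}%")
print(f"PoP vs idle cur_sclk: {pop_vs_idle:.1f}%")
```

Under this model the reported PoP is inflated by the ratio of the two clocks (1700 / 800 = 2.125), so any kernel above roughly 47% of true peak would show up as more than 100%.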
This is related to issue #245. Closing ticket and linking PR #246 which closes the issue.
Hello,
It seems that on gfx90a, cur_sclk in the omniperf web report is 800 MHz, whereas I can see sclk going up to 1700 MHz using rocm-smi (only while the program runs). PS: I did force the clock using "rocm-smi -d 0 --setperfdeterminism 1700" and made sure to run the process on GPU ID 0.