onur-v closed this issue 3 years ago
Hmmm, can you share the contents of a default /sys/class/drm/card0/device/pp_od_clk_voltage?
What is your card and kernel version?
The card is a Radeon VII, on Ubuntu 20.04 with kernel 5.4.0-47. The contents of /sys/class/drm/card0/device/pp_od_clk_voltage are:
OD_SCLK:
0: 808Mhz
1: 1801Mhz
OD_MCLK:
1: 1000Mhz
OD_VDDC_CURVE:
0: 808Mhz 716mV
1: 1304Mhz 799mV
2: 1801Mhz 1081mV
OD_RANGE:
SCLK: 808Mhz 2200Mhz
MCLK: 800Mhz 1200Mhz
VDDC_CURVE_SCLK[0]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[0]: 738mV 1218mV
VDDC_CURVE_SCLK[1]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[1]: 738mV 1218mV
VDDC_CURVE_SCLK[2]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[2]: 738mV 1218mV
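For working with a dump like the one above, a minimal Python sketch that groups the lines under their section headers may help. The function name `parse_od_table` is hypothetical, and the parsing rules are inferred only from the output pasted in this thread, not from the driver source.

```python
def parse_od_table(text):
    """Group pp_od_clk_voltage lines under their section headers.

    Assumption: a section header is a line that ends in ':' with nothing
    after it (e.g. 'OD_SCLK:'); every other line is 'key: value ...'.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            current = line[:-1]
            sections[current] = {}
            continue
        if current is None:
            continue  # value line before any header; ignore it
        key, _, value = line.partition(":")
        # Drop an optional '@' so 'Mhz @ mV' and 'Mhz mV' parse alike.
        sections[current][key.strip()] = [v for v in value.split() if v != "@"]
    return sections


sample = """OD_SCLK:
0: 808Mhz
1: 1801Mhz
OD_VDDC_CURVE:
0: 808Mhz 716mV
1: 1304Mhz 799mV
"""
table = parse_od_table(sample)
print(table["OD_VDDC_CURVE"]["1"])
```

Because the `@` token is discarded, the same function accepts curve lines written with or without it.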
I guess the problem is related to ROCm 3.8, others have reported the same issue. See https://github.com/RadeonOpenCompute/ROCm/issues/1228
OK, looks like the format of pp_od_clk_voltage changed (no more @ in the curve and so on). This being Radeon VII, what was the format before ROCm 3.8? Also, what's in your custom states file?
My custom states file was this:
OD_SCLK:
1: 1801Mhz
OD_MCLK:
1: 1100Mhz
OD_VDDC_CURVE:
0: 808Mhz @ 715mV
1: 1304Mhz @ 800mV
2: 1801Mhz @ 981mV
FORCE_POWER_CAP: 300000000
FORCE_PERF_LEVEL: manual
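To illustrate how a custom states file like this could map onto the sysfs interface, here is a minimal Python sketch. It is not the actual amdgpu-clocks script. It assumes the `s`, `m` and `vc` command syntax that the kernel's amdgpu documentation describes for Vega20-class cards; the function name `state_to_commands` is hypothetical, and the code only builds command strings without writing anything to sysfs. `FORCE_POWER_CAP` and `FORCE_PERF_LEVEL` are not part of pp_od_clk_voltage, so the sketch skips them.

```python
import re

# Assumed mapping from section header to the sysfs command prefix.
PREFIX = {"OD_SCLK": "s", "OD_MCLK": "m", "OD_VDDC_CURVE": "vc"}


def state_to_commands(text):
    """Turn custom-state lines into 's/m/vc <index> <values>' strings."""
    commands = []
    prefix = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":"):
            prefix = PREFIX.get(line[:-1])
            continue
        if prefix is None or not line:
            continue
        index, _, rest = line.partition(":")
        if not index.strip().isdigit():
            continue  # e.g. FORCE_POWER_CAP / FORCE_PERF_LEVEL
        # Keep only the numbers, so '808Mhz @ 715mV' and
        # '808Mhz 715mV' yield the same command.
        values = re.findall(r"\d+", rest)
        commands.append(" ".join([prefix, index.strip(), *values]))
    return commands


custom_state = """OD_SCLK:
1: 1801Mhz
OD_MCLK:
1: 1100Mhz
OD_VDDC_CURVE:
0: 808Mhz @ 715mV
1: 1304Mhz @ 800mV
2: 1801Mhz @ 981mV
FORCE_POWER_CAP: 300000000
FORCE_PERF_LEVEL: manual
"""
commands = state_to_commands(custom_state)
for cmd in commands:
    print(cmd)
```

Extracting only the digits makes the sketch indifferent to whether the curve lines contain `@`.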
This post from 1 year ago suggests that the format hasn't actually changed: https://www.reddit.com/r/linux_gaming/comments/au7m3x/radeon_vii_on_linux_overclocking_undervolting/ The format that the poster shares there is identical to what I currently have.
I meant that the format of pp_od_clk_voltage as set by the driver itself has changed, not the custom state file. Since I don't have a Radeon VII to check myself, please paste the output of /sys/class/drm/card0/device/pp_od_clk_voltage from before ROCm 3.8, so that we can compare the two outputs.
I've just had the chance to roll back to 3.7. This is the state of /sys/class/drm/card0/device/pp_od_clk_voltage for ROCm 3.7, the last release where the script works.
OD_SCLK:
0: 808Mhz
1: 1801Mhz
OD_MCLK:
1: 1000Mhz
OD_VDDC_CURVE:
0: 808Mhz 716mV
1: 1304Mhz 800mV
2: 1801Mhz 1081mV
OD_RANGE:
SCLK: 808Mhz 2200Mhz
MCLK: 800Mhz 1200Mhz
VDDC_CURVE_SCLK[0]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[0]: 738mV 1218mV
VDDC_CURVE_SCLK[1]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[1]: 738mV 1218mV
VDDC_CURVE_SCLK[2]: 808Mhz 2200Mhz
VDDC_CURVE_VOLT[2]: 738mV 1218mV
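To compare the two dumps mechanically rather than by eye, a short Python sketch using the standard `difflib` module is enough. The variable names are illustrative, and only the `OD_VDDC_CURVE` lines are pasted in, since that is the one section where the two dumps in this thread differ.

```python
import difflib

# OD_VDDC_CURVE section as posted for ROCm 3.8.
rocm_38 = """OD_VDDC_CURVE:
0: 808Mhz 716mV
1: 1304Mhz 799mV
2: 1801Mhz 1081mV
""".splitlines()

# OD_VDDC_CURVE section as posted for ROCm 3.7.
rocm_37 = """OD_VDDC_CURVE:
0: 808Mhz 716mV
1: 1304Mhz 800mV
2: 1801Mhz 1081mV
""".splitlines()

# Keep only removed ('- ') and added ('+ ') lines; ndiff also emits
# unchanged lines and '? ' hint lines, which are filtered out here.
changed = [
    line
    for line in difflib.ndiff(rocm_37, rocm_38)
    if line.startswith(("- ", "+ "))
]
for line in changed:
    print(line)
```

In the dumps posted here the only difference is the voltage of curve point 1 (800mV under 3.7, 799mV under 3.8); the layout of the file is the same in both.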
It looks like AMD is overhauling the OverDrive API with ROCm 3.8 and kernel 5.10; lots of stuff is getting changed. I'll try to reproduce your issue on my RX5700 as soon as a 5.10 rc is out.
This could be related to #18. Can you check whether the ROCm 3.8 driver has the code from this patch in place?
Finally had time to test 5.10, and there is indeed an issue, as the pp_od_clk_voltage output changed in the driver yet again. @onur-v, a possible fix is at https://github.com/sibradzic/amdgpu-clocks/commit/5b48f04, please check.
The fix is working. Thanks a lot!
@onur-v thanks for reporting! have a good day!
The script was working perfectly until I updated to ROCm 3.8. Now it hangs after printing the line
Committing custom states to /sys/class/drm/card0/device/pp_od_clk_voltage: