Closed Ovi05 closed 4 months ago
Sorry, but how does this issue have anything to do with upp? I mean, upp sets the value, and you can confirm via hwmon that it is set correctly, so what else is it that you expect upp to do for you? If there is an issue here, it must be in the driver or in the firmware...
You may want to try to set the power limit explicitly using /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap
and see how it behaves. Also, consider using amdgpu-clocks for power limiting your card the simple & easy way.
For some reason, the power limit can no longer be increased over 120W. Running
echo 150000000 > /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap
fails with
bash: echo: write error: Invalid argument
and the logs show the following error:
amdgpu 0000:06:00.0: amdgpu: New power limit (150) is over the max allowed 120
I have tried both older and newer kernels as well, to no avail. Any ideas what else I could try? Also, if you consider this isn't related to upp (which to be fair it kinda isn't), any suggestions for where else I could ask for feedback? Thanks!
OK, what you are most likely hitting is the amdgpu
kernel driver's power limit enforcement, which was introduced a few kernel releases ago. Check issues such as https://gitlab.freedesktop.org/drm/amd/-/issues/3277
Most likely your card's VBIOS is limiting the card to 20% on top of the default VBIOS settings. These are the VBIOS limits on my RX 6600 XT:
upp get smc_pptable/SocketPowerLimitAc/0
130
upp dump | grep POWERPERCENT
max 8: 20 (POWERPERCENTAGE)
min 8: 6 (POWERPERCENTAGE)
The outputs above indicate that the default upper socket power limit is 130W, and that you can go min 122W (130W - 6%) and max 156W (130W + 20%).
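The arithmetic above can be sketched in a few lines of shell. The helper names `power_limit_max` and `power_limit_min` are made up for illustration and are not part of upp; they use integer math, so results are truncated to whole watts.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of upp): derive the allowed power range
# from the default socket power limit and the POWERPERCENTAGE values.

# power_limit_max DEFAULT_W MAX_PERCENT -> highest allowed limit in watts
power_limit_max() {
    echo $(( $1 * (100 + $2) / 100 ))
}

# power_limit_min DEFAULT_W MIN_PERCENT -> lowest allowed limit in watts
power_limit_min() {
    echo $(( $1 * (100 - $2) / 100 ))
}

# RX 6600 XT values quoted above: default 130W, max 20%, min 6%
power_limit_max 130 20
power_limit_min 130 6
```

With the RX 6600 XT numbers this prints 156 and 122, matching the range given above.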
If your card's default limit is 100W, and the POWERPERCENTAGE max in the VBIOS is 20 (%), the driver in newer kernels will not let you go over 120W (hence the kernel message you shared). The max limit is stored in the hwmon power1_cap_max
file, check it out.
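A minimal sketch of checking a requested limit against power1_cap_max before writing it, so the write does not fail with "Invalid argument". The function name `cap_within_max` is hypothetical; the only assumption about the system is that the hwmon directory exposes power1_cap_max in microwatts, as the standard hwmon interface does.

```shell
#!/usr/bin/env bash
# Sketch only: cap_within_max is a hypothetical helper, not part of upp
# or amdgpu-clocks. hwmon power values are expressed in microwatts.

# cap_within_max HWMON_DIR WATTS
# Returns 0 if WATTS fits under power1_cap_max, 1 if it exceeds it,
# and 2 if power1_cap_max cannot be read.
cap_within_max() {
    local dir=$1 watts=$2 max_uw
    [ -r "$dir/power1_cap_max" ] || return 2
    max_uw=$(cat "$dir/power1_cap_max")
    if [ $(( watts * 1000000 )) -le "$max_uw" ]; then
        return 0
    fi
    echo "requested ${watts}W exceeds max $(( max_uw / 1000000 ))W" >&2
    return 1
}

# Example use (as root, on a machine with an amdgpu card):
#   hwmon=/sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)
#   cap_within_max "$hwmon" 150 && echo 150000000 > "$hwmon/power1_cap"
```

On a card whose power1_cap_max reads 120000000, the check for 150 would fail and the write would be skipped.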
Note that upp can be used to override these POWERPERCENTAGE max and min limits at runtime, so you should be able to do something like:
upp set --write /overdrive_table/max/8=35
which should cause your hwmon power1_cap_max
to go to 135000000 (microwatts, so 135W, assuming a 100W default), and then you can override power1_cap
to 135W using normal means (amdgpu-clocks recommended). I do not recommend going +50% over the VBIOS power limit, you may melt your GPU...
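Since hwmon expresses power1_cap and power1_cap_max in microwatts, a rough sketch of the conversion may help when comparing them with upp's watt values. The helper names and the 100W default are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; hwmon power1_cap* values are in microwatts.

w_to_uw() { echo $(( $1 * 1000000 )); }
uw_to_w() { echo $(( $1 / 1000000 )); }

# Assumed example: 100W default limit with the POWERPERCENTAGE max raised to 35.
default_w=100
max_percent=35
expected_w=$(( default_w * (100 + max_percent) / 100 ))
echo "expected power1_cap_max: $(w_to_uw "$expected_w")"
```

Under those assumptions the expected power1_cap_max is 135000000, which is what you would compare against the value read back from hwmon.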
I was in the middle of writing a comment about how this doesn't work when a thought hit me: "What if I tried a different benchmark?". With the new method you sent I get 145W of power draw in Unigine Superposition. Furmark 2 is still limited at 125W so that was a bit of a red herring. Kind of a dumb oversight on my part. Still, thanks a lot for all the help!
Hello! I ran the following commands to increase my power limit and TDC limit to 150W and 115A respectively:
upp set smc_pptable/SocketPowerLimitAc/0=150 --write
upp set smc_pptable/TdcLimit/0=115 --write
Checking with cat confirms that the power limit has been successfully set to 150W:
cat /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap
150000000
However, when I run FurMark 2 the GPU never goes above 125W and the clocks are limited to 2000-2100 MHz as a result.
Edit: I am running KDE Neon 22.04, with kernel 6.5.0-35-generic