sibradzic / upp

A tool for parsing, dumping and modifying data in Radeon PowerPlay tables
GNU General Public License v3.0

RX 6600 only draws a maximum of 125W despite setting power limit to 150W #46

Closed by Ovi05 4 months ago

Ovi05 commented 4 months ago

Hello! I ran the following commands to increase my power limit and TDC limit to 150W and 115A respectively:

upp set smc_pptable/SocketPowerLimitAc/0=150 --write
upp set smc_pptable/TdcLimit/0=115 --write

Checking with cat confirms that the power limit has been successfully set to 150W:

cat /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap
150000000

However, when I run FurMark 2 the GPU never goes above 125W and the clocks are limited to 2000-2100 MHz as a result.

Edit: I am running KDE Neon 22.04, with kernel 6.5.0-35-generic

sibradzic commented 4 months ago

Sorry, but how does this issue have anything to do with upp? I mean, upp sets the value, and you can confirm via hwmon that it was set correctly, so what else is it that you expect upp to do for you? If there is an issue here, it must be in the driver or in the firmware...

You may want to try to set the power limit explicitly using /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap and see how it behaves. Also, consider using amdgpu-clocks for power limiting your card the simple & easy way.

Ovi05 commented 4 months ago

For some reason, the power limit now can't be increased over 120W. Running

echo 150000000 > /sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)/power1_cap

gives:

bash: echo: write error: Invalid argument

If I check the logs, they show the following error:

amdgpu 0000:06:00.0: amdgpu: New power limit (150) is over the max allowed 120

I have tried both older and newer kernels as well, to no avail. Any ideas what else I could try? Also, if you consider this isn't related to upp (which to be fair it kinda isn't), any suggestions for where else I could ask for feedback? Thanks!

sibradzic commented 4 months ago

OK, what you are most likely hitting is the amdgpu kernel driver's power limit enforcement, which was introduced a few kernel releases ago. See issues such as https://gitlab.freedesktop.org/drm/amd/-/issues/3277

Most likely your card's VBIOS limits the card to 20% above the default VBIOS power limit. These are the VBIOS limits on my RX 6600 XT:

upp get smc_pptable/SocketPowerLimitAc/0
130
upp dump | grep POWERPERCENT
    max 8: 20 (POWERPERCENTAGE)
    min 8: 6 (POWERPERCENTAGE)

The outputs above indicate that the default upper socket power limit is 130W, and that the limit can be set as low as 122W (130W - 6%) and as high as 156W (130W + 20%).
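
The arithmetic above can be double-checked with a small shell sketch. The function name power_range is made up for illustration and is not part of upp. It uses integer division, so 122.2W comes out as 122W.

```shell
#!/bin/sh
# Hypothetical helper (not part of upp): derive the allowed power range in
# watts from the default limit and the POWERPERCENTAGE min/max values.
power_range() {
    default_w=$1   # default SocketPowerLimitAc, in watts
    min_pct=$2     # POWERPERCENTAGE min (percent below the default)
    max_pct=$3     # POWERPERCENTAGE max (percent above the default)
    min_w=$(( default_w * (100 - min_pct) / 100 ))
    max_w=$(( default_w * (100 + max_pct) / 100 ))
    echo "$min_w $max_w"
}

# Values from the RX 6600 XT output above
power_range 130 6 20   # prints: 122 156
```

With a 100W default and the same percentages, the same call gives an upper bound of 120W.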

If your card's default limit is 100W and the POWERPERCENTAGE max in the VBIOS is 20 (%), the driver in newer kernels will not let you go over 120W (hence the kernel message you shared). The max limit is exposed in the hwmon power1_cap_max file; check it out.
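
A minimal sketch for reading that file, assuming the GPU is card0 as in the paths used earlier in this thread. hwmon power values are in microwatts, so the made-up helper uw_to_w converts them to whole watts.

```shell
#!/bin/sh
# Hypothetical helper: hwmon reports power in microwatts; convert to watts.
uw_to_w() {
    echo $(( $1 / 1000000 ))
}

# Same path as used earlier in the thread; card0 is an assumption.
hwmon_name=$(ls -1 /sys/class/drm/card0/device/hwmon 2>/dev/null)
cap_max="/sys/class/hwmon/$hwmon_name/power1_cap_max"

if [ -n "$hwmon_name" ] && [ -r "$cap_max" ]; then
    echo "max allowed power limit: $(uw_to_w "$(cat "$cap_max")") W"
else
    echo "power1_cap_max not found for card0"
fi
```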

Note that upp can be used to override these POWERPERCENTAGE max and min limits at runtime, so you should be able to do something like:

upp set --write /overdrive_table/max/8=35

which should cause your hwmon power1_cap_max to go to 135000000 (microwatts, so 135W, assuming a 100W default), and then you can raise power1_cap to 135W using normal means (amdgpu-clocks recommended). I do not recommend going +50% over the VBIOS power limit, you may melt your GPU...
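
As a rough sketch of the "normal means" step done by hand: the helper below (set_power_cap, a made-up name, not part of upp or amdgpu-clocks) converts watts to microwatts and refuses values above power1_cap_max, which is the same bound the driver checks. It takes the hwmon directory as an argument, so it can be tried against a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: write a new power limit (in watts) to power1_cap,
# refusing values above power1_cap_max.
set_power_cap() {
    watts=$1
    dir=$2    # hwmon directory holding power1_cap and power1_cap_max
    max_uw=$(cat "$dir/power1_cap_max") || return 1
    want_uw=$(( watts * 1000000 ))   # hwmon expects microwatts
    if [ "$want_uw" -gt "$max_uw" ]; then
        echo "requested ${watts}W exceeds max $(( max_uw / 1000000 ))W" >&2
        return 1
    fi
    echo "$want_uw" > "$dir/power1_cap"
}

# Example usage on real hardware (needs root), left commented out:
# upp set --write /overdrive_table/max/8=35
# set_power_cap 135 "/sys/class/hwmon/$(ls -1 /sys/class/drm/card0/device/hwmon)"
```

Checking against power1_cap_max first just turns the kernel's "Invalid argument" into a readable message; the driver still has the final say.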

Ovi05 commented 4 months ago

I was in the middle of writing a comment about how this doesn't work when a thought hit me: "What if I tried a different benchmark?". With the new method you sent I get 145W of power draw in Unigine Superposition. FurMark 2 is still limited to 125W, so that was a bit of a red herring. Kind of a dumb oversight on my part. Still, thanks a lot for all the help!