ilya-zlobintsev / LACT

Linux GPU Configuration Tool
MIT License

Clockspeed / voltages are set but not applied. (7900XTX) #337

Closed geludwig closed 5 months ago

geludwig commented 6 months ago

Bug description

Clockspeeds and voltages are not applied despite being set in the GUI (and confirmed in the config files). The kernel parameter amdgpu.ppfeaturemask=0xffffffff is also set in systemd-boot. I am unsure whether I missed something here.
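One way to confirm that the kernel parameter actually reached the driver is to read it back from sysfs and test the overdrive bit. This is a minimal sketch, not something taken from LACT: the function name overdrive_enabled is hypothetical, and it assumes overdrive corresponds to bit 14 (0x4000, PP_OVERDRIVE_MASK in the kernel's amd_shared.h) and that the running driver exposes the value at /sys/module/amdgpu/parameters/ppfeaturemask.

```shell
#!/bin/bash
# Sketch: check whether the overdrive bit is set in an amdgpu ppfeaturemask value.
# Assumption: overdrive is bit 14 (0x4000, PP_OVERDRIVE_MASK in amd_shared.h).

# overdrive_enabled is a hypothetical helper; it returns 0 if the bit is set.
overdrive_enabled() {
    local mask=$1
    # Bash arithmetic accepts both decimal and 0x-prefixed hex input.
    (( (mask & 0x4000) != 0 ))
}

# Assumed location of the live module parameter; skipped if it is not readable
# (for example when amdgpu is not loaded).
param=/sys/module/amdgpu/parameters/ppfeaturemask
if [[ -r $param ]]; then
    mask=$(<"$param")
    if overdrive_enabled "$mask"; then
        echo "overdrive bit set (mask=$mask)"
    else
        echo "overdrive bit NOT set (mask=$mask)"
    fi
fi
```

If the bit is reported as not set, the boot entry was probably not picked up; if it is set, the parameter is unlikely to be the cause of the behaviour described here.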

Despite this, the power limit is applied correctly up to 350W. Above 350W I use this shell script to bypass the known limitation:

#!/bin/bash
# Raise the power cap to 402W (the value is given in microwatts).
sudo su <<EOF
echo 402000000 > /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap
EOF
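The hwmon index in the path above is not guaranteed to stay the same across boots or kernel versions, so a script that hardcodes hwmon1 can silently stop working. The following sketch resolves power1_cap by globbing and converts watts to the microwatt unit the file expects. The helper names find_power_cap and watts_to_microwatts are hypothetical, and the card index (card1) is simply the one used in this report.

```shell
#!/bin/bash
# Sketch: resolve power1_cap without hardcoding the hwmon index.

# Convert whole watts to the microwatt unit used by power1_cap.
watts_to_microwatts() {
    echo $(( $1 * 1000000 ))
}

# Print the first power1_cap file found under a device directory.
# device_dir is e.g. /sys/class/drm/card1/device (card index from the report).
find_power_cap() {
    local device_dir=$1 f
    for f in "$device_dir"/hwmon/hwmon*/power1_cap; do
        [[ -e $f ]] && { echo "$f"; return 0; }
    done
    return 1
}

# Usage (as root):
#   cap=$(find_power_cap /sys/class/drm/card1/device) &&
#       watts_to_microwatts 402 > "$cap"
```

The write itself still needs root, and this does not change whether the driver accepts a value above its own limit.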

In the following example I tried to lower the clockspeed to 2000MHz, but the sliders don't have any effect (neither the clockspeed nor the voltage slider).

clocks_not_working

clock_fix_not_working

LACT-sysfs-snapshot-20240529-213154.tar.gz

System info

- LACT version: 0.5.4-release (commit 6dc3520)
- GPU model: XFX SPEEDSTER MERC 310 AMD Radeon RX 7900 XTX Black Edition
- Kernel version: 6.8.0-76060800daily20240311-generic
- Distribution: Pop!_OS 22.04 LTS

geludwig commented 5 months ago

I tested some more, and it seems you cannot change clockspeeds and voltages and also raise the power limit at the same time. For the moment you have to choose one or the other.

ilya-zlobintsev commented 5 months ago

Does it work together with lowering the power limit? I want to know if it's an issue with how settings are applied, or just with how the GPU interprets them.

geludwig commented 5 months ago

It is only a problem if the power limit is raised above the official (?) power limit of around 350W.

In the following example I set the clockspeed to 2000MHz. The power limit in the config is set to 402W, but the real power limit is 350W. After that, I set the clockspeed to 3000MHz and the power limit hits 350W, despite being set to 402W in the config.

I could raise the power limit to the real 402W by running echo 402000000 > /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap as root (via sudo su), but the clockspeed and voltage settings would then be ignored.
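To tell apart "the value in the config" from "the value the driver is actually using", the cap can be read back from the same hwmon directory. This is a rough sketch with hypothetical helper names (microwatts_to_watts, report_power_cap); it assumes the directory exposes both power1_cap and power1_cap_max, as amdgpu's hwmon interface normally does.

```shell
#!/bin/bash
# Sketch: report the effective and maximum power cap from a hwmon directory.

# Convert microwatts (the sysfs unit) to whole watts, rounding down.
microwatts_to_watts() {
    echo $(( $1 / 1000000 ))
}

# Print "cap=<N>W max=<N>W" for the given hwmon directory.
# Returns non-zero if either file cannot be read.
report_power_cap() {
    local hwmon_dir=$1 cap max
    cap=$(<"$hwmon_dir/power1_cap") || return 1
    max=$(<"$hwmon_dir/power1_cap_max") || return 1
    echo "cap=$(microwatts_to_watts "$cap")W max=$(microwatts_to_watts "$max")W"
}

# Usage: report_power_cap /sys/class/drm/card1/device/hwmon/hwmon1
```

Running it once after applying settings through the GUI and once after the manual echo would show which of the two steps changes the effective cap.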

Screenshot from 2024-06-01 00-15-00

Changing clockspeed, voltages and power limit as superuser in one go does not solve it, and can even brick the system completely (power limit stuck at 280W, clockspeeds stuck, etc.). So this is likely a kernel issue.

#!/bin/bash
# Set the power cap, max core clock, max memory clock and voltage offset in one go.
sudo su <<EOF
echo 402000000 > /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap
echo "s 1 3000" > /sys/class/drm/card1/device/pp_od_clk_voltage
echo "m 1 1250" > /sys/class/drm/card1/device/pp_od_clk_voltage
echo "vo -50" > /sys/class/drm/card1/device/pp_od_clk_voltage
EOF
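For comparison, the kernel's amdgpu documentation describes pp_od_clk_voltage edits as staged until a c (commit) command is written to the same file. The sketch below wraps that sequence in a hypothetical helper, apply_overdrive. It assumes the card accepts the same s, m and vo commands used in the script above, and it has not been tested against the behaviour reported in this issue, so it may well not change the outcome.

```shell
#!/bin/bash
# Sketch: write overdrive commands one by one, then commit them with "c".
# Assumption: the card accepts the "s", "m" and "vo" commands from the report.

# apply_overdrive OD_FILE CMD... (hypothetical helper)
apply_overdrive() {
    local od_file=$1; shift
    local cmd
    for cmd in "$@"; do
        # One command per write to the sysfs file.
        echo "$cmd" > "$od_file" || return 1
    done
    # Commit: per the kernel docs, staged edits are applied by this write.
    echo "c" > "$od_file"
}

# Usage (as root):
#   apply_overdrive /sys/class/drm/card1/device/pp_od_clk_voltage \
#       "s 1 3000" "m 1 1250" "vo -50"
```

The same documentation describes an r command that resets the table to defaults, which may be a safer way out of a stuck state than rebooting.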