BoukeHaarsma23 / WattmanGTK

A Wattman-like GTK3+ GUI
GNU General Public License v2.0
374 stars, 61 forks

default core clock limit hard wall #8

Open GloriousEggroll opened 5 years ago

GloriousEggroll commented 5 years ago

Hi, I was hitting a default soft-wall limit of 1630 MHz on the core clock of my Vega 64. According to the Arch Wiki (https://wiki.archlinux.org/index.php/AMDGPU#Overclocking), you can raise that limit. I set echo "10" > /sys/class/drm/card0/device/pp_sclk_od to allow a 10% maximum increase over the default, and now I am able to get past the soft wall.
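A minimal sketch of that step as a reusable shell function, for anyone who wants some input checking around the write. The function name `set_sclk_od` and the optional device-directory argument are hypothetical conveniences; only the `pp_sclk_od` path comes from this thread. Writing to sysfs normally requires root.

```shell
#!/bin/bash
# Hypothetical helper: write an OverDrive percentage to pp_sclk_od.
#   $1 = percentage (non-negative integer)
#   $2 = device directory (optional; defaults to the path used in this thread)
set_sclk_od() {
    local percent="$1"
    local device="${2:-/sys/class/drm/card0/device}"
    local node="$device/pp_sclk_od"

    # Reject anything that is not a plain non-negative integer.
    case "$percent" in
        ''|*[!0-9]*)
            echo "percent must be a non-negative integer" >&2
            return 2
            ;;
    esac

    # The node must exist and be writable (usually this means running as root).
    if [ ! -w "$node" ]; then
        echo "cannot write $node (missing, or not running as root?)" >&2
        return 1
    fi

    echo "$percent" > "$node"
}
```

Run as root, `set_sclk_od 10` would perform the same write as the one-liner above, and `set_sclk_od 0` would put the value back to 0.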

Haxk20 commented 5 years ago

This should be reported to the AMDGPU bug list, as I'm experiencing the same issue, but on an RX 560X. Sadly, I have to set high performance mode on my card because I can't get past the soft wall on memory clocks. That way my card runs at 100% all the time while I'm gaming.

BoukeHaarsma23 commented 5 years ago

What do you mean by "soft wall"? The problem is a bit unclear to me

Haxk20 commented 5 years ago

> What do you mean by "soft wall"? The problem is a bit unclear to me

GPU clocks stay at the lowest state until you "overclock" the GPU; then it starts changing states depending on load. Without overclocking it or changing the performance state, it is stuck at the lowest clock state.

But still, this isn't a bug in this program. It is most likely a bug in the AMDGPU driver.

GloriousEggroll commented 5 years ago

The Vega 64 is able to clock much higher than 1630, but by default on Linux it's limited to 1630 no matter what you set, unless you first increase the max limit as mentioned above.

@Haxk20 My GPU goes through all the clock states properly. The max advertised boost on my card is 1590, but it can normally go much higher than that. I was hitting a wall at 1630: whenever I tried to set any state higher than 1630, it would fall back to the previous state. Once I added the fix mentioned above, it allowed clocking higher than 1630.

To put this in perspective: some people have managed to clock their Vega 64s at 1645-1680 on average safely, and I had done this on Windows.

This is not an amdgpu bug, because the states are working properly. It's just the default max clock limit that doesn't let you go past 1630 until you add the line I mentioned from the Arch Wiki. The percentage is adjustable and does not need to be 10; I just used 10 for plenty of overclocking headroom.

Here's what my modified bash script looks like with the limit raised and fan speed set:

#!/bin/bash
# Raise the core clock (sclk) OverDrive limit to 10% above the default maximum
echo "10" > /sys/class/drm/card0/device/pp_sclk_od
# Put the performance level into manual mode
echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
# Power cap in microwatts (260 W)
echo 260000000 > /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
# Core clock states: "s <state> <clock MHz> <voltage mV>"
echo "s 0 877 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 1 1020 900" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 2 1200 925" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 3 1300 935" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 4 1450 940" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 5 1590 970" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 6 1620 1070" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 7 1660 1100" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Commit the core clock states
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Memory clock states: "m <state> <clock MHz> <voltage mV>"
echo "m 0 167 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 1 500 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 2 800 950" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 3 1000 960" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Commit the memory clock states
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Manual fan control, PWM value 200 (out of 255)
echo 1 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1_enable
echo 200 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1

Haxk20 commented 5 years ago

Weird, because when I apply your "fix", my states start working properly. But this really isn't the proper solution, as the driver should switch states normally without me having to change the first state. My workaround was to set the performance level to high, which forces the highest state on core and memory. Without it, the auto performance level keeps the clocks at the lowest state. Only when I apply your fix do my core clocks start moving to other states, but the memory clocks are still stuck. I would file a bug, but to be honest I don't know where.

BoukeHaarsma23 commented 5 years ago

I was able to reproduce this. However, since I will not write any overclock without the user's knowledge, I updated the FAQ (7bae16e589895437a943fe89cb57f447b415e40e) and will close this.

GloriousEggroll commented 5 years ago

The option doesn't write an overclock; it only increases the range the card is able to extend to. It does not change any of the card's default clock speeds.

Example:
My p7 default clock speed and voltage are 1590/1200. Should I choose to overclock, I can increase it up to 1630 by default without raising any upper limit.

If I add the option specified, I can overclock further than that, but only if I manually change the states.

I can set the option and still not touch the states and the speeds remain at default for all states.

If you do nothing but execute that option, the max clock for p7 will still remain at the 1590 default unless manually set by the user. All it does is allow it to go past 1630 -if- manually set.

It's like putting my size 11 foot into a size 12 shoe vs. a size 13 shoe. My foot stays the same size unless I somehow magically grow it myself.

BoukeHaarsma23 commented 5 years ago

Then I think we misunderstand each other. Or I don't understand everything properly.

Does

echo "10" > /sys/class/drm/card0/device/pp_sclk_od

not overclock your GPU core by 10%?

GloriousEggroll commented 5 years ago

No. It increases the max limit that it is able to clock to IF manually set.

Think of it like this: you put the GPU in a box. You then take the GPU out and put it in a box that is 10% bigger. The GPU (core) still stays the same size (clock) unless you manually set it to a larger size (clock).

So if I wanted to overclock the card to 1640, I would have to both set the 10% limit and then set the p7 state clock speed to 1640. Without setting the p7 clock speed, it stays at 1590 even though you now have the option to make it go to 1640.
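The two steps described here can be sketched as a small dry-run script. The names `sysfs_write`, `raise_limit_and_set_state`, `DEVICE` and `DRY_RUN` are hypothetical; the sysfs paths and the "s <state> <MHz> <mV>" / "c" syntax are taken from the script earlier in this thread. By default the sketch only prints the writes it would perform, so nothing touches the hardware unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Hypothetical sketch of the two-step overclock described in this thread.
# With DRY_RUN=1 (the default here) each write is only printed, not performed.
DEVICE="${DEVICE:-/sys/class/drm/card0/device}"
DRY_RUN="${DRY_RUN:-1}"

# Print the write in dry-run mode, otherwise perform it.
#   $1 = value, $2 = sysfs path
sysfs_write() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'echo "%s" > %s\n' "$1" "$2"
    else
        echo "$1" > "$2"
    fi
}

#   $1 = limit percent, $2 = state index, $3 = clock (MHz), $4 = voltage (mV)
raise_limit_and_set_state() {
    # Step 1: raise the ceiling. Per the explanation in this thread, this
    # alone is not what puts a higher clock into the state.
    sysfs_write "$1" "$DEVICE/pp_sclk_od"
    # Step 2: switch to manual mode, edit the one state, then commit with "c".
    sysfs_write "manual" "$DEVICE/power_dpm_force_performance_level"
    sysfs_write "s $2 $3 $4" "$DEVICE/pp_od_clk_voltage"
    sysfs_write "c" "$DEVICE/pp_od_clk_voltage"
}
```

For the 1640 example, `raise_limit_and_set_state 10 7 1640 1100` would print four writes. The 1100 mV figure is borrowed from state 7 in the script posted earlier and is only a placeholder, not a recommended voltage for 1640.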

BoukeHaarsma23 commented 5 years ago

I see, I misread the documentation on this. Thanks for pointing this out!

BoukeHaarsma23 commented 5 years ago

However, this does suggest otherwise: https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-OverDrive-Linux-4.15

GloriousEggroll commented 5 years ago

He used it with the automatic default scaling values. The option should not be set unless manual values are also being set; otherwise the GPU will try to auto-scale its overclock to the max value in the range. So what you would want to do is turn it on only when the user enables manual clock setting mode, and set it back to 0 when they disable manual clock mode.
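A rough sketch of that toggle, assuming a pair of hypothetical functions `enable_manual_clocks` and `disable_manual_clocks` that take the device directory as their first argument. Returning the performance level to "auto" on disable is an assumption added here; the suggestion above only asks for `pp_sclk_od` to go back to 0. This is not how WattmanGTK implements it, just one way the idea could look in shell.

```shell
#!/bin/bash
# Hypothetical toggle pair for the behaviour suggested above.

# Raise the limit only when the user opts into manual clocks.
#   $1 = device directory, $2 = limit percent (optional, defaults to 10)
enable_manual_clocks() {
    local device="$1"
    local percent="${2:-10}"
    echo "$percent" > "$device/pp_sclk_od"
    echo "manual" > "$device/power_dpm_force_performance_level"
}

# Undo it when manual clocks are switched off again.
#   $1 = device directory
disable_manual_clocks() {
    local device="$1"
    # Assumption: hand control back to the driver's automatic scaling first...
    echo "auto" > "$device/power_dpm_force_performance_level"
    # ...then drop the raised limit back to 0, as suggested in the thread.
    echo "0" > "$device/pp_sclk_od"
}
```

On real hardware the first argument would be /sys/class/drm/card0/device and both calls would need root.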

BoukeHaarsma23 commented 5 years ago

Ah, gotcha, working on it ;) This first needs a root implementation, though, so I'm working on that first.