kobalicek / amdtweak

A library that can be used to manipulate ATOM and PowerPlay (PP) tables of AMDGPUs

Cannot write to pp table #4

Closed tacchinotacchi closed 6 years ago

tacchinotacchi commented 7 years ago

I'm running Linux 4.12 (with the proprietary OpenCL kernel from amdgpu-pro, but it doesn't matter, right?) with an RX 470. While trying to reproduce the example in the README, I noticed I can't write to PowerTuneTable.TDP or to PowerTuneTable.ConfigurableTDP. As soon as I do, the pp table becomes unresponsive until reboot: cat /sys/class/drm/card0/device/pp_table never returns, amdtweak hangs forever if used to read it, and radeon-profile hangs as well. Also, Unigine Valley slows down from 40 fps to 10. Is it because the clocks drop to the minimum? I don't know, I have no way to check without the pp table lol. Any clue? I haven't tried writing any other variable yet, but I suspect it will be the same.

kobalicek commented 7 years ago

Hello, I would definitely try running dmesg after updating the table; maybe it's a driver issue. I was also using the amdgpu-pro drivers and it worked for me on RX 480 cards, but the drivers were extremely buggy and sometimes crashed (which meant restarting the machine in the best case).

I'm not sure if this will ever be fixed, as AMD doesn't really care about older cards.

I would also try writing the table back without any changes, just to check whether the driver works. When you write to pp_table the driver reinitializes its hardware manager instance; maybe the bug is in there.
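A minimal sketch of that no-change write test, assuming the sysfs path quoted earlier in the thread (`card0` may differ on multi-GPU systems) and root privileges. The function name `rewrite_unchanged` is hypothetical and is not part of amdtweak. Given the hang described above, this is best run on a machine that can be rebooted.

```python
# Path taken from the thread; card0 is an assumption for single-GPU systems.
PP_TABLE = "/sys/class/drm/card0/device/pp_table"


def rewrite_unchanged(path: str = PP_TABLE) -> bytes:
    """Read the table and write the identical bytes back (needs root on sysfs)."""
    with open(path, "rb") as f:
        original = f.read()
    # Unbuffered, so the whole table is handed over in a single write() call.
    with open(path, "wb", buffering=0) as f:
        written = f.write(original)
    if written != len(original):
        raise OSError(f"short write: {written} of {len(original)} bytes")
    return original
```

If the driver still complains in dmesg after this call, no tool has modified the table, which points at the driver rather than at the editing tool.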

Also, trying other tools like OhGodATool could help to determine whether it's a driver issue or not. I don't have any AMD GPUs left, so I cannot really test it anymore.

Perfect-Web commented 7 years ago

I have the same problem. With the latest drivers, I try to change the TDP, or anything else, or even nothing, and when the pp table is written I get:

[   72.501400] amdgpu: [powerplay] Invalid PowerPlay Table!
[   72.501401] amdgpu: [powerplay] init_clock_voltage_dependency failed
[   72.501402] amdgpu: [powerplay] amdgpu: powerplay initialization failed
[  122.541609] BUG: unable to handle kernel NULL pointer dereference at           (null)

This doesn't happen with OhGodATool.

kobalicek commented 7 years ago

Okay guys, I don't know what the problem is, but I wrote a tool to compare input/output without any changes and noticed that they don't match. I must have introduced the bug while rewriting binlib. I will check it out and add some tests so it won't happen again.
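The comparison tool itself is not shown in this thread. As a rough sketch of the idea only, a round-trip check could compare the original table bytes with the re-serialized bytes and report where they first diverge. The helper `first_mismatch` is hypothetical, and the re-serialized bytes are assumed to come from whatever writes the table back out.

```python
from typing import Optional


def first_mismatch(original: bytes, rewritten: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rewritten)):
        if a != b:
            return offset
    if len(original) != len(rewritten):
        # One blob is a prefix of the other: they diverge where the shorter ends.
        return min(len(original), len(rewritten))
    return None
```

A result of `None` for an unmodified table means the read/write path preserved every byte; any offset indicates the field where serialization changed the data.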

kobalicek commented 7 years ago

Committed a fix, tested locally on some PP tables that I have, please check it out.

kobalicek commented 6 years ago

Closing, this should be fixed now.

tacchinotacchi commented 6 years ago

Actually, the problem still persists. I'd have liked to send the kernel log, but my dmesg seems to be clogged by random input events.

EDIT: here you go

[ 349.122250] amdgpu: [powerplay] Unsupported fan table format!
[ 349.122251] amdgpu: [powerplay] init_thermal_controller failed
[ 349.122252] amdgpu: [powerplay] amdgpu: powerplay initialization failed
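When the kernel log is cluttered with unrelated events, the driver's lines can be filtered out before posting. The following is a sketch, assuming `dmesg` is on the PATH and readable by the current user (some systems restrict it to root). The helper names `driver_lines` and `read_dmesg` are hypothetical.

```python
import subprocess
from typing import List


def driver_lines(log_text: str, needle: str = "amdgpu") -> List[str]:
    """Keep only the kernel log lines that mention the given driver tag."""
    return [line for line in log_text.splitlines() if needle in line]


def read_dmesg() -> str:
    """Capture the kernel log; may need root if dmesg access is restricted."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
```

Calling `driver_lines(read_dmesg())` would return only lines like the ones above. Note that a filter on `amdgpu` would drop lines such as the NULL pointer BUG report quoted earlier in the thread, so it is a starting point rather than the full picture.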