foucault / nvfancontrol

NVidia dynamic fan control for Linux and Windows
GNU General Public License v3.0

XNVCtrl SetAttr(THERMAL_COOLER_LEVEL) failed #31

Closed project0 closed 3 years ago

project0 commented 3 years ago

Not sure what the problem is here. I recently updated my (Arch Linux) system and it stopped working. The only relation I can see is the jump of the NVIDIA driver from 460 to 465.

nvfancontrol -d
WARN - No config file found; using default curve
DEBUG - Default configuration loaded
DEBUG - [[gpu]]
id = 0
enabled = true
points = [[41, 20], [49, 30], [57, 45], [66, 55], [75, 63], [78, 72], [80, 80]]
DEBUG - Curve points: [(41, 20), (49, 30), (57, 45), (66, 55), (75, 63), (78, 72), (80, 80)]
INFO - NVIDIA driver version: 465.24.02
INFO - NVIDIA graphics adapter #0: NVIDIA GeForce GTX 1070
INFO -   GPU #0 coolers: COOLER-0
ERROR - Could not update fan speed: XNVCtrl SetAttr(THERMAL_COOLER_LEVEL) failed; error 0
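
For readers unfamiliar with the curve points in the log above: each pair is (temperature in °C, fan speed in %). The following is only a rough sketch of how such a curve could be evaluated, assuming linear interpolation between neighbouring points and clamping outside the curve. The helper `speed_for_temp` is hypothetical and is not taken from the nvfancontrol source.

```rust
/// Fan speed (percent) for a temperature, linearly interpolated over
/// (temperature °C, speed %) points sorted by temperature.
/// Hypothetical helper; not taken from the nvfancontrol source.
fn speed_for_temp(points: &[(u16, u16)], temp: u16) -> Option<u16> {
    let (first, last) = (points.first()?, points.last()?);
    // Assumption: clamp to the end points outside the curve.
    if temp <= first.0 {
        return Some(first.1);
    }
    if temp >= last.0 {
        return Some(last.1);
    }
    // Find the segment [lo, hi] that contains temp and interpolate on it.
    points.windows(2).find_map(|w| {
        let (lo, hi) = (w[0], w[1]);
        if temp >= lo.0 && temp <= hi.0 && hi.0 > lo.0 {
            let frac = (temp - lo.0) as f32 / (hi.0 - lo.0) as f32;
            let speed = lo.1 as f32 + frac * (hi.1 as f32 - lo.1 as f32);
            Some(speed.round() as u16)
        } else {
            None
        }
    })
}

fn main() {
    // The default curve printed in the debug log.
    let curve: [(u16, u16); 7] =
        [(41, 20), (49, 30), (57, 45), (66, 55), (75, 63), (78, 72), (80, 80)];
    for t in [35u16, 45, 70, 90] {
        println!("{} °C -> {:?} %", t, speed_for_temp(&curve, t));
    }
}
```

Under these assumptions, 45 °C falls halfway between (41, 20) and (49, 30) and maps to 25 %. None of this matters while the driver rejects the SetAttr call, which is the actual failure here.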
foucault commented 3 years ago

Hello! There are literally no differences in XNVCtrl (the NVIDIA interface library) between 460.xx and 465.xx, so this shouldn't happen. Can you please verify that your xorg.conf is indeed correct? Can you also check the output of nvfancontrol -d -m?

See below ↓

foucault commented 3 years ago

You can't set the fan speed from nvidia-settings either. Stop nvfancontrol, open nvidia-settings, then go to GPU-0 → Thermal Settings → Enable GPU Fan Settings → set it to 60 or so → click Apply → get "Failed to set new Fan Speed!". Given that there have been NO differences in the XNVCtrl library between the two versions, this is most likely a driver bug. It looks like it's affecting overclocking as well [1], [2], [3].

jlarmstrong commented 3 years ago

Just confirming here that I am also seeing this issue in the latest 460.73.1 release.

foucault commented 3 years ago

Possibly some of the code in 465 was backported into the 460 branch. I highly doubt we'll see any fix for that before 470.

vandonsel commented 3 years ago

I am able to set the fan speed in nvidia-settings, but I am also getting a very similar error. Please let me know if this warrants a new ticket:

INFO - Loading configuration file: "/home/d/.config/nvfancontrol.conf"
DEBUG - Curve points: [(30, 20), (40, 30), (50, 50), (60, 70), (70, 100)]
INFO - NVIDIA driver version: 465.24.02
INFO - NVIDIA graphics adapter #0: NVIDIA GeForce GTX 1050 Ti
INFO -   GPU #0 coolers: COOLER-0
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "XNVCtrl QueryAttr(COOLER_SPEED) failed; error 0"', src/main.rs:715:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
DEBUG - Resetting fan control

foucault commented 3 years ago

Try putting nvfancontrol in monitor mode (nvfancontrol -d -m) and change the values from nvidia-settings. Is there any change in the output of nvfancontrol?

vandonsel commented 3 years ago

Looks like I get the same error when running it with those arguments:

[d@virtlove ~]$ nvfancontrol -d -m
INFO - Loading configuration file: "/home/d/.config/nvfancontrol.conf"
DEBUG - Curve points: [(30, 20), (40, 30), (50, 50), (60, 70), (70, 100)]
INFO - NVIDIA driver version: 465.27
INFO - NVIDIA graphics adapter #0: NVIDIA GeForce GTX 1050 Ti
INFO -   GPU #0 coolers: COOLER-0
INFO - Option "-m" is present; curve will have no actual effect
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "XNVCtrl QueryAttr(COOLER_SPEED) failed; error 0"', src/main.rs:715:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[d@virtlove ~]$

foucault commented 3 years ago

Did it work with an older driver? I cannot reproduce the problem. It looks like the NVIDIA driver fails to respond when queried for one specific attribute, which would happen if the attribute we are looking for (NV_CTRL_THERMAL_COOLER_SPEED) does not exist at all. I don't think this is directly related to the bug outlined here, but it might be if it was working before (pre-465).
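
The panic in the logs above comes from calling `Result::unwrap()` on an `Err` value. As a minimal sketch, assuming the query returns a `Result<_, String>` as the panic message suggests, a failed query could be reported without aborting. The functions `query_cooler_speed` and `describe_speed` are hypothetical stand-ins, not nvfancontrol's actual API.

```rust
/// Hypothetical stand-in for the XNVCtrl query. It always fails here,
/// mimicking the error string from the log.
fn query_cooler_speed(_gpu: u32) -> Result<i32, String> {
    Err("XNVCtrl QueryAttr(COOLER_SPEED) failed; error 0".to_string())
}

/// Builds one monitoring line, reporting an unavailable attribute
/// instead of panicking on it.
fn describe_speed(gpu: u32) -> String {
    match query_cooler_speed(gpu) {
        Ok(value) => format!("GPU #{} fan speed: {}", gpu, value),
        Err(e) => format!("GPU #{} fan speed unavailable: {}", gpu, e),
    }
}

fn main() {
    println!("{}", describe_speed(0));
}
```

This only changes how the failure surfaces. Whether the driver exposes the attribute at all is a separate question.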

reytiger commented 3 years ago

I experienced "ERROR - Could not update fan speed: XNVCtrl SetAttr(THERMAL_COOLER_LEVEL) failed; error 0" on Arch Linux and found that I could not alter the fan settings in nvidia-settings either.

Kernel: 5.12.6-arch1-1
Driver: nvidia 465.31-4
Util: nvidia-settings 465.27-1

Following these steps fixed my issue, indicating this was a configuration/permissions problem for rootless Xorg.

In particular, /etc/X11/Xwrapper.config was missing on my system.
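
The linked steps are not reproduced in this thread. The sketch below assumes they amount to setting `needs_root_rights = yes` in `/etc/X11/Xwrapper.config`, a key documented for Xorg.wrap. It only checks whether that key is present and what it is set to. The helper `needs_root_rights` is hypothetical and the parsing is deliberately simplistic.

```rust
/// Returns the value of `needs_root_rights` from Xwrapper.config-style
/// text, or None if the key is absent. Hypothetical helper.
fn needs_root_rights(contents: &str) -> Option<String> {
    contents
        .lines()
        .filter_map(|line| line.split_once('='))
        .find(|(key, _)| key.trim() == "needs_root_rights")
        .map(|(_, value)| value.trim().to_string())
}

fn main() {
    // A missing file is treated like an empty one: the key is simply unset.
    let contents =
        std::fs::read_to_string("/etc/X11/Xwrapper.config").unwrap_or_default();
    match needs_root_rights(&contents) {
        Some(value) => println!("needs_root_rights = {}", value),
        None => println!("needs_root_rights is not set"),
    }
}
```

On a system where the file is missing, as described above, this would report that the key is not set.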

foucault commented 3 years ago

It's a shame that you need to run X as root to make this work. In any case, this is not an nvfancontrol bug. I'll update the README. Thanks for reporting and investigating!