HackTestes / NVML-GPU-Control

A small program that uses the NVIDIA Management Library to control the GPU independent of OS or display server. This project is not provided or endorsed by NVIDIA
GNU General Public License v2.0

Control not working. #3

Open g8392 opened 1 week ago

g8392 commented 1 week ago

None of the controls seem to work, but running "sudo python3 nvml_gpu_control.py fan-policy --auto -n 'NVIDIA GeForce RTX 3090'" does slow the fans for a moment (my problem is that they spin fast all the time), and then the fans speed up again.

example of the output while doing this:

LOG[2024-09-23 02:16:45]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:46]: Current temp: 28°C
LOG[2024-09-23 02:16:46]: Current speed: 50%
LOG[2024-09-23 02:16:46]: Fan controller speed 0: 50%
LOG[2024-09-23 02:16:46]: Fan controller speed 1: 50%
LOG[2024-09-23 02:16:46]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:47]: Current temp: 28°C
LOG[2024-09-23 02:16:47]: Current speed: 50%
LOG[2024-09-23 02:16:47]: Fan controller speed 0: 50%
LOG[2024-09-23 02:16:47]: Fan controller speed 1: 50%
LOG[2024-09-23 02:16:47]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:48]: Current temp: 28°C
LOG[2024-09-23 02:16:48]: Current speed: 50%
LOG[2024-09-23 02:16:48]: Fan controller speed 0: 50%
LOG[2024-09-23 02:16:48]: Fan controller speed 1: 50%
LOG[2024-09-23 02:16:48]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:49]: Current temp: 28°C
LOG[2024-09-23 02:16:49]: Current speed: 50%
LOG[2024-09-23 02:16:49]: Fan controller speed 0: 50%
LOG[2024-09-23 02:16:49]: Fan controller speed 1: 50%
LOG[2024-09-23 02:16:49]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:51]: Current temp: 28°C
LOG[2024-09-23 02:16:51]: Current speed: 33%
LOG[2024-09-23 02:16:51]: Fan controller speed 0: 33%
LOG[2024-09-23 02:16:51]: Fan controller speed 1: 37%
LOG[2024-09-23 02:16:51]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:52]: Current temp: 28°C
LOG[2024-09-23 02:16:52]: Current speed: 30%
LOG[2024-09-23 02:16:52]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:52]: Fan controller speed 1: 30%
LOG[2024-09-23 02:16:52]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:53]: Current temp: 28°C
LOG[2024-09-23 02:16:53]: Current speed: 30%
LOG[2024-09-23 02:16:53]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:53]: Fan controller speed 1: 30%
LOG[2024-09-23 02:16:53]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:54]: Current temp: 28°C
LOG[2024-09-23 02:16:54]: Current speed: 30%
LOG[2024-09-23 02:16:54]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:54]: Fan controller speed 1: 34%
LOG[2024-09-23 02:16:54]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:55]: Current temp: 28°C
LOG[2024-09-23 02:16:55]: Current speed: 30%
LOG[2024-09-23 02:16:55]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:55]: Fan controller speed 1: 39%
LOG[2024-09-23 02:16:55]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:56]: Current temp: 28°C
LOG[2024-09-23 02:16:56]: Current speed: 30%
LOG[2024-09-23 02:16:56]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:56]: Fan controller speed 1: 43%
LOG[2024-09-23 02:16:56]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:57]: Current temp: 28°C
LOG[2024-09-23 02:16:57]: Current speed: 30%
LOG[2024-09-23 02:16:57]: Fan controller speed 0: 30%
LOG[2024-09-23 02:16:57]: Fan controller speed 1: 45%
LOG[2024-09-23 02:16:57]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:58]: Current temp: 28°C
LOG[2024-09-23 02:16:58]: Current speed: 32%
LOG[2024-09-23 02:16:58]: Fan controller speed 0: 32%
LOG[2024-09-23 02:16:58]: Fan controller speed 1: 47%
LOG[2024-09-23 02:16:58]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:16:59]: Current temp: 28°C
LOG[2024-09-23 02:16:59]: Current speed: 34%
LOG[2024-09-23 02:16:59]: Fan controller speed 0: 34%
LOG[2024-09-23 02:16:59]: Fan controller speed 1: 48%
LOG[2024-09-23 02:16:59]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:17:00]: Current temp: 28°C
LOG[2024-09-23 02:17:00]: Current speed: 35%
LOG[2024-09-23 02:17:00]: Fan controller speed 0: 35%
LOG[2024-09-23 02:17:00]: Fan controller speed 1: 49%
LOG[2024-09-23 02:17:00]: Found no temperature match, using default fan speed: 50
LOG[2024-09-23 02:17:01]: Current temp: 28°C
LOG[2024-09-23 02:17:01]: Current speed: 37%
LOG[2024-09-23 02:17:01]: Fan controller speed 0: 37%
LOG[2024-09-23 02:17:01]: Fan controller speed 1: 49%

Uses proprietary NVIDIA driver

Driver version: 560.35.03

GPU: NVIDIA GeForce RTX 3090

Operating system and version: Pop!_OS 22.04 LTS

Display server: X11/Xorg

Python version: Python 3.10.14

HackTestes commented 3 days ago

Let me see what is going on.

Let me start with the command you provided:

sudo python3 nvml_gpu_control.py fan-policy --auto -n 'NVIDIA GeForce RTX 3090'

This command essentially disables manual control of the fans, giving it back to the vBIOS on the card. The help output in the README: "--auto Sets the fan policy to automatic (vBIOS controlled)".

"does slow the fans for a moment (my problem is that they spin fast all the time) and then the fans start spinning again"

The behavior you are getting depends on the vBIOS configuration and the workload you were running at the time. (I also hope you are not running multiple instances of this program, as they would all be sending conflicting commands to the GPU.) I might need more info:

The log output

If you are using actions such as "fan-control" or "control-all" and are getting changes in speed, it might be a fan speed threshold. Example: my card turns off the fans if I set them to anything below 48%, so the fans keep turning off and on again, which makes the speed oscillate a lot. On my card only commands for 0% or 50% take effect, so 30% is not possible.
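To check a pasted log for this kind of oscillation, it can help to pull the per-controller speeds out of it. The sketch below is a minimal, standard-library-only helper and is not part of nvml_gpu_control.py. The names `fan_speed_series` and `speed_swing` are hypothetical, and the regular expression assumes the "Fan controller speed N: X%" line format seen in the log in this issue.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed log format: "LOG[<timestamp>]: Fan controller speed <n>: <x>%"
FAN_LINE = re.compile(
    r"LOG\[[^\]]+\]: Fan controller speed (?P<fan>\d+): (?P<speed>\d+)%"
)


def fan_speed_series(log_text: str) -> Dict[int, List[int]]:
    """Collect the reported speed of each fan controller, in log order.

    finditer is used instead of line splitting so that logs pasted as one
    long line are parsed the same way as logs with one entry per line.
    """
    series: Dict[int, List[int]] = defaultdict(list)
    for match in FAN_LINE.finditer(log_text):
        series[int(match.group("fan"))].append(int(match.group("speed")))
    return dict(series)


def speed_swing(series: Dict[int, List[int]]) -> Dict[int, int]:
    """Return max - min reported speed per fan controller."""
    return {fan: max(speeds) - min(speeds) for fan, speeds in series.items()}


# A few entries copied from the log in this issue.
SAMPLE = (
    "LOG[2024-09-23 02:16:49]: Fan controller speed 0: 50% "
    "LOG[2024-09-23 02:16:49]: Fan controller speed 1: 50% "
    "LOG[2024-09-23 02:16:52]: Fan controller speed 0: 30% "
    "LOG[2024-09-23 02:16:52]: Fan controller speed 1: 30% "
    "LOG[2024-09-23 02:17:01]: Fan controller speed 0: 37% "
    "LOG[2024-09-23 02:17:01]: Fan controller speed 1: 49%"
)

series = fan_speed_series(SAMPLE)
print(series)               # {0: [50, 30, 37], 1: [50, 30, 49]}
print(speed_swing(series))  # {0: 20, 1: 20}
```

A large swing while the temperature stays constant suggests the reported speed is not settling on one value. It does not say why, so the vBIOS configuration and the fan policy still need to be checked separately.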

I also might add a new action just to print the current fan policy, since it is only printed when you try to change the fan policy.
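As a rough idea of what such an action could print, here is a minimal sketch. It is not the project's implementation. The two policy values are the ones I understand nvml.h to define (NVML_FAN_POLICY_TEMPERATURE_CONTINOUS_SW = 0 and NVML_FAN_POLICY_MANUAL = 1), so verify them against your NVML version. The actual NVML query is left to a caller-supplied function, because the exact Python binding signature is an assumption I do not want to bake in. `policy_label` and `fan_policy_report` are hypothetical names.

```python
from typing import Callable, List

# Assumed values from nvml.h; check them against your NVML headers.
FAN_POLICY_AUTO = 0    # NVML_FAN_POLICY_TEMPERATURE_CONTINOUS_SW
FAN_POLICY_MANUAL = 1  # NVML_FAN_POLICY_MANUAL


def policy_label(policy: int) -> str:
    """Translate a raw fan policy value into a readable label."""
    if policy == FAN_POLICY_AUTO:
        return "automatic (vBIOS controlled)"
    if policy == FAN_POLICY_MANUAL:
        return "manual"
    return f"unknown ({policy})"


def fan_policy_report(num_fans: int, read_policy: Callable[[int], int]) -> List[str]:
    """Build one report line per fan controller.

    read_policy is a hypothetical callback that returns the raw policy
    value for a given fan index, for example a thin wrapper around
    whatever NVML binding is in use.
    """
    return [
        f"Fan controller policy {fan}: {policy_label(read_policy(fan))}"
        for fan in range(num_fans)
    ]


# Usage with a stand-in callback instead of a real GPU query.
fake_policies = [FAN_POLICY_AUTO, FAN_POLICY_MANUAL]
for line in fan_policy_report(2, lambda fan: fake_policies[fan]):
    print(line)
```

Keeping the query behind a callback means the formatting can be tested without a GPU present.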