This is a small program that uses the NVIDIA Management Library (NVML) to monitor GPU temperature and set fan speed. NVML is being used, because it is OS and display sever agnostic (that means it doesn't depend on X11 or Windows). Another important reason is that the official NVIDIA tool (NVIDIA smi) does not currently support fan control.
To use it, you must have installed:
You will also need admin/root privileges to be able to set the fan speed.
Because of multiple reasons:
Now that NVIDIA added the functions to work on any CUDA supported card on drivers equal or higher than v520 (see Change Log here), it is possible to control GeForce cards' fans through NVML! This means that I can get perfect Wayland support as well, since NVML doesn't depend on a display server.
Note: you may need to adapt the path of some of the commands
git clone https://github.com/HackTestes/NVML-GPU-Control NVML_GPU_Control
The next part requires admin/root permissions
# Windows
mkdir 'C:\Program Files\User_NVIDIA_GPU_Control\'
mkdir '/usr/bin/User_NVIDIA_GPU_Control/'
3. Copy the scripts files from the repository to the new directory
cp 'C:\Path_to_the_repository\NVML_GPU_Control\src*' 'C:\Program Files\User_NVIDIA_GPU_Control\'
cp '/Path_to_the_repository/NVML_GPU_Control/src/*' '/usr/bin/User_NVIDIA_GPU_Control\'
**Additional notes**: you may also need to install the library as admin or install it as a normal user and then lock the files(change the permissions and take ownership as root/admin).
### Uninstall
You only need to remove the directory (BE EXTRA CAREFUL WITH THE *RM* COMMAND). You can also use the GUI to simply delete the directory if you find that easier and safer.
Useful docs (read before running the commands):
* [Remove-Item](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/remove-item?view=powershell-7.4)
* [rm man page](https://man7.org/linux/man-pages/man1/rm.1.html)
Remove-Item -Confirm -Force -Recurse -Path 'C:\Program Files\User_NVIDIA_GPU_Control\'
rm --interactive --preserve-root -R '/usr/bin/User_NVIDIA_GPU_Control'
## How to use
- Make sure to run with the working directory being the `.\src`
cd ./src
- You must first list all cards that are connected, so you can get the name or UUID
python.exe ./nvml_gpu_control.py list
- Then you can select a target by name
python.exe ./nvml_gpu_control.py fan-control -n 'NVIDIA GeForce RTX 4080'
- And the fan speed for each temperature level
sudo python.exe ./nvml_gpu_control.py fan-control -n 'NVIDIA GeForce RTX 4080' -sp '10:35,20:50,30:50,35:100'
- You could also use the `--dry-run` for testing!
python.exe ./nvml_gpu_control.py fan-control -n 'NVIDIA GeForce RTX 4080' -sp '10:35,20:50,30:50,35:100' --dry-run
- You can also revert to the original state
python.exe ./nvml_gpu_control.py fan-policy --auto -n 'NVIDIA GeForce RTX 4080'
Note that it does not current support fan curve (or linear progression), so it works on levels. Each level the temperature is verified against the configuration (higher or equal) and then set properly. Also, each temperature associated with speed is ordered automatically. (think of it as a staircase graph)
Temp : speed(%)
40 : 100 (>=40°C - 100%)
30 : 50 (>=30°C - 50%)
20 : 30 (>=20°C - 30%)
Default speed (DS)
41°C - 100%
21°C - 30%
19°C - Default speed
#### Usage docs
python.exe .\nvml_gpu_control.py
ACTIONS help Display help text
list
List all available GPUs connected to the system by printing its name and UUID
fan-control
Monitor and controls the fan speed of the selected card (you must select a target card)
fan-info
Shows information about fan speed
fan-policy <--auto|--manual>
Changes the fan control policy to automatic (vBIOS controlled) or manual. Note that when the fan speed is changed, the NVML library automatically changes this setting to manual. This setting is useful to change the GPU back to its original state
fan-policy-info
Shows information about the current fan policy
power-limit-info
Shows information about the power limit of the selected GPU
power-control
Controls the power limit of the selected GPU. It runs in a loop by default, but can run once using the --single-use option
thresholds-info
Shows information about temperature thresholds in dregrees Celsius of the selected GPU.
temp-control
Controls the temperature thresholds configuration of the selected GPU. It runs in a loop by default, but can run once using the --single-use option
control-all
Allows the use of all controls in a single command/loop
OPTIONS
--name OR -n <GPU_NAME>
Select a target GPU by its name. Note: UUID has preference over name
--uuid OR -id <GPU_UUID>
Select a target GPU by its Universally Unique IDentifier (UUID). Note: UUID has preference over name
--time-interval OR -ti <TIME_SECONDS>
Time period to wait before probing the GPU again. Works for all actions that run in a loop
--dry-run OR -dr
Run the program, but don't change/set anything. Useful for testing the behavior of the program
--speed-pair OR -sp <TEMP_CELSIUS:SPEED_PERCENTAGE,TEMP_CELSIUS:SPEED_PERCENTAGE...>
A comma separated list of pairs of temperature in celsius and the fan speed in % (temp:speed) defining basic settings for a fan curve
--default-speed OR -ds <FAN_SPEED_PERCENTAGE>
Set a default speed for when there is no match for the fan curve settings
--manual
Sets the fan policy to manual
--auto
Sets the fan policy to automatic (vBIOS controlled)
--power-limit OR -pl <POWER_LIMIT_WATTS>
Sets the power limit of the GPU in watts
--acoustic-temp-limit OR -tl <TEMPERATURE_CELSIUS>
Sets the acoustic threshold in celsious (note that this is the same temperature limit used by GeForce Experience)
--single-use OR -su
Makes some actions work only once insted of in a loop. This option is valid for: temp-control and power-control
##### Running tests
python.exe ./src/tests.py -b
### Setting up services or tasks (under development)
This section will present some simple commands to setup services or tasks that start as admin and run the configured program with the configured settings. You should secure the files under an admin only folder, so only authorized programs can modify the scripts (and DON'T use SUID in Linux).
#### Windows
Please, check Microsoft's documentation:
- [Task scheduler](https://learn.microsoft.com/en-us/windows/win32/taskschd/task-scheduler-start-page)
- [Task scheduler command line](https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/schtasks)
Since this program does not implement the service API, it will be using scheduled tasks to run at startup. There will be presented a GUI and a command line guide to how to do the setup:
##### GUI
1. Make sure to have the script files at a path only accessible to admin users. This guide will be using `C:\Program Files\User_NVIDIA_GPU_Control\`
2. Open Task Scheduler as an admin (you might need to select a admin user)
3. Click on `create task` (do not confuse it for the create **simple** task)
![TaskScheduler_CreateTask](img/TaskScheduler_CreateTask.png)
4. General tab -> Write the service name. This guide will use: `User NVIDIA GPU Control Task`
5. General tab -> Write a description. This guide will use: `This task runs a daemon at startup responsible for controling NVIDIA GPUs' fans and power`
6. General tab -> Mark the box containing `Run whether the user is logged or not`
7. General tab -> Mark the box containing `Do not store password`
8. General tab -> Mark the box containing `Run with highest privileges`
![TaskScheduler_GeneralTab](img/TaskScheduler_GeneralTab.png)
9. Triggers tab -> Create a new trigger and change the `Begin the task` to `At Startup` (make sure to leave the Enabled box marked)
![TaskScheduler_TriggersTab](img/TaskScheduler_TriggersTab.png)
![TaskScheduler_TriggersTab_2](img/TaskScheduler_TriggersTab_2.png)
10. Actions tab -> Create a new action and select the `action` `Start a program`
11. Actions tab -> In the `Program/script` put the path of the python executable. This guide wil use `"C:\Program Files\Python312\python.exe"` (Note that some python versions may have a different directory name and make sure only admin users can change the executable and the folder) - the double quotes are necessary
12. Actions tab -> In the `Add arguments (optional)`, add the script path and the desired settings. This guide will use the following args: `"C:\Program Files\User_NVIDIA_GPU_Control\nvml_gpu_control.py" "fan-control" "-n" "NVIDIA GeForce RTX 4080" "-sp" "10:0,20:50,35:100"`
or
"C:\Program Files\User_NVIDIA_GPU_Control\nvml_gpu_control.py" "control-all" "-n" "NVIDIA GeForce RTX 4080" "-pl" "305" "-tl" "65" "-sp" "10:0,20:50,35:100"
13. Actions tab -> In the `Start in (optional)`, add the script path directory. This guide will use the following args: `C:\Program Files\User_NVIDIA_GPU_Control`
![TaskScheduler_ActionsTab](img/TaskScheduler_ActionsTab.png)
14. Conditions tab -> Leave all boxes UNmarked
![TaskScheduler_ConditionsTab](img/TaskScheduler_ConditionsTab.png)
15. Settings tab -> Mark the box in `Allow task to be run on demand`
16. Settings tab -> UNmark the box in `Stop task if it runs longer than`
17. Settings tab -> Mark the box in `If the running task does not end when requested, force it to stop`
18. Settings tab -> In the `If the task is already running, then the following rule applies`, select the `Do not start a new instance`
![TaskScheduler_SettingsTab](img/TaskScheduler_SettingsTab.png)
##### Command line (Not recommended and untested)
Some users might find easier to simply run a command, however, it is important to warn about two things:
1. The command line utility has less features than the GUI version;
2. If you are unsure of what the command does, please check MS's documentation before running it (especially because you must run it with admin permissions)
1. Open a terminal with admin permissions
2. Write the following command: `schtasks /create /tn 'User NVIDIA GPU Control Task' /tr 'C:\Program Files\Python312\python.exe C:\Program Files\User_NVIDIA_GPU_Control\nvml_gpu_control.py fan-control -t "NVIDIA GeForce RTX 4080" -sp "10:0,20:47,30:50,35:100"' /sc ONSTART /np /rl HIGHEST`
Another formatting
schtasks /create /tn 'User NVIDIA GPU Control Task' /tr 'C:\Program Files\Python312\python.exe C:\Program Files\User_NVIDIA_GPU_Control\nvml_gpu_control.py fan-control -n "NVIDIA GeForce RTX 4080" -sp "10:0,20:47,30:50,35:100"' /sc ONSTART /np /rl HIGHEST
One of the limitations involve not being able to change the start working directory, so some paths in the scripts might break. Overall, I do not recommend this approach on Windows, users should opt for the GUI method.
#### Linux (systemd / cronjob) - **Not ready**
##### Systemd timer
1. Create a service file
[Unit] Description=Unofficial NVIDIA Fan Control service ConditionUser=0
[Service] Type=simple WorkingDirectory=/usr/bin/User_NVIDIA_GPU_Control/ ExecStart=/usr/bin/python3 /usr/bin/User_NVIDIA_GPU_Control/nvml_gpu_control.py -n "RTX 3080" -sp 10:20,20:35:30:50,35:100 Restart=always KillSignal=SIGQUIT
[Install] WantedBy=multi-user.target
## Security considerations
### Windows
1. Having an admin prompt under the same desktop
An opened prompt under the same desktop can receive key command from non-privileged programs, allowing any program to escalate to admin. To mitigate this it is necessary to restrict all other programs with a UI limit JobObject, create the window under a new desktop or not create any windows on the desktop (this is how it is done under the guide).
2. Programs that start automatically as admin must be secured against writes
The scripts and the executables can only be written by admin users, otherwise, another program may overwrite them and gain admin rights on the machine. Please, verify the permissions set on the python executable and on the scripts (this also applies to the library nvidia-ml-py).
### Linux
1. Having an admin prompt under the same desktop (X11)
This is a similar risk to the Windows counterpart, especially on X11/Xorg. So, if you use X11, you must create a new session under a new TTY to create an admin window; but if you use Wayland, it already isolates windows by default.
2. Programs that start automatically as admin must be secured against writes
Same as Windows. All of the executables and scripts must be accessible only to the root user (UID 0).
## Roadmap (features to be added)
### Must have
- [x] Fan control
- [x] Select GPU by name
- [x] Display fan speed per controller
- [x] Control fan policy
- [x] Select GPU by UUID (allows users to control more than 1 GPU individually that shares the same model - e.g. 2 RTXs 4080)
- [ ] Run at startup with necessary permissions (Windows and Linux) - Windows already works
- [x] Power limit control
- [x] Temperature threshold control
- [x] Enable all controls
- [x] Help action must not require NVML initialization
### Can consider (nice to have)
- [ ] Logging to file option (with message size limit) -> user can spawn another instance with the same arguments and pass the `--dry-run` option as it should mirror the output of the privileged one
- [ ] Temperature curves (linear, quadratic, logarithmic...) -> might be unecessary as users can generate all speed points elsewhere and just pass it as arguments
## Support
I will be suporting this program as long as I have NVIDIA GPUs (especially bacause I am also dogfooding it). Don't expect new features as it has everything currently I need, but you can suggest new features that you think is useful (note that the focus is energy and temperature control to increase **stability**). You can expect however bug fixes from me so my project remains compatible with the latest versions of NVML.
If I loose the need for this software (aka change my hardware), I will make sure to update this notice.
## Contribute
Just a few guidelines and style decisions:
- variable_name
- function_name
- ObjectOrClassName
- Other dependencies are DISALLOWED, I want to limit the dependencies as a security measure (just remember the xz incident). You are free to try to convince me, but your contribution will most likely be rejected
- Code should be testable, so please include unit tests to your code. If you think that certain parts are just too hard to make tests, include a juntification