jacklul / nvml-scripts

Scripts to control NVIDIA GPUs using NVML API
MIT License

I can't run the script #2

Open dapenp opened 2 months ago

dapenp commented 2 months ago
sudo python3 nvml-undervolt.py --core-offset 100 --target-clock 1800 --transition-clock 1500 --power-limit 85
Detected NVIDIA GeForce RTX 3050 Laptop GPU (GPU-6c80462b-e73e-761a-aa8d-0ba50f66ef6f)
Warning: Persistence mode is already enabled - make sure no other script is controlling clocks
Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 265, in main
    nvmlDeviceSetPowerManagementLimit(handle, args.power_limit * 1000)
  File "/usr/lib/python3.12/site-packages/pynvml.py", line 3435, in nvmlDeviceSetPowerManagementLimit
    _nvmlCheckReturn(ret)
  File "/usr/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 435, in <module>
    main()
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 416, in main
    nvmlDeviceSetPowerManagementLimit(handle, nvmlDeviceGetPowerManagementDefaultLimit(handle))
  File "/usr/lib/python3.12/site-packages/pynvml.py", line 3435, in nvmlDeviceSetPowerManagementLimit
    _nvmlCheckReturn(ret)
  File "/usr/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported

Maybe I'm making some elementary mistake?

My OS is Arch Linux (kernel 6.10.5-arch1-1), the GPU is an RTX 3050 Laptop, and the NVIDIA driver version is 555.58.02.

jacklul commented 2 months ago

I recently noticed that the pynvml package does not work; you need NVIDIA's nvidia-ml-py. Did you install the first one?

dapenp commented 2 months ago

Both packages are installed: python-pynvml and python-nvidia-ml-py. Is it a problem that I installed them via AUR and not via pip?

jacklul commented 2 months ago

Try removing the pynvml one; it might be taking precedence.
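
One way to see which installation Python actually picks up is to ask where the pynvml import resolves to. The tracebacks in this thread show two different locations (.../site-packages/pynvml.py and .../site-packages/pynvml/nvml.py), so the resolved path hints at which package is in use. This is a minimal sketch using only the standard library; the helper name module_origin is made up for illustration.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints the resolved path for the top-level "pynvml" import,
    # or None when no such module is installed.
    print(module_origin("pynvml"))
```

Run it with the same interpreter the script uses (here, sudo python3), since a different interpreter may search different site-packages directories.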

dapenp commented 2 months ago

There seems to be no difference:

sudo python3 nvml-undervolt.py --core-offset 100 --target-clock 1800 --transition-clock 1500 --power-limit 85
[sudo] password for dapenop:
Detected NVIDIA GeForce RTX 3050 Laptop GPU (GPU-6c80462b-e73e-761a-aa8d-0ba50f66ef6f)
Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 265, in main
    nvmlDeviceSetPowerManagementLimit(handle, args.power_limit * 1000)
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 2791, in nvmlDeviceSetPowerManagementLimit
    _nvmlCheckReturn(ret)
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 833, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 435, in <module>
    main()
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 416, in main
    nvmlDeviceSetPowerManagementLimit(handle, nvmlDeviceGetPowerManagementDefaultLimit(handle))
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 2791, in nvmlDeviceSetPowerManagementLimit
    _nvmlCheckReturn(ret)
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 833, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported
jacklul commented 2 months ago

Try without the power limit argument.

I've updated the script, adding a few more try...except blocks so that it can continue.
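
The general pattern looks roughly like this. It is a minimal sketch and not the actual change made to nvml-undervolt.py: the helper try_nvml is hypothetical, and the stand-in NVMLError class exists only so the sketch runs when pynvml is not installed.

```python
try:
    from pynvml import NVMLError  # real exception base class, when available
except ImportError:
    class NVMLError(Exception):
        """Stand-in so the sketch runs without pynvml installed."""


def try_nvml(description, func, *args):
    """Call func(*args); on NVMLError print a warning and return False."""
    try:
        func(*args)
        return True
    except NVMLError as err:
        print(f"Warning: could not {description}: {err}")
        return False


# Hypothetical usage inside the script (handle and limit_mw are assumed names):
#   limit_set = try_nvml("set power limit",
#                        nvmlDeviceSetPowerManagementLimit, handle, limit_mw)
#   ...
#   if limit_set:  # only restore the default if the limit was actually changed
#       try_nvml("restore power limit", nvmlDeviceSetPowerManagementLimit,
#                handle, nvmlDeviceGetPowerManagementDefaultLimit(handle))
```

Tracking whether the first call succeeded matters here: in the tracebacks above, the cleanup path calls nvmlDeviceSetPowerManagementLimit again and fails with the same Not Supported error, which produces the second, chained traceback.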

dapenp commented 2 months ago
sudo python3 nvml-undervolt.py --core-offset 100 --target-clock 1800 --transition-clock 1500
[sudo] password for dapenop:
Detected NVIDIA GeForce RTX 3050 Laptop GPU (GPU-6c80462b-e73e-761a-aa8d-0ba50f66ef6f)
Warning: Persistence mode is already enabled - make sure no other script is controlling clocks
Running main loop (sleep = 0.5)...
Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 388, in main
    set_pstate_clocks(handle, NVML_CLOCK_GRAPHICS, 0, args.pstates)
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 143, in set_pstate_clocks
    struct = c_nvmlClockOffset_t()
             ^^^^^^^^^^^^^^^^^^^
NameError: name 'c_nvmlClockOffset_t' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 435, in <module>
    main()
  File "/home/dapenop/nvml-scripts/nvml-undervolt/nvml-undervolt.py", line 416, in main
    nvmlDeviceSetPowerManagementLimit(handle, nvmlDeviceGetPowerManagementDefaultLimit(handle))
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 2791, in nvmlDeviceSetPowerManagementLimit
    _nvmlCheckReturn(ret)
  File "/usr/lib/python3.12/site-packages/pynvml/nvml.py", line 833, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported
jacklul commented 2 months ago

This is quite weird. I just checked the AUR package source and it does include c_nvmlClockOffset_t.
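
To confirm what the module that Python actually imports exposes, the names can be tested directly. This is a minimal sketch; missing_names is a hypothetical helper, and the only name taken from this thread is c_nvmlClockOffset_t.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that the imported module does not define."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    try:
        # An empty list means the structure is available in the loaded module.
        print(missing_names("pynvml", ["c_nvmlClockOffset_t"]))
    except ImportError as err:
        print(f"pynvml is not importable: {err}")
```

If the list is not empty, the module being imported is probably not the one whose source was inspected.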

dapenp commented 2 months ago

Oh damn, I'm a fool. I didn't notice that I had removed python-nvidia-ml-py instead of python-pynvml. This time I removed the right package (python-pynvml) and installed python-nvidia-ml-py, and now everything seems to work.

dapenp commented 2 months ago

It seems --core-offset doesn't work: whatever value I set, I don't see any change in GPU voltage. Maybe I'm misunderstanding something?

dapenp commented 2 months ago

I tested this earlier in MSI Afterburner, where I got a stable 1950 MHz at 875 mV. That means I need an offset of roughly 180-200, but I don't see any changes.

jacklul commented 2 months ago

Enable verbose mode, put a load on the GPU, and show me what the log says.

dapenp commented 2 months ago

image

Do I understand correctly that you mean the log that the script itself prints?

jacklul commented 2 months ago

I honestly have no idea why the offset is not getting set. You might have to wait until I switch to Linux before I can investigate this further.

dapenp commented 2 months ago

I will look forward to it

jacklul commented 2 months ago

Does the nvidia-smi method work for you?

https://github.com/NVIDIA/open-gpu-kernel-modules/discussions/236#discussioncomment-3553564