Dreaming-Codes / nvidia_oc

A simple command line tool to overclock Nvidia GPUs using the NVML library on Linux. This supports both X11 and Wayland.
https://crates.io/crates/nvidia_oc
MIT License
12 stars, 0 forks

Error on run #2

Closed by pallebone 3 weeks ago

pallebone commented 1 month ago

I can alter the memory offset, but if I try to alter the frequency offset I get an error:

sudo ./nvidia_oc set --index 0 --freq-offset 272 --mem-offset 1000
thread 'main' panicked at src/main.rs:73:26:
Failed to set GPU frequency offset: "Error code: 6"
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

I tried contacting you on Discord, but could not because we are not friends there.

pallebone commented 1 month ago

sudo RUST_BACKTRACE=1 ./nvidia_oc set --index 0 --freq-offset 272 --mem-offset 1000
thread 'main' panicked at src/main.rs:73:26:
Failed to set GPU frequency offset: "Error code: 6"
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: nvidia_oc::main
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.

pallebone commented 1 month ago

sudo RUST_BACKTRACE=full ./nvidia_oc set --index 0 --freq-offset 272 --mem-offset 1000
thread 'main' panicked at src/main.rs:73:26:
Failed to set GPU frequency offset: "Error code: 6"
stack backtrace:
   0:     0x55fd14612d05 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1e1a1972118942ad
   1:     0x55fd1463434b - core::fmt::write::hc090a2ffd6b28c4a
   2:     0x55fd14610cff - std::io::Write::write_fmt::h8898bac6ff039a23
   3:     0x55fd14612ade - std::sys_common::backtrace::print::ha96650907276675e
   4:     0x55fd14613d99 - std::panicking::default_hook::{{closure}}::h215c2a0a8346e0e0
   5:     0x55fd14613add - std::panicking::default_hook::h207342be97478370
   6:     0x55fd14614233 - std::panicking::rust_panic_with_hook::hac8bdceee1e4fe2c
   7:     0x55fd14614114 - std::panicking::begin_panic_handler::{{closure}}::h00d785e82757ce3c
   8:     0x55fd146131c9 - std::sys_common::backtrace::__rust_end_short_backtrace::h1628d957bcd06996
   9:     0x55fd14613e47 - rust_begin_unwind
  10:     0x55fd14546df3 - core::panicking::panic_fmt::hdc63834ffaaefae5
  11:     0x55fd14547206 - core::result::unwrap_failed::h82b551e0ff2b2176
  12:     0x55fd1455b80b - nvidia_oc::main::h5d23fbf1455d3ebc
  13:     0x55fd1456acd3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h62cf2716c9e91ff3
  14:     0x55fd1456b949 - std::rt::lang_start::{{closure}}::h84d874c721ca05fb
  15:     0x55fd1460c110 - std::rt::lang_start_internal::h3ed4fe7b2f419135
  16:     0x55fd1455e045 - main
  17:     0x7f007ab02c8a - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  18:     0x7f007ab02d45 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:360:3
  19:     0x55fd145473e5 - _start
  20:                0x0 - <unknown>
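
The `core::result::unwrap_failed` frame directly above `nvidia_oc::main` suggests the failure reached an `unwrap` or `expect` call on a `Result`, which is why a driver error shows up as a panic with a backtrace. As a rough sketch only, the same failure could be reported as a single line. Here `set_frequency_offset` and `report` are hypothetical stand-ins invented for illustration, not the project's actual functions, and the `String` error type is an assumption based on the quoted `"Error code: 6"` in the panic message.

```rust
// Hypothetical stand-in for the NVML call; it always fails the same way
// as the log above so the example can run without a GPU.
fn set_frequency_offset(_index: u32, _offset_mhz: i32) -> Result<(), String> {
    Err("Error code: 6".to_string())
}

/// Converts a failed result into a one-line message instead of panicking.
/// Returns `None` when the call succeeded.
fn report(result: Result<(), String>) -> Option<String> {
    result
        .err()
        .map(|e| format!("Failed to set GPU frequency offset: {e}"))
}

fn main() {
    // A real tool would likely also exit with a non-zero status here.
    if let Some(message) = report(set_frequency_offset(0, 272)) {
        eprintln!("{message}");
    }
}
```

The underlying error is unchanged by this; only the way it is presented differs.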

Dreaming-Codes commented 1 month ago

Hi @pallebone,

According to the NVML documentation, the error code you encountered is described as follows:

NVML_ERROR_NOT_FOUND = 6: A query to find an object was unsuccessful.

This indicates that the issue originates from NVIDIA's side, not from my application. Therefore, my ability to resolve it is limited. However, if you can provide the exact model of your GPU, I can conduct further research to identify any potential workarounds.
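
For reference, the numeric code in the panic message can be translated to its NVML name with a small lookup. This is a minimal sketch, assuming the values of the `nvmlReturn_t` enum from `nvml.h`; `nvml_error_name` is a hypothetical helper written for this thread, not part of nvidia_oc, and it lists only a subset of the codes.

```rust
/// Hypothetical helper: maps a raw NVML return code to its symbolic name.
/// Only a subset of the `nvmlReturn_t` values is listed here.
fn nvml_error_name(code: u32) -> &'static str {
    match code {
        0 => "NVML_SUCCESS",
        1 => "NVML_ERROR_UNINITIALIZED",
        2 => "NVML_ERROR_INVALID_ARGUMENT",
        3 => "NVML_ERROR_NOT_SUPPORTED",
        4 => "NVML_ERROR_NO_PERMISSION",
        6 => "NVML_ERROR_NOT_FOUND",
        9 => "NVML_ERROR_DRIVER_NOT_LOADED",
        13 => "NVML_ERROR_FUNCTION_NOT_FOUND",
        15 => "NVML_ERROR_GPU_IS_LOST",
        999 => "NVML_ERROR_UNKNOWN",
        _ => "unlisted NVML code",
    }
}

fn main() {
    let code = 6;
    println!("Error code {code} is {}", nvml_error_name(code));
}
```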

pallebone commented 1 month ago

GeForce GTX 1050

Strange, under Xorg with GreenWithEnvy it overclocks fine. I thought I would try Wayland and your script.

Dreaming-Codes commented 1 month ago

> GeForce GTX 1050
>
> Strange, under Xorg with GreenWithEnvy it overclocks fine. I thought I would try Wayland and your script.

I'm not certain, but I will check the NVML API for alternative methods. There seem to be some legacy methods that might work with your GPU.

Please note that GreenWithEnvy uses a completely different method, which is only possible under Xorg due to the special integration with the NVIDIA driver.

Dreaming-Codes commented 1 month ago

@pallebone are you using the open source nvidia drivers?

Dreaming-Codes commented 1 month ago

By the way, the error you're encountering is quite unusual. The NVIDIA NVML documentation does not list that error as a possible result.

pallebone commented 1 month ago

> @pallebone are you using the open source nvidia drivers?

No, I am using NVIDIA's driver 535.183.01, installed via apt-get install nvidia-driver.

Dreaming-Codes commented 1 month ago

> > @pallebone are you using the open source nvidia drivers?
>
> No, I am using NVIDIA's driver 535.183.01, installed via apt-get install nvidia-driver.

That's quite an old version. On my system, I have 555.58.02. That could be the issue. Unfortunately, I can't test that version since I'm using Arch, and I don't want to deal with the dependency hell to install such an outdated version.

pallebone commented 1 month ago

OK, I will look into updating and let you know.

pallebone commented 1 month ago

I checked, and the current drivers on NVIDIA's website are:

Linux x86_64/AMD64/EM64T
Latest Production Branch Version: 550.100
Latest New Feature Branch Version: 555.58.02
Latest Beta Version: 560.28.03

The version you are using is from the New Feature Branch. I may have to wait a little while for these drivers to be included in Debian Trixie. The next driver coming down, in 2 weeks, is 545.23.06-1.

I imagine the one after that will be version 555 or newer, but that is a month away. The driver I am currently using is only 1 month old. I will wait for the newer drivers to arrive and retry.

This guide used 545.29.02, so in 1 month I should have the same or a newer version to test with: https://www.reddit.com/r/linux_gaming/comments/17kx3vu/how_to_crudely_overclock_your_nvidia_gpu_on/

pallebone commented 1 month ago

Just FYI, I tried the next version to come down and got the same error, so I will wait for one more release and try again:

aragorn@Aragorn:~$ sudo RUST_BACKTRACE=full ./nvidia_oc set --index 0 --freq-offset 272 --mem-offset 1000
[sudo] password for aragorn: 
thread 'main' panicked at src/main.rs:73:26:
Failed to set GPU frequency offset: "Error code: 6"
stack backtrace:
   0:     0x558c003e3d05 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1e1a1972118942ad
   1:     0x558c0040534b - core::fmt::write::hc090a2ffd6b28c4a
   2:     0x558c003e1cff - std::io::Write::write_fmt::h8898bac6ff039a23
   3:     0x558c003e3ade - std::sys_common::backtrace::print::ha96650907276675e
   4:     0x558c003e4d99 - std::panicking::default_hook::{{closure}}::h215c2a0a8346e0e0
   5:     0x558c003e4add - std::panicking::default_hook::h207342be97478370
   6:     0x558c003e5233 - std::panicking::rust_panic_with_hook::hac8bdceee1e4fe2c
   7:     0x558c003e5114 - std::panicking::begin_panic_handler::{{closure}}::h00d785e82757ce3c
   8:     0x558c003e41c9 - std::sys_common::backtrace::__rust_end_short_backtrace::h1628d957bcd06996
   9:     0x558c003e4e47 - rust_begin_unwind
  10:     0x558c00317df3 - core::panicking::panic_fmt::hdc63834ffaaefae5
  11:     0x558c00318206 - core::result::unwrap_failed::h82b551e0ff2b2176
  12:     0x558c0032c80b - nvidia_oc::main::h5d23fbf1455d3ebc
  13:     0x558c0033bcd3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h62cf2716c9e91ff3
  14:     0x558c0033c949 - std::rt::lang_start::{{closure}}::h84d874c721ca05fb
  15:     0x558c003dd110 - std::rt::lang_start_internal::h3ed4fe7b2f419135
  16:     0x558c0032f045 - main
  17:     0x7fc213fccc8a - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  18:     0x7fc213fccd45 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:360:3
  19:     0x558c003183e5 - _start
  20:                0x0 - <unknown>
aragorn@Aragorn:~$ nvidia-smi 
Wed Jul 31 16:04:52 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06              Driver Version: 545.23.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050        On  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8              N/A / ERR! |      5MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      4115      G   /usr/bin/gnome-shell                          1MiB |
+---------------------------------------------------------------------------------------+
aragorn@Aragorn:~$ 

pallebone commented 3 weeks ago

Upgraded to the 560 driver and it resolved the issue.
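
Since the failure tracked the driver version (535.183.01 and 545.23.06 failed, the 560 driver worked), a tool could compare the installed version against a minimum before attempting the offset. The sketch below assumes version strings of the dotted form printed by nvidia-smi. `MIN_DRIVER` is a hypothetical threshold set to 555.58.02, the version the maintainer reported running; the thread does not establish the exact cutoff. The helper names are invented for illustration.

```rust
/// Hypothetical threshold: the version the maintainer reported running.
/// The exact minimum working version is not established in this thread.
const MIN_DRIVER: &str = "555.58.02";

/// Parses a dotted driver version such as "535.183.01" into [535, 183, 1].
/// Returns `None` if any component is not a non-negative integer.
fn parse_driver_version(s: &str) -> Option<Vec<u32>> {
    s.trim().split('.').map(|part| part.parse::<u32>().ok()).collect()
}

/// Compares two versions component by component (lexicographic order).
fn is_at_least(version: &str, minimum: &str) -> Option<bool> {
    let v = parse_driver_version(version)?;
    let m = parse_driver_version(minimum)?;
    Some(v >= m)
}

fn main() {
    // Versions mentioned in this thread.
    for version in ["535.183.01", "545.23.06", "560.28.03"] {
        match is_at_least(version, MIN_DRIVER) {
            Some(true) => println!("{version}: at or above {MIN_DRIVER}"),
            Some(false) => println!("{version}: older than {MIN_DRIVER}"),
            None => println!("{version}: could not be parsed"),
        }
    }
}
```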