Syllo / nvtop

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

No GPU to monitor. Fedora 35 with RPM Fusion driver #128

Closed by czarekkwasny 2 years ago

czarekkwasny commented 2 years ago

Hello,

I followed the instructions in the README, compiled nvtop from source, and have the RPM Fusion driver installed. nvidia-smi outputs:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 495.44       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0  On |                  N/A |
|  0%   51C    P8    13W / 170W |    443MiB / 12053MiB |     34%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2459      G   /usr/libexec/Xorg                 238MiB |
|    0   N/A  N/A      2544      G   /usr/bin/gnome-shell               47MiB |
|    0   N/A  N/A      4835      G   /usr/lib64/firefox/firefox        118MiB |
+-----------------------------------------------------------------------------+

When I try to run nvtop the output is:

No GPU to monitor.

I tried following the description in a similar issue, but the libraries appear to be linked:

ldconfig -p | grep libnvidia-ml
    libnvidia-ml.so.1 (libc6,x86-64) => /lib64/libnvidia-ml.so.1
    libnvidia-ml.so.1 (libc6) => /lib/libnvidia-ml.so.1
    libnvidia-ml.so (libc6,x86-64) => /lib64/libnvidia-ml.so

However, nvtop does work with the NVIDIA driver downloaded from the NVIDIA website. Is there any way to make it run with the RPM Fusion driver? (I have issues with the driver from the NVIDIA website, so I need to stay with the RPM Fusion version.)

czarekkwasny commented 2 years ago

It turned out to be a library mismatch that resulted from the driver update. The situation was as follows:

lrwxrwxrwx. 1 root root      22 Nov  2 16:41 libnvidia-ml.so.1 -> libnvidia-ml.so.495.44
lrwxrwxrwx. 1 root root      25 Oct 12 13:21 libnvidia-ml.so -> libnvidia-ml.so.495.29.05
-rwxr-xr-x. 1 root root 1840344 Sep 30 17:50 libnvidia-ml.so.495.29.05
-rwxr-xr-x. 1 root root 1840344 Oct 22 08:05 libnvidia-ml.so.495.44

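So nvtop was confused when trying to load libnvidia-ml.so, which still pointed to the old library (495.29.05) instead of the one matching the installed driver (495.44). Once the link was updated to point to the correct file, the issue disappeared.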
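
For anyone hitting the same symptom, the check described above can be sketched in Python. This is only an illustration, not part of nvtop: the helper names (build_example, resolve_links, links_agree) are made up for this post, and the script recreates the listing above in a temporary directory rather than touching /lib64. To check a real system, pass the actual library directory to resolve_links instead.

```python
import os
import tempfile

# Hypothetical helpers for illustration; nvtop itself does nothing like this.
LINK_NAMES = ("libnvidia-ml.so", "libnvidia-ml.so.1")


def build_example(lib_dir):
    """Recreate the layout from the listing above inside lib_dir."""
    for version in ("495.29.05", "495.44"):
        # Empty placeholder files stand in for the real shared objects.
        open(os.path.join(lib_dir, "libnvidia-ml.so." + version), "w").close()
    # Stale development link left behind by the driver update.
    os.symlink("libnvidia-ml.so.495.29.05",
               os.path.join(lib_dir, "libnvidia-ml.so"))
    # Runtime link that was updated correctly.
    os.symlink("libnvidia-ml.so.495.44",
               os.path.join(lib_dir, "libnvidia-ml.so.1"))


def resolve_links(lib_dir, names=LINK_NAMES):
    """Map each name present in lib_dir to the file name it resolves to."""
    resolved = {}
    for name in names:
        path = os.path.join(lib_dir, name)
        if os.path.lexists(path):
            resolved[name] = os.path.basename(os.path.realpath(path))
    return resolved


def links_agree(resolved):
    """True when every link resolves to the same versioned file."""
    return len(set(resolved.values())) <= 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_example(tmp)
        found = resolve_links(tmp)
        for name, target in found.items():
            print(name, "->", target)
        if not links_agree(found):
            print("Mismatch: the links point to different driver versions.")
```

If the two names resolve to different versioned files, as in the listing above, the unversioned link is the likely culprit.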