run-ai / genv

GPU environment and cluster management with LLM support
https://www.genv.dev
GNU Affero General Public License v3.0

New GPU Addition Not Showing #58

Status: Closed (anindyamaiti closed this 3 months ago)

anindyamaiti commented 3 months ago

I just added a fourth GPU to my desktop. genv devices does not show the new index, and I can't attach to it with --index 3.

I tried pip uninstall followed by pip install (latest release), but that didn't help. Any suggestions would be much appreciated.

(base) anindya@SGPUW2:~$ nvidia-smi
Tue Mar 19 15:33:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:04:00.0 Off |                  N/A |
|  0%   34C    P8              2W /  165W |      19MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4080        Off |   00000000:17:00.0 Off |                  N/A |
|  0%   42C    P8             19W /  320W |       9MiB /  16376MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:65:00.0 Off |                  N/A |
|  0%   35C    P8              3W /  165W |       8MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 8000                Off |   00000000:B3:00.0 Off |                  Off |
| 33%   31C    P8              6W /  260W |       5MiB /  49152MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2737      G   /usr/lib/xorg/Xorg                             14MiB |
|    1   N/A  N/A      2737      G   /usr/lib/xorg/Xorg                              4MiB |
|    2   N/A  N/A      2737      G   /usr/lib/xorg/Xorg                              4MiB |
|    3   N/A  N/A      2737      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+
(base) anindya@SGPUW2:~$ genv devices
ID      ENV ID      ENV NAME        ATTACHED
0
1
2
(base) anindya@SGPUW2:~$

razrotenberg commented 3 months ago

Hey @anindyamaiti, thanks for reaching out!

Can you try resetting the Genv state with:

genv devices --reset
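
If the same mismatch shows up after future hardware changes, a rough check along these lines might help. It is only a sketch under stated assumptions: it assumes genv devices prints one header line followed by one row per device (as in the output above) and that nvidia-smi is on the PATH. The helper names count_genv_devices and needs_reset are made up for illustration and are not part of Genv.

```shell
#!/usr/bin/env bash
# Sketch: compare the driver's GPU count with Genv's device list and
# reset Genv's device state only when the two disagree.

# Hypothetical helper: reads `genv devices` output on stdin, skips the
# header line, and counts the rows that start with a device index.
count_genv_devices() {
  awk 'NR > 1 && /^[0-9]+/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: succeeds (exit 0) when the two counts differ.
# $1 = GPUs reported by the driver, $2 = devices known to Genv
needs_reset() {
  [ "$1" -ne "$2" ]
}

# Only run the check when both tools are actually installed.
if command -v nvidia-smi >/dev/null 2>&1 && command -v genv >/dev/null 2>&1; then
  driver_count=$(nvidia-smi --query-gpu=index --format=csv,noheader | wc -l | tr -d ' ')
  genv_count=$(genv devices | count_genv_devices)
  if needs_reset "$driver_count" "$genv_count"; then
    echo "Driver reports $driver_count GPUs, Genv lists $genv_count; resetting Genv device state."
    genv devices --reset
  fi
fi
```

On the machine above this would see 4 GPUs from the driver and 3 rows from genv devices, and so would run the reset.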

Let me know if it worked for you!

anindyamaiti commented 3 months ago

That worked beautifully. Thank you for the quick response! 😊

razrotenberg commented 3 months ago

Sure thing!😀