gpuopenanalytics / pynvml

Provide Python access to the NVML library for GPU diagnostics
BSD 3-Clause "New" or "Revised" License

Virtual GPU has brand number 10, which is not on the list #38

Closed. danielbraun89 closed this issue 2 years ago

danielbraun89 commented 3 years ago

My results for nvidia-smi --query-gpu=name --format=csv are:

name
GRID V100DX-16Q

When running nvidia_smi.getInstance().DeviceQuery(), I get:

Traceback (most recent call last):
  File "/home/e161081/git/comps/qalgdlinfra/QAlgDLInfra/sandbox/daniel/allegro_examples/allegro_related_scripts/gpu_debug.py", line 5, in <module>
    a = nvidia_smi.getInstance().DeviceQuery()
  File "/home/e161081/git/comps/qalgdlinfra/venv/lib/python3.8/site-packages/pynvml/smi.py", line 1886, in DeviceQuery
    brandName = NVSMI_BRAND_NAMES[nvmlDeviceGetBrand(handle)]
KeyError: 10

qwertAsc commented 3 years ago

I am also seeing this error after updating the driver to version 460.73.01.

Riebart commented 3 years ago

I can confirm that this is also an issue with RTX 3080 Mobile GPUs and with A100 MIG partitions, across a variety of drivers. Tested through the official NVIDIA RAPIDS container images.

dhruvdcoder commented 3 years ago

This happens with the RTX 8000 as well:

>>> all_data = nvsmi.DeviceQuery()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".local/lib/python3.8/site-packages/pynvml/smi.py", line 2041, in DeviceQuery
    brandName = NVSMI_BRAND_NAMES[nvmlDeviceGetBrand(handle)]
KeyError: 12

kenhester commented 2 years ago

NVSMI_BRAND_NAMES has been updated to include additional brand names.
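
The tracebacks above all come from indexing a dictionary with a brand ID it does not contain. As a minimal sketch of the general pattern (not the actual change made in pynvml), a lookup with a fallback avoids the KeyError when a driver reports an ID the table does not yet know. The table KNOWN_BRAND_NAMES and the helper lookup_brand_name are hypothetical names for illustration, and the IDs and names in the table are assumptions, not copied from the library.

```python
# Illustrative subset of a brand-ID-to-name table. The IDs and names here
# are assumptions for the example, not copied from pynvml.
KNOWN_BRAND_NAMES = {
    0: "Unknown",
    1: "Quadro",
    2: "Tesla",
    5: "GeForce",
}


def lookup_brand_name(brand_id, table=KNOWN_BRAND_NAMES):
    """Return the brand name, or a placeholder if the ID is not in the table."""
    # dict.get returns the fallback instead of raising KeyError,
    # which is what table[brand_id] does for an unlisted ID.
    return table.get(brand_id, f"Unknown ({brand_id})")


print(lookup_brand_name(5))   # an ID present in the table
print(lookup_brand_name(10))  # an ID missing from the table, as in this issue
```

With a fallback like this, a query on a newer GPU would report a placeholder brand rather than abort. Keeping the table itself up to date, as the maintainers did, is still the proper fix.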