NVlabs / imaginaire

NVIDIA's Deep Imagination Team's PyTorch Library

NVMLError(ret) pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument #168

Open GabrielZZZ opened 2 years ago

GabrielZZZ commented 2 years ago

Hi,

I am using Ubuntu 20.04 with an NVIDIA RTX 3090. When I follow the instructions to train the model, it always gives me this error:

    File "/opt/conda/lib/python3.8/site-packages/pynvml/nvml.py", line 366, in check_return
        raise NVMLError(ret)
    pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument

Does anyone know of a possible solution? That would be very helpful.

Gienapp commented 2 years ago

Hi, I'm having the same problem. Did you come up with a solution?

GabrielZZZ commented 2 years ago

I am afraid not. It appears the forum is not very active.

Gienapp commented 2 years ago

For me, the solution was to change nproc_per_node in the training command from 8 to the number of GPUs my server actually has.
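
In case it helps anyone else, here is a minimal sketch of that fix: count the GPUs the driver reports and build the launch command from that number instead of a hard-coded 8. My guess (not confirmed in this thread) is that the error appears when a worker process asks NVML for a GPU index that does not exist. The helper names `count_gpus` and `build_launch_command`, the `train.py --config` layout, and the config path are assumptions for illustration; adjust them to the command you are actually running.

```python
import shutil
import subprocess
import sys


def count_gpus():
    """Return the number of GPUs listed by `nvidia-smi -L`, or 0 if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, OSError):
        return 0
    # Each physical GPU is printed on a line that starts with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def build_launch_command(num_gpus, config_path):
    """Build a torch.distributed.launch command with one process per GPU.

    The `train.py --config <path>` part is an assumed script layout.
    """
    if num_gpus < 1:
        raise ValueError("need at least one GPU to launch training")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "train.py", "--config", config_path,
    ]


if __name__ == "__main__":
    gpus = count_gpus()
    if gpus == 0:
        print("No GPUs detected via nvidia-smi.")
    else:
        # "configs/example.yaml" is a hypothetical placeholder path.
        print(" ".join(build_launch_command(gpus, "configs/example.yaml")))
```

On a single RTX 3090 this should print a command with `--nproc_per_node=1`, which you can then run from the repository root.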