NVlabs / imaginaire

NVIDIA's Deep Imagination Team's PyTorch Library

NVMLError(ret) pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument #168

Open GabrielZZZ opened 1 year ago

GabrielZZZ commented 1 year ago

Hi,

I am using Ubuntu 20.04 with an NVIDIA RTX 3090. When I follow the instructions to train the model, it always fails with this error:

    File "/opt/conda/lib/python3.8/site-packages/pynvml/nvml.py", line 366, in check_return
        raise NVMLError(ret)
    pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument

Does anyone know a possible solution? That would be very helpful.

Gienapp commented 1 year ago

Hi, I'm having the same problem. Did you come up with a solution?

GabrielZZZ commented 1 year ago

I am afraid not. It appears the forum is not very active.

Gienapp commented 1 year ago

For me, the solution was to change the value of nproc_per_node in the training command from 8 to the number of GPUs my server has.
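
In case it helps anyone else, here is a rough sketch of how that value could be picked automatically instead of hard-coding 8. It assumes `nvidia-smi` is on the PATH and that training is started through `python -m torch.distributed.launch ... train.py`; the helper names (`count_gpus`, `pick_nproc_per_node`, `build_launch_command`) and the `train_args` parameter are made up for illustration and are not part of imaginaire. My guess at the cause, which I have not verified against the code, is that each extra process asks NVML for a GPU index that does not exist.

```python
import shutil
import subprocess


def count_gpus():
    """Count visible GPUs by parsing `nvidia-smi -L` (one line per GPU).

    Returns 0 if nvidia-smi is missing or fails.
    """
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return 0
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def pick_nproc_per_node(requested, available):
    """Never launch more processes per node than there are GPUs."""
    if available < 1:
        raise ValueError("no GPUs detected")
    return min(requested, available)


def build_launch_command(nproc, train_args):
    """Assemble the launch command as an argument list (hypothetical helper).

    train_args holds whatever options you normally pass to train.py.
    """
    return [
        "python",
        "-m",
        "torch.distributed.launch",
        f"--nproc_per_node={nproc}",
        "train.py",
        *train_args,
    ]
```

On a single RTX 3090 machine, `pick_nproc_per_node(8, count_gpus())` would give 1, and the list returned by `build_launch_command` can be passed to `subprocess.run` together with your usual training arguments.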