Open GabrielZZZ opened 2 years ago
Hi, I'm having the same problem. Did you come up with a solution?
I am afraid not. It appears the forum is not very active.
For me, the solution was to change nproc_per_node in the training command from 8 to the number of GPUs my server actually has.
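In case it helps anyone else, here is a minimal sketch of how the GPU count could be detected and passed to the launcher instead of hard-coding 8. It assumes the training command goes through `python -m torch.distributed.launch` and that `nvidia-smi` is on the PATH. The script name `train.py` and both helper functions are hypothetical, not taken from the repository. A likely reason the fix works is that a process launched with a local rank beyond the last GPU asks NVML for a device index that does not exist, but I have not confirmed that against this codebase.

```python
import shutil
import subprocess


def count_gpus() -> int:
    """Count NVIDIA GPUs by parsing `nvidia-smi -L`; return 0 if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
        )
    except OSError:
        return 0
    if result.returncode != 0:
        return 0
    # Each physical GPU is listed on its own line starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def launch_args(train_script: str, num_gpus: int) -> list:
    """Build a single-node launch command (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("no GPUs detected; nproc_per_node must be at least 1")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # one process per available GPU
        train_script,  # assumed script name; substitute the real one
    ]


if __name__ == "__main__":
    n = count_gpus()
    print(f"Detected {n} GPU(s)")
    if n > 0:
        print(" ".join(launch_args("train.py", n)))
```

On a single RTX 3090 machine this should produce a command with `--nproc_per_node=1`. Any other arguments the original training command takes would still need to be appended.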
Hi,
I am using Ubuntu 20.04 with an NVIDIA RTX 3090. When I follow the instructions to train the model, I always get this error:
  File "/opt/conda/lib/python3.8/site-packages/pynvml/nvml.py", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_InvalidArgument: Invalid Argument
Does anyone know of a possible solution? Any help would be appreciated.