Open gdet opened 4 years ago
Hello,
I followed the steps in your article and installed PyTorch with CUDA like this:
pip3 install torch torchvision
I have Python 3.7, torch 1.1.0, and Ubuntu 18.04. When I try to run this command:
python -m torch.distributed.launch --nproc_per_node=8 ./train.py
I get this error
WARNING:./train.py:Running process 2
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1573049306803/work/torch/csrc/cuda/Module.cpp line=37 error=101 : invalid device ordinal
Traceback (most recent call last):
  File "./train.py", line 267, in <module>
    train()
  File "./train.py", line 147, in train
    torch.cuda.set_device(args.local_rank)
  File "/home/hatzimin/.conda/envs/maria_env/lib/python3.7/site-packages/torch/cuda/__init__.py", line 300, in set_device
    torch._C._cuda_setDevice(device)
I searched for the error but haven't managed to find a solution. If I run python ./train.py on its own, I get no error.
Thank you
How many GPUs do you have on your machine? You need nproc_per_node = the number of GPUs on your machine.
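To make this failure easier to diagnose, a script can compare its local rank against the number of GPUs PyTorch can see before calling torch.cuda.set_device. The following is only a sketch: check_local_rank and pick_device are hypothetical helper names, not part of train.py or of PyTorch; the only PyTorch calls used are torch.cuda.device_count() and torch.cuda.set_device().

```python
def check_local_rank(local_rank: int, visible_gpus: int) -> None:
    """Raise a readable error if local_rank has no matching GPU.

    Hypothetical helper: valid ranks are 0 .. visible_gpus - 1.
    """
    if not 0 <= local_rank < visible_gpus:
        raise ValueError(
            f"local_rank {local_rank} has no matching GPU: only "
            f"{visible_gpus} GPU(s) visible, so --nproc_per_node "
            f"must not exceed that number"
        )


def pick_device(local_rank: int) -> None:
    """Validate local_rank, then bind this process to that GPU."""
    # Imported here so check_local_rank stays usable without PyTorch installed.
    import torch

    check_local_rank(local_rank, torch.cuda.device_count())
    torch.cuda.set_device(local_rank)
```

Calling pick_device(args.local_rank) in place of the bare torch.cuda.set_device(args.local_rank) call would turn "invalid device ordinal" into a message that names the mismatch directly.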
I have four. I had changed the number from 8 to 4, but one of the GPUs was already in use, so I got this error. Thank you!
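When some GPUs are busy, one common approach is to expose only the free ones through the CUDA_VISIBLE_DEVICES environment variable and launch one process per exposed GPU. The sketch below assumes this setup; build_launch is a hypothetical helper, and the GPU ids in the usage comment are illustrative, since the thread does not say which GPU was in use.

```python
import os
import sys


def build_launch(free_gpu_ids, script="./train.py"):
    """Return (env, cmd) for launching one process per free GPU.

    Hypothetical helper. CUDA renumbers the devices listed in
    CUDA_VISIBLE_DEVICES from 0, so local ranks 0 .. n-1 stay valid.
    """
    if not free_gpu_ids:
        raise ValueError("no free GPUs to launch on")
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in free_gpu_ids)
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(free_gpu_ids)}", script,
    ]
    return env, cmd


# Usage (illustrative ids, assuming GPU 0 is the busy one):
#   import subprocess
#   env, cmd = build_launch([1, 2, 3])
#   subprocess.run(cmd, env=env, check=True)
```

Deriving nproc_per_node from the same list that sets CUDA_VISIBLE_DEVICES keeps the two values from drifting apart.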