otoky closed this 9 months ago
"Invalid device ordinal" means, I think, that the script is trying to select a GPU index that does not exist on the machine. GPU indices start at 0, so a machine with a single GPU only accepts 0. Try changing device_num to 0.
Otherwise there might be an issue with the torch install.
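For anyone hitting the same error, here is a minimal sketch of how the requested index could be checked against the number of visible GPUs before calling torch.cuda.set_device. The helper names pick_device_num and set_valid_device are hypothetical and are not part of train-saturn.py. Falling back to index 0 is an assumption that suits a single-GPU machine.

```python
def pick_device_num(requested: int, device_count: int) -> int:
    """Return a usable CUDA device index (hypothetical helper, not from SATURN).

    Valid indices are 0 .. device_count - 1. Anything outside that range
    falls back to 0, which is an assumption made for single-GPU machines.
    """
    if device_count <= 0:
        raise RuntimeError("No CUDA devices are visible to this process")
    if 0 <= requested < device_count:
        return requested
    return 0


def set_valid_device(requested: int) -> int:
    """Select a valid GPU with PyTorch and return the index actually used."""
    import torch  # imported here so the helper above works without torch

    # device_count() reports how many GPUs this process can see.
    count = torch.cuda.device_count() if torch.cuda.is_available() else 0
    print("torch version:", torch.__version__, "| visible GPUs:", count)

    device_num = pick_device_num(requested, count)
    torch.cuda.set_device(device_num)
    return device_num
```

Calling set_valid_device(1) on a machine with one GPU would select index 0 instead of raising "invalid device ordinal". If it reports zero visible GPUs, the problem is more likely the torch install or the runtime type than the device number.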
I think I got it to work, thanks!
Hi! I followed the tutorial up to the train-saturn section. I am using Google Colab, have pip-installed all the dependencies, and am running on a GPU-enabled virtual machine (!pip install torch==1.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html).
When I run it, I get this error:
  File "/gdrive/MyDrive/SATURN/files/train-saturn.py", line 1050, in
    torch.cuda.set_device(args.device_num)
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 404, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I'm not very well versed in this stuff, but is there an issue with the code recognizing which GPU to use?