Hi there!
When running pgcn_test.py for inference, I encounter a CUDA error. Here is my stack trace:
model epoch 15 loss: 1.4765163376217796
File parsed. Time:4.10
Dict constructed. Time:4.39
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-2:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-3:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
0%| | 0/210 [00:00<?, ?it/s]THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-4:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
6%|██▍ | 13/210 [06:37<1:47:22, 32.70s/it]^CTraceback (most recent call last):
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 216, in <module>
rst = result_queue.get()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/queues.py", line 94, in get
res = self._recv_bytes()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
Process SpawnProcess-1:
6%|██▍ | 13/210 [06:44<1:42:07, 31.10s/it]
Process finished with exit code 1
I also checked whether CUDA is available, and the check returned True.
I do not know how to fix this error. Could anyone help?
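
One thing that may be worth checking, as far as I understand it: torch.cuda.is_available() returning True only means that at least one GPU is usable, whereas "invalid device ordinal" is raised when the index passed to torch.cuda.set_device is not below torch.cuda.device_count(). In the trace above, SpawnProcess-2 to SpawnProcess-4 fail while SpawnProcess-1 keeps running, which would be consistent with fewer GPUs being visible than the script requests. Below is a minimal sketch of that comparison. The helper names and the requested list [0, 1, 2, 3] are hypothetical and not taken from pgcn_test.py.

```python
def visible_device_count():
    """Return the number of CUDA devices PyTorch can see (0 if none or no PyTorch)."""
    try:
        import torch
    except ImportError:
        return 0
    if not torch.cuda.is_available():
        return 0
    return torch.cuda.device_count()


def split_gpu_ids(requested, device_count):
    """Split requested GPU indices into (valid, invalid) for a given device count."""
    valid = [g for g in requested if 0 <= g < device_count]
    invalid = [g for g in requested if not (0 <= g < device_count)]
    return valid, invalid


if __name__ == "__main__":
    # Hypothetical list of GPU ids a multi-process test script might request.
    requested = [0, 1, 2, 3]
    count = visible_device_count()
    valid, invalid = split_gpu_ids(requested, count)
    print("visible CUDA devices:", count)
    print("ids that set_device should accept:", valid)
    print("ids likely to raise 'invalid device ordinal':", invalid)
```

If the invalid list is not empty, restricting the ids the script uses to the valid ones (or checking whether CUDA_VISIBLE_DEVICES hides some GPUs) might be the place to start.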