Open wangq95 opened 4 years ago
One possibility is that fewer GPUs are visible/available than the number of GPUs you configured for training. For example:
In the run_local.sh file (in branch pytorch-1.1),
--nproc_per_node=4
specifies 4 GPUs for training,
but if you have
GPU_IDS=0,1
only 2 GPUs are made visible.
This mismatch may cause the error.
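A quick way to catch this kind of mismatch before launching is to compare the two values yourself. The snippet below is only a minimal sketch: the helper names `parse_gpu_ids` and `check_launch_config` are hypothetical (they are not part of the CCNet repo), and it assumes `GPU_IDS` is a plain comma-separated list of device indices as in the example above.

```python
def parse_gpu_ids(gpu_ids):
    """Split a comma-separated ID string such as "0,1" into ["0", "1"].

    Hypothetical helper: assumes GPU_IDS is a plain comma-separated list.
    """
    return [part.strip() for part in gpu_ids.split(",") if part.strip()]


def check_launch_config(gpu_ids, nproc_per_node):
    """Raise ValueError if more processes are requested than GPUs are visible."""
    visible = parse_gpu_ids(gpu_ids)
    if nproc_per_node > len(visible):
        raise ValueError(
            "nproc_per_node=%d but only %d GPU(s) are visible (GPU_IDS=%s)"
            % (nproc_per_node, len(visible), gpu_ids)
        )


if __name__ == "__main__":
    # Values taken from the example above: 4 processes, 2 visible GPUs.
    try:
        check_launch_config("0,1", 4)
    except ValueError as err:
        print("Config mismatch:", err)
```

If the check fails, either lower --nproc_per_node to match the number of visible GPUs or list more devices in GPU_IDS.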
Hi @speedinghzl, I tried to train on the Cityscapes dataset using PyTorch 0.4.0, but I got the following error:
Traceback (most recent call last):
File "train.py", line 253, in <module>
main()
File "train.py", line 217, in main
preds = model(images, args.recurrence)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 112, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/userhome/segmentation/CCNet/networks/ccnet.py", line 196, in forward
x = self.relu1(self.bn1(self.conv1(x)))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: CUDNN_STATUS_MAPPING_ERROR
Could you give me some advice? Thanks.