alexandrosstergiou / SoftPool

[ICCV 2021] Code for approximated exponential maximum pooling
MIT License

RuntimeError: CUDA error: an illegal memory access was encountered #27

Closed hmm0852 closed 3 years ago

hmm0852 commented 3 years ago

Traceback (most recent call last):
  File "/media/mm/62f65e0e-b396-44ac-b1aa-b2fb260e70d1/mm/experiment/DE-Resnet18/main.py", line 251, in <module>
    main(args, logger)
  File "/media/mm/62f65e0e-b396-44ac-b1aa-b2fb260e70d1/mm/experiment/DE-Resnet18/main.py", line 171, in main
    train(model, train_loader, optimizer, criterion, epoch, args, logger)
  File "/media/mm/62f65e0e-b396-44ac-b1aa-b2fb260e70d1/mm/experiment/DE-Resnet18/main.py", line 61, in train
    loss.backward()
  File "/home/mm/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mm/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/home/mm/anaconda3/lib/python3.7/site-packages/torch/autograd/function.py", line 89, in apply
    return self._forward_cls.backward(self, *args)  # type: ignore
  File "/media/mm/62f65e0e-b396-44ac-b1aa-b2fb260e70d1/mm/experiment/DE-Resnet18/softpool.py", line 45, in backward
    saved[-1][torch.isnan(saved[-1])] = 0

My environment info is:

GPU: Titan
CUDA: 10.1.243
cuDNN: 7.6.5
PyTorch: 1.7.1

alexandrosstergiou commented 3 years ago

Hi @hmm0852 ,

You can comment out the line saved[-1][torch.isnan(saved[-1])] = 0 in the backward function of softpool.py. It is mainly there to handle underflows, and it may throw an illegal memory access with some drivers.
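
For readers who would rather keep the NaN cleanup than remove it, one possible alternative is to replace the in-place boolean-mask assignment with an out-of-place torch.where call. The following is only a sketch: the helper name zero_nans and the sample tensor are hypothetical and do not come from the SoftPool code, and whether this avoids the illegal memory access on the affected drivers has not been verified here.

```python
import torch


def zero_nans(t: torch.Tensor) -> torch.Tensor:
    """Return a new tensor equal to `t` with NaN entries replaced by 0.

    Hypothetical helper, not part of SoftPool. It selects elementwise with
    torch.where instead of writing through a boolean mask in place.
    """
    return torch.where(torch.isnan(t), torch.zeros_like(t), t)


# Hypothetical stand-in for the tensor that softpool.py refers to as saved[-1].
grad = torch.tensor([1.0, float("nan"), -2.0, float("nan")])

cleaned = zero_nans(grad)
print(cleaned)
```

In the backward function this would mean assigning the result back, for example saved[-1] = zero_nans(saved[-1]), assuming saved is a mutable list at that point. torch.where with tensor arguments is available in PyTorch 1.7.1, the version reported above.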

Best, Alex

alexandrosstergiou commented 3 years ago

Will be closing this due to inactivity. Feel free to open a new issue in case of a different problem.

Best, Alex