bigmb / Unet-Segmentation-Pytorch-Nest-of-Unets

Implementation of different kinds of Unet Models for Image Segmentation - Unet, RCNN-Unet, Attention Unet, RCNN-Attention Unet, Nested Unet
MIT License
1.87k stars 345 forks

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED #8

Closed. Yu-Yuanyuan closed this issue 5 years ago

Yu-Yuanyuan commented 5 years ago

I trained the model on my own images and got the error below. I'm a beginner; could you give me some suggestions?

pytorch==1.1.0 cuda==9.0 cudnn==7.6.0 python==3.6.8

================================================================
Total params: 34,527,041
Trainable params: 34,527,041
Non-trainable params: 0

Input size (MB): 0.19
Forward/backward pass size (MB): 307.38
Params size (MB): 131.71
Estimated Total Size (MB): 439.27

Successfully created the main directory './model'
Successfully created the prediction directory './model/pred' of dice loss
Successfully created the model directory './model/Unet_D_15_4'
/home/pcsk/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('/home/pcsk/myfile/haima/pytorch_run.py', wdir='/home/pcsk/myfile/haima')

  File "/home/pcsk/anaconda3/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "/home/pcsk/anaconda3/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/pcsk/myfile/haima/pytorch_run.py", line 270, in <module>
    lossT.backward()

  File "/home/pcsk/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)

  File "/home/pcsk/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

bigmb commented 5 years ago

There could be a number of reasons for the cuDNN error.
1) Is the input image large? What is its size? With a batch size of 4, is your GPU able to handle the data? (Watch nvidia-smi while the program is running.)
2) Did you upgrade CUDA or PyTorch recently? If so, you might have to reinstall them.
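
As a rough way to check the first point, the figures from the model summary posted above can be scaled to the batch size. This is only a sketch: it assumes the summary's input and forward/backward sizes are for a single sample, and the helper name `estimate_training_memory_mb` is made up for illustration. Treat the result as a lower bound, since gradients, optimizer state, cuDNN workspace and the CUDA context all add to it.

```python
def estimate_training_memory_mb(input_mb, fwd_bwd_mb, params_mb, batch_size):
    """Scale per-sample summary figures to a batch (hypothetical helper).

    Assumption: input_mb and fwd_bwd_mb describe one sample, so they grow
    linearly with the batch size, while the parameter size does not.
    """
    per_sample_mb = input_mb + fwd_bwd_mb
    return per_sample_mb * batch_size + params_mb


# Figures copied from the summary in this issue, with the batch size of 4
# mentioned in the reply.
estimate = estimate_training_memory_mb(
    input_mb=0.19, fwd_bwd_mb=307.38, params_mb=131.71, batch_size=4
)
print(f"Estimated lower bound: {estimate:.2f} MB")
```

If this number is close to the free memory that nvidia-smi reports while training, reducing the batch size or the input resolution is a reasonable first experiment.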