RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

wangxiaoshuai223 commented 4 years ago

Sorry to trouble you. When I run python train_meta.py cfg/metayolo.data cfg/darknet_dynamic.cfg cfg/reweighting_net.cfg darknet19_448.conv.23, a RuntimeError occurred:

Traceback (most recent call last):
  File "train_meta.py", line 325, in <module>
    train(epoch)
  File "train_meta.py", line 218, in train
    output = model(data, metax, mask)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 71, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/m/Fewshot_Detection-master/darknet_meta.py", line 199, in forward
    dynamic_weights = self.meta_forward(metax, mask)
  File "/home/m/Fewshot_Detection-master/darknet_meta.py", line 122, in meta_forward
    metax = model(metax)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
  File "/home/m/.local/lib/python2.7/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

whsun21 commented 4 years ago

Same here. When I set torch.backends.cudnn.enabled = False at the top of train_meta.py, I get a different error at the same place in the code, shown below. Have you solved this? If so, could you give me some advice? Thanks a lot.

  File "/home/sun/projects/Fewshot_Detection/darknet_meta.py", line 199, in forward
    dynamic_weights = self.meta_forward(metax, mask)
  File "/home/sun/projects/Fewshot_Detection/darknet_meta.py", line 122, in meta_forward
    metax = model(metax)
  File "/home/sun/anaconda3/envs/FR/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sun/anaconda3/envs/FR/lib/python2.7/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/sun/anaconda3/envs/FR/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sun/anaconda3/envs/FR/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
  File "/home/sun/anaconda3/envs/FR/lib/python2.7/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1518238441757/work/torch/lib/THC/THCBlas.cu:247
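For anyone trying the same thing, here is roughly what I mean by turning cuDNN off at the top of the script. This is only a sketch: the helper name disable_cudnn is made up for illustration and is not part of train_meta.py, and the import is guarded so the snippet also runs where PyTorch is not installed.

```python
import importlib


def disable_cudnn():
    """Turn cuDNN off if PyTorch is importable.

    Hypothetical helper, not from the repo. Returns True if the flag
    was set, False if torch could not be imported.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return False
    # Same flag as in the comment above; convolutions then use
    # PyTorch's non-cuDNN code path instead.
    torch.backends.cudnn.enabled = False
    return True


# Call this before the model is built or moved to the GPU.
cudnn_disabled = disable_cudnn()
print("cuDNN disabled:", cudnn_disabled)
```

As the traceback shows, this only changes which library reports the failure (cuBLAS instead of cuDNN), so it does not fix the problem by itself.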

linsongxue commented 3 years ago

Reduce the batch size in the .cfg file.
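A minimal sketch of that change, assuming the .cfg is a darknet-style file with a batch= key in its [net] section (I have not said which of the .cfg files this is, so check yours). The function name set_batch_size and the sample values are made up for illustration.

```python
import re


def set_batch_size(cfg_text, new_batch):
    """Return cfg_text with the batch value in the [net] section replaced.

    Hypothetical helper; assumes darknet-style "[section]" headers and
    "key=value" lines.
    """
    out = []
    section = None
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which section we are in.
            section = stripped[1:-1].strip().lower()
        elif section == "net" and re.match(r"batch\s*=", stripped):
            # Matches "batch=..." only, not "batch_normalize=...".
            line = "batch={}".format(new_batch)
        out.append(line)
    return "\n".join(out)


# Illustrative sample, not the real contents of any file in the repo.
sample = """[net]
batch=64
subdivisions=8

[convolutional]
batch_normalize=1
filters=32
"""

result = set_batch_size(sample, 16)
print(result)
```

You can of course just edit the batch= line by hand; the point is only that the value to lower is the one under [net], and that batch_normalize lines should be left alone.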

bingykang / Fewshot_Detection
