Open franciscorubin opened 5 years ago
This might be a cudnn issue, especially if you're using cudnn 7.2. Try
>>> import torch
>>> torch.backends.cudnn.version()
Upgrading your cudnn version may fix it: https://github.com/NVIDIA/apex/issues/78#issuecomment-440301134
Container options are
docker pull pytorch/pytorch:nightly-devel-cuda10.0-cudnn7
(in which you can install Apex yourself with the usual git clone, python setup.py install --cuda_ext --cpp_ext).
I tried updating and unfortunately the error persists. The command you mentioned outputs 7401.
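For readers unsure how to interpret that number: the integer reported by torch.backends.cudnn.version() packs the cuDNN version into one value. The sketch below assumes the cuDNN 7.x scheme (major*1000 + minor*100 + patch); the helper name decode_cudnn_version is hypothetical and not part of PyTorch.

```python
def decode_cudnn_version(version):
    """Split a cuDNN 7.x-style integer into (major, minor, patch).

    Assumes the encoding major*1000 + minor*100 + patch.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


# 7401 is the value reported in this thread.
print(decode_cudnn_version(7401))  # (7, 4, 1)
```

Under that assumption, 7401 corresponds to cuDNN 7.4.1, so it is newer than the 7.2 release mentioned above.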
I'm having a similar issue:
`
    318     def forward(self, input):
    319         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 320                         self.padding, self.dilation, self.groups)
    321
    322
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED`
I'm running on Windows 10 using cuDNN 7.4.2 + CUDA 10. Are the others having this problem running on Windows or on Linux?
P.S.: I am using an NVIDIA TITAN RTX.
@njean78 I am running Linux Ubuntu 16.04, so it looks like the error is OS-independent.
I solved my issue by installing PyTorch for CUDA 10 (got it from https://pytorch.org/). I was probably using the one for CUDA 9...
> I tried updating and unfortunately the error persists. The command you mentioned outputs 7401.
@pancho111203 Since you've got cuda 10 on bare metal (meaning your system has the cuda 10 driver) you should be using Pytorch for cuda 10. When you say "I tried updating" do you mean you only updated cudnn, or did you try running in one of the cuda 10 containers I mentioned?
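One way to see which CUDA version the installed PyTorch binary was built against is torch.version.cuda, which is a string such as "10.0", or None for CPU-only builds. The following is a minimal sketch; the helper built_for_cuda_major is hypothetical, and the target major version of 10 is taken from this thread.

```python
def built_for_cuda_major(build_version, expected_major):
    """Return True if a CUDA version string like "10.0" has the expected major version."""
    if build_version is None:  # CPU-only build, or PyTorch not installed
        return False
    return int(build_version.split(".")[0]) == expected_major


try:
    import torch
    build = torch.version.cuda
except ImportError:
    build = None  # PyTorch is not installed in this environment

print("PyTorch CUDA build:", build)
print("Built for CUDA 10:", built_for_cuda_major(build, 10))
```

If this prints a 9.x build on a machine set up for CUDA 10, reinstalling the CUDA 10 build from https://pytorch.org/ is the fix another poster in this thread reported.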
If you are running PyTorch in Docker, you should know about this: https://github.com/NVIDIA/tacotron2/issues/109
Still having this problem
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
Can anyone give me some help? Thanks a lot!
Actually, this happens on GPU card 3, and it works fine on the other GPU cards.
I only use one GPU at a time.
@zhixuanli Which GPUs are you using and do you have a reproducible code snippet?
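As an illustration of the kind of snippet that could narrow this down (it is not the poster's code), the sketch below runs a small conv2d on each visible GPU and records which devices fail. The function name smoke_test_gpus and the tensor shapes are arbitrary choices, and it returns an empty dict when PyTorch or CUDA is unavailable.

```python
def smoke_test_gpus():
    """Run a tiny conv2d on every visible GPU; return {device_index: status}."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return {}  # PyTorch is not installed

    results = {}
    if not torch.cuda.is_available():
        return results

    for idx in range(torch.cuda.device_count()):
        device = torch.device("cuda", idx)
        try:
            x = torch.randn(1, 3, 32, 32, device=device)
            w = torch.randn(8, 3, 3, 3, device=device)
            F.conv2d(x, w, padding=1)
            # CUDA calls are asynchronous; synchronize so errors surface here.
            with torch.cuda.device(idx):
                torch.cuda.synchronize()
            results[idx] = "ok"
        except RuntimeError as exc:
            # Note: after a CUDA error, later devices may also report failures.
            results[idx] = "failed: {}".format(exc)
    return results


print(smoke_test_gpus())
```

If only one index reports a failure, that points toward a problem with that particular card or its configuration rather than with the model code.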
Was apex installed successfully?
Check the file path. It worked for me.
I get the following error every time I try to do a forward call with apex:
CUDNN logs: https://gist.github.com/pancho111203/3e91f0b46ab0be3b04f1edc9c1405684
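For anyone wanting to collect similar logs: cuDNN 7.1 and later support API logging through the CUDNN_LOGINFO_DBG and CUDNN_LOGDEST_DBG environment variables. The sketch below assumes that mechanism; it is not necessarily how the gist above was produced, and the file name cudnn_log.txt is a placeholder.

```python
import os

# These must be set before cuDNN is loaded, i.e. before importing torch.
os.environ["CUDNN_LOGINFO_DBG"] = "1"              # enable informational API logging
os.environ["CUDNN_LOGDEST_DBG"] = "cudnn_log.txt"  # placeholder path; "stdout" and "stderr" also work

# Import torch and run the failing forward call after this point.
print("cuDNN logging destination:", os.environ["CUDNN_LOGDEST_DBG"])
```

The same variables can be exported in the shell before launching the training script instead of being set from Python.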