RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

aliwaqas333 commented 1 year ago

Namespace(batch_size=4, beta1=0.0, beta2=0.99, checkpoint_dir='expr/checkpoints', ds_iter=100000, eval_dir='expr/eval', eval_every=50000, f_lr=1e-06, hidden_dim=512, img_size=256, inp_dir='assets/representative/custom/crack', lambda_cyc=1, lambda_ds=1, lambda_reg=1, lambda_sty=1, latent_dim=16, lm_path='expr/checkpoints/celeba_lm_mean.npz', lr=0.0001, mode='train', num_domains=2, num_outs_per_domain=10, num_workers=4, out_dir='assets/representative/ourset/src/del', print_every=1, randcrop_prob=0.5, ref_dir='assets/representative/ourset/ref', result_dir='expr/results', resume_iter=0, sample_dir='expr/samples', sample_every=5000, save_every=10000, seed=777, src_dir='assets/representative/ourset/src', style_dim=64, total_iters=100000, train_img_dir='data/ourset/train', val_batch_size=32, val_img_dir='data/ourset/val', w_hpf=1, weight_decay=0.0001, wing_path='expr/checkpoints/wing.ckpt')
Device: cuda
Number of parameters of generator: 43467395
Number of parameters of mapping_network: 2438272
Number of parameters of style_encoder: 20916928
Number of parameters of discriminator: 20852290
Number of parameters of fan: 6333603
Initializing generator...
Initializing mapping_network...
Initializing style_encoder...
Initializing discriminator...
Preparing DataLoader to fetch source images during the training phase...
Preparing DataLoader to fetch reference images during the training phase...
Preparing DataLoader for the generation phase...
Start training...
/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/functional.py:2506: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
Traceback (most recent call last):
  File "main.py", line 184, in <module>
    main(args)
  File "main.py", line 61, in main
    solver.train(loaders)
  File "/mnt/d/<username>/core/solver.py", line 107, in train
    masks = nets.fan.get_heatmap(x_real) if args.w_hpf > 0 else None
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/mnt/d/<username>/core/wing.py", line 253, in get_heatmap
    outputs, _ = self(x_01)
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/<username>/core/wing.py", line 226, in forward
    x, _ = self.conv1(x)
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/<username>/core/wing.py", line 147, in forward
    ret = self.conv(ret)
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/user/miniconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

aliwaqas333 commented 1 year ago

Installing PyTorch with CUDA 11.6 in a fresh environment solved the problem.

aimeelina commented 1 year ago

Hello, I ran into the same error. I tried conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge, but it turned out to be incompatible with python=3.6.7. Which versions of Python and PyTorch are you using?
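One way to answer the version question is to print what the active environment actually contains. The following is a minimal sketch, assuming only the standard PyTorch attributes torch.__version__, torch.version.cuda and torch.backends.cudnn.version(); the function name describe_environment is made up for illustration and is not part of stargan-v2.

```python
import importlib.util
import sys


def describe_environment():
    """Collect the Python, PyTorch, CUDA and cuDNN versions of this environment."""
    info = {
        "python": sys.version.split()[0],
        "torch": None,
        "cuda_build": None,
        "cudnn": None,
        "gpu": None,
    }
    # Leave the PyTorch fields as None if PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return info
    import torch

    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    # cuDNN version as an integer, or None if cuDNN is unavailable.
    info["cudnn"] = torch.backends.cudnn.version()
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

Posting this output alongside the traceback makes it easier to compare a failing environment with a working one.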

yangjunzhenshishuai commented 1 year ago

Hello, I have the same error too. Have you solved it? If so, could you tell me how you did it? Thank you @aimeelina

aimeelina commented 1 year ago

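I just checked my environment: python=3.6.7, pytorch=1.4.0, so the install command I used at the time should have been conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch. My CUDA version is 10.1. I think the key is that the cudatoolkit version has to match the CUDA version of your own graphics card. Because this code uses a very old Python version, I looked on the PyTorch website for the install command of the lowest PyTorch version that matches my CUDA version.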
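After reinstalling, a quick check can show whether cuDNN convolutions run at all before starting a full training job. This is only a sketch under assumptions: the tensor shapes are arbitrary, the helper name cudnn_conv_smoke_test is hypothetical, and a passing result does not guarantee that the stargan-v2 training loop will succeed.

```python
import importlib.util


def cudnn_conv_smoke_test():
    """Run one small convolution on the GPU and report 'ok', 'skipped' or 'failed'."""
    if importlib.util.find_spec("torch") is None:
        return "skipped: torch is not installed"
    import torch
    import torch.nn.functional as F

    if not torch.cuda.is_available():
        return "skipped: CUDA is not available"
    try:
        # Arbitrary small input and 3x3 kernel, both placed on the GPU.
        x = torch.randn(1, 3, 64, 64, device="cuda")
        w = torch.randn(8, 3, 3, 3, device="cuda")
        y = F.conv2d(x, w, padding=1)
        # Wait for the kernel to finish so asynchronous errors surface here.
        torch.cuda.synchronize()
    except RuntimeError as err:
        return "failed: {}".format(err)
    return "ok: output shape {}".format(tuple(y.shape))


if __name__ == "__main__":
    print(cudnn_conv_smoke_test())
```

If this reports a failure with the same cuDNN error, the problem probably lies in the PyTorch, cudatoolkit and driver combination rather than in the stargan-v2 code.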

clovaai / stargan-v2

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED #145