I generated SyncNet weights by adding a ReLU layer at the end of the SyncNet model. I am now using those weights to train the Wav2Lip generator and discriminator networks. However, I am getting the following assertion error:
Evaluating for 300 steps
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [0,0,0], thread: [3,0,0] Assertion input_val >= zero && input_val <= one failed.
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [0,0,0], thread: [5,0,0] Assertion input_val >= zero && input_val <= one failed.
../aten/src/ATen/native/cuda/Loss.cu:92: operator(): block: [0,0,0], thread: [7,0,0] Assertion input_val >= zero && input_val <= one failed.
L1: 0.24381156265735626, Sync: 0.0, Percep: 0.7179633975028992 | Fake: 0.6689321398735046, Real: 0.7179633378982544: : 1it [00:07, 7.28s/it]
Traceback (most recent call last):
File "hq_wav2lip_train.py", line 442, in <module>
nepochs=hparams.nepochs)
File "hq_wav2lip_train.py", line 286, in train
average_sync_loss = eval_model(test_data_loader, global_step, device, model, disc)
File "hq_wav2lip_train.py", line 326, in eval_model
perceptual_loss = disc.perceptual_forward(g)
File "/home/akool/shahid/wav2lip_288x288/models/wav2lipv2.py", line 196, in perceptual_forward
false_feats = f(false_feats)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akool/shahid/wav2lip_288x288/models/conv2.py", line 30, in forward
out = self.conv_block(x)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/akool/anaconda3/envs/wav2lip2/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 460, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.
It looks like the error only happens during the evaluation stage; the rest of the training runs fine.
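For what it's worth, the check at `Loss.cu:92` appears to be the input-range assertion inside `nn.BCELoss`: its input must be probabilities in [0, 1]. A ReLU output is unbounded above, so my guess is that `perceptual_forward` is feeding the ReLU-terminated discriminator output straight into `BCELoss`, and any activation above 1 trips exactly this assertion. A minimal sketch of that hypothesis (on CPU the same check raises a plain RuntimeError instead of a device-side assert, which makes it easy to reproduce):

```python
import torch
import torch.nn as nn

# BCELoss requires its input (predicted probabilities) to lie in [0, 1].
bce = nn.BCELoss()
target = torch.ones(3)

ok = torch.sigmoid(torch.randn(3))        # sigmoid output is in (0, 1) -> fine
print(bce(ok, target))

bad = torch.tensor([0.5, 1.2, 0.3])       # 1.2 > 1, like a ReLU activation
try:
    bce(bad, target)
except RuntimeError as e:
    print(e)  # e.g. "all elements of input should be between 0 and 1"
```

If that is the cause, the value 1.2 above stands in for whatever the ReLU head actually produces; I haven't confirmed the actual activations yet.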
Can I get any help?