I tried your model and found some problems. Here are the details:
1. When loading wav2lip.pth, a "Missing key(s) in state_dict" error occurred. I made some changes in inference.py and it went away. Could you please check whether this is all right?
def load_model(path):
    model = Wav2Lip()
    print("Load checkpoint from: {}".format(path))
    checkpoint = _load(path)
    s = checkpoint["state_dict"]
    # new_s = {}
    # for k, v in s.items():
    #     new_s[k.replace('module.', '')] = v
    # My change: load the raw checkpoint non-strictly instead of remapping keys.
    model.load_state_dict(checkpoint, strict=False)
    model = model.to(device)
    return model.eval()
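For comparison, the loader in the original Wav2Lip repo remaps the 'module.' prefix that nn.DataParallel prepends to parameter names and then loads strictly; as far as I can tell, calling load_state_dict on the raw checkpoint dict with strict=False matches no parameter names at all, so it may silently load nothing. A sketch of that style of loader (the Wav2Lip import path is assumed):

import torch
from models import Wav2Lip  # assumed import path

def load_model_strict(path, device):
    model = Wav2Lip()
    print("Load checkpoint from: {}".format(path))
    checkpoint = torch.load(path, map_location=device)
    # Strip the 'module.' prefix that nn.DataParallel adds during training.
    new_s = {k.replace('module.', ''): v
             for k, v in checkpoint["state_dict"].items()}
    model.load_state_dict(new_s)  # strict by default: a real mismatch still errors
    return model.to(device).eval()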
2. New error:
Using cuda for inference.
Reading video frames...
Number of frames available for inference: 128
(80, 377)
Length of mel chunks: 115
Recovering from OOM error; New batch size: 8 | 0/1 [00:00<?, ?it/s]
Load checkpoint from: checkpoints/wav2lip_gan.pth | 0/8 [00:00<?, ?it/s]
Model loaded####################################| 15/15 [00:23<00:00, 1.59s/it]
Traceback (most recent call last):
  File "inference.py", line 373, in <module>
    main()
  File "inference.py", line 354, in main
    pred = model(mel_batch, img_batch)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workspace/wav2lip_288x288-ckpt-mismatch/models/wav2lipv2.py", line 117, in forward
    x = f(x)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workspace/wav2lip_288x288-ckpt-mismatch/models/conv2.py", line 16, in forward
    out = self.conv_block(x)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (2 x 2). Kernel size: (3 x 3). Kernel size can't be greater than actual input size
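If I read the error right, the face crops reaching the encoder are smaller than the 288x288 this model seems to expect (the original Wav2Lip resizes crops to 96x96), so the stride-2 stages shrink the feature map to 2x2 before an unpadded 3x3 conv. A toy sketch (not the repo's real architecture) that reproduces the same RuntimeError:

import torch
import torch.nn as nn

# Toy encoder sized for 288x288 input: six stride-2 convs take
# 288 -> 144 -> 72 -> 36 -> 18 -> 9 -> 5, then an unpadded 3x3 conv.
layers = [nn.Conv2d(3, 32, 3, stride=2, padding=1)]
layers += [nn.Conv2d(32, 32, 3, stride=2, padding=1) for _ in range(5)]
layers += [nn.Conv2d(32, 32, 3, padding=0)]  # needs at least a 3x3 map
encoder = nn.Sequential(*layers)

encoder(torch.randn(1, 3, 288, 288))  # OK: 5x5 map going into the last conv
encoder(torch.randn(1, 3, 96, 96))    # RuntimeError: padded input (2 x 2), kernel (3 x 3)

So if my reading is correct, resizing the detected face crops to 288 before batching (rather than the original 96) should make the shapes line up.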
Thanks for sharing!
Do you have any suggestions? Thanks in advance.