primepake / wav2lip_288x288

MIT License
524 stars 135 forks

Inference Not working #103

Open shahidmuneer opened 6 months ago

shahidmuneer commented 6 months ago

I have trained the model with hq_wav2lip_sam_train.py, but I am having issues with inference. The problem comes from the encoder-decoder network of the wav2lip_sam class. The error log is below:

```
size before torch.Size([128, 1024, 1, 1]) torch.Size([128, 1024, 3, 3])
size before torch.Size([128, 1024, 5, 5]) torch.Size([128, 1024, 5, 5])
size before torch.Size([128, 1024, 10, 10]) torch.Size([128, 512, 9, 9])
Error of size is in sam feat torch.Size([128, 1024, 10, 10]) torch.Size([128, 512, 9, 9])
  0%|          | 0/2 [00:26<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 280, in <module>
    main()
  File "inference.py", line 263, in main
    pred = model(mel_batch, img_batch)
  File "/home/akool/anaconda3/envs/Wav2Lip288/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akool/shahid/wav2lip_288x288/models/sam.py", line 201, in forward
    raise e
  File "/home/akool/shahid/wav2lip_288x288/models/sam.py", line 194, in forward
    x = self.sam(feats[-1], x)
  File "/home/akool/anaconda3/envs/Wav2Lip288/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/akool/shahid/wav2lip_288x288/models/sam.py", line 36, in forward
    out = sesp_att + se
RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 3
```

Has anyone found a quick fix for this error? It would save time on inference.
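For context, a minimal PyTorch sketch (not the repo's actual wav2lip_sam architecture) shows why 288 triggers the 10-vs-9 mismatch while 384 does not: repeated stride-2 halving of 288 gives 144, 72, 36, 18, 9, 5, and a transposed conv that doubles the 5x5 stage produces 10x10, which cannot be added to the 9x9 skip tensor, whereas 384 halves cleanly to 192, 96, 48, 24, 12, 6.

```python
import torch
import torch.nn as nn

# Encoder block halves the spatial size; decoder block doubles it.
down = nn.Conv2d(8, 8, kernel_size=3, stride=2, padding=1)
up = nn.ConvTranspose2d(8, 8, kernel_size=3, stride=2,
                        padding=1, output_padding=1)

for size in (288, 384):
    x = torch.randn(1, 8, size, size)
    feats = []
    for _ in range(6):            # downsample six times
        x = down(x)
        feats.append(x)
    y = up(feats[-1])             # decoder doubles the deepest feature map
    skip = feats[-2]              # skip connection from one encoder level up
    status = "ok" if y.shape[-1] == skip.shape[-1] else "MISMATCH"
    print(f"img_size={size}: decoder {y.shape[-1]} vs skip {skip.shape[-1]} -> {status}")
```

Running this prints a 10-vs-9 mismatch for 288 (exactly the sizes in the traceback) and a clean match for 384.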

shahidmuneer commented 6 months ago

Changing `args.img_size = 288` to `args.img_size = 384` in inference.py resolved the above issue.
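For anyone else hitting this, it is a one-line change; a sketch of the relevant spot in inference.py (the exact line may sit elsewhere in your checkout):

```python
# inference.py -- the face crop size must match the resolution the
# checkpoint was trained at (384 for this SAM model, not 288):
args.img_size = 384  # was: args.img_size = 288
```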

ghost commented 6 months ago

any demo?

shahidmuneer commented 6 months ago

I am still training wav2lip_sam and have not reached an optimal state yet. The model is in training; as soon as I have a demo, I will share it with you. However, in some intermediate testing, I have seen that the generated output is much more realistic and of higher quality.

Thanks for the amazing stuff.

Mrkomiljon commented 6 months ago

> I am still training wav2lip_sam and have not reached an optimal state yet. The model is in training; as soon as I have a demo, I will share it with you. However, in some intermediate testing, I have seen that the generated output is much more realistic and of higher quality.
>
> Thanks for the amazing stuff.

Hi bro, can you explain the training steps?

see2run commented 1 month ago

> I am still training wav2lip_sam and have not reached an optimal state yet. The model is in training; as soon as I have a demo, I will share it with you. However, in some intermediate testing, I have seen that the generated output is much more realistic and of higher quality.
>
> Thanks for the amazing stuff.

Hey, can you help me with this? #143, please.

see2run commented 1 month ago

> Changing `args.img_size = 288` to `args.img_size = 384` in inference.py resolved the above issue.

I have done this, but the error still occurs.
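If the mismatch persists after that change, one way to locate the offending stage is to log both operands just before the failing addition; a hedged debugging sketch (not code from the repo, though the variable names come from the traceback above), placed in models/sam.py:

```python
# models/sam.py, immediately before the line the traceback points at:
print("sesp_att:", sesp_att.shape, "se:", se.shape)  # compare spatial dims
out = sesp_att + se
```

It is also worth confirming that the checkpoint being loaded was actually trained at the resolution now set in `args.img_size`.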

piwawa commented 3 weeks ago

How many steps did you use to train the wav2lip model? How was the result?

I recently used this project for training, but the syncnet loss stayed at 0.69. I then used the unconverged syncnet model directly to train the wav2lip 384 model for 460k steps. No loss decreased except Generator/l1_loss/train.
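A loss pinned at 0.69 is essentially ln 2 (about 0.693), the binary cross-entropy of a classifier that always outputs 0.5, which means the sync discriminator learned nothing. A quick check:

```python
import math
import torch
import torch.nn.functional as F

# BCE of a classifier that always predicts 0.5 equals ln 2, whatever the labels.
pred = torch.full((100,), 0.5)
target = (torch.arange(100) % 2).float()  # balanced 0/1 labels
print(F.binary_cross_entropy(pred, target).item())  # ~0.6931
print(math.log(2))                                  # 0.6931...
```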

Here is the complete log:

[Screenshot: TensorBoard loss curves, captured 2024-06-16]