Markfryazino / wav2lip-hq

Extension of Wav2Lip repository for processing high-quality videos.
534 stars, 236 forks

the result like this #2

Open 425776024 opened 3 years ago

425776024 commented 3 years ago

The result looks like this:

image

Why does it look so different from yours?

image

MAYBreath commented 3 years ago

me too.. my result is horrible..

raw: 22 after: 11

I am using your Google Colab demo.

Markfryazino commented 3 years ago

Why does it look so different from yours?

The quality can drop if the speech you use for inference differs a lot from the training data, which consisted of calm speech in Russian. Using another model can also help. For instance, the ESRGAN available via this link was finetuned on video of the particular person the model is being applied to. Using it instead of the default model provided in the Google Colab notebook may improve the quality.

Markfryazino commented 3 years ago

me too.. my result is horrible..

Unfortunately, as stated in the README, the training set didn't contain enough data, so the model does not generalize well. The videos in the training set looked different from the screenshot you shared: for instance, all of them had a white background, whereas the background of your photo is a different color. To obtain good results, please finetune the model.
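
A rough way to check whether a frame resembles the white-background training videos described above is to measure how much of its border is near-white. The following is only a minimal sketch and is not part of wav2lip-hq: the helper name `border_white_fraction`, the border width, and the threshold of 230 are all assumptions chosen for illustration, and the frame is represented as a plain list of rows of (R, G, B) tuples.

```python
# Hypothetical helper, not part of wav2lip-hq. Estimates how "white" the
# border of a frame is. A frame is a list of rows, each row a list of
# (R, G, B) tuples with channel values in 0..255.

def border_white_fraction(frame, border=2, threshold=230):
    """Return the fraction of border pixels whose channels are all >= threshold."""
    if not frame or not frame[0]:
        return 0.0
    height = len(frame)
    width = len(frame[0])
    white = 0
    total = 0
    for y in range(height):
        for x in range(width):
            on_border = (
                y < border or y >= height - border
                or x < border or x >= width - border
            )
            if not on_border:
                continue  # interior pixels (likely the face) are ignored
            total += 1
            if all(channel >= threshold for channel in frame[y][x]):
                white += 1
    return white / total


# Synthetic frames for illustration only.
white_frame = [[(250, 250, 250)] * 8 for _ in range(8)]
blue_frame = [[(40, 60, 200)] * 8 for _ in range(8)]

print(border_white_fraction(white_frame))
print(border_white_fraction(blue_frame))
```

A low fraction only suggests that the background differs from the training data; it says nothing about other sources of mismatch such as language or speaking style.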

MAYBreath commented 3 years ago

Thank you for the reply, I got it.

andyvha commented 3 years ago

I used a video with a white background, but the quality of the lip-synced clip is no better. I used English audio.

Screenshot 2021-07-20 12 28 31