anothermartz / Easy-Wav2Lip

Colab for making Wav2Lip high quality and easy to use
513 stars, 76 forks

How to keep silent #52

Open NewNoviceChen opened 3 months ago

NewNoviceChen commented 3 months ago

While synthesizing a video, I found that if the person in the original video is speaking, the generated mouth movements do not match, whereas if the person is silent the result is much better. How can I get the mouth shape to match when the person in the original video is speaking? Alternatively, is there a way to directly generate a silent video, one with no mouth movement? (I found that using Wav2Lip to silence the mouth does not work well.) Thank you very much.

anothermartz commented 3 months ago

I'm having some difficulty understanding exactly what you mean. If you don't natively speak English, could you use a large language model such as ChatGPT/Copilot/Gemini to translate from your native language?

I believe you're talking about how the mouth movements in the original video are hard to mask/suppress. Wav2Lip does this better than Wav2Lip_GAN, but it is still pretty bad.

I have had an idea: take the first frame of the mouth/chin and lock its pose while still tracking it onto the face, slowly transitioning to another frame using optical flow blending. But I really doubt this could look convincing at all, and it would be a lot of work to get working in the first place.
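For illustration only, here is a minimal sketch of what a flow-guided blend between two mouth crops could look like. It assumes a dense flow field from frame A to frame B has already been estimated elsewhere (for example with OpenCV's `calcOpticalFlowFarneback`). The function names `warp_with_flow` and `flow_blend` are hypothetical, and none of this is part of Easy-Wav2Lip; it uses nearest-neighbour sampling to stay short.

```python
import numpy as np


def warp_with_flow(image, flow):
    """Sample `image` at (x + dx, y + dy) for every output pixel (x, y).

    `flow` has shape (H, W, 2) holding (dx, dy). Nearest-neighbour sampling
    with edge clamping keeps the sketch short.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(ys + flow[..., 1]), 0, h - 1).astype(int)
    return image[src_y, src_x]


def flow_blend(frame_a, frame_b, flow_ab, t):
    """Approximate the frame a fraction `t` (0..1) of the way from A to B.

    `flow_ab` is the dense flow from A to B (assumed to be given). Both
    frames are warped towards the intermediate position, then cross-faded.
    """
    warped_a = warp_with_flow(frame_a, -t * flow_ab)
    warped_b = warp_with_flow(frame_b, (1.0 - t) * flow_ab)
    return (1.0 - t) * warped_a + t * warped_b


# Toy example: a single bright pixel that moves 2 px to the right.
a = np.zeros((5, 7))
a[2, 2] = 1.0
b = np.zeros((5, 7))
b[2, 4] = 1.0
flow = np.zeros((5, 7, 2))
flow[..., 0] = 2.0  # dx = 2, dy = 0 everywhere

mid = flow_blend(a, b, flow, 0.5)
```

In this toy case the halfway frame has one full-brightness pixel at x = 3, whereas a plain cross-fade would leave two half-bright pixels at x = 2 and x = 4. That ghosting is what the flow step is meant to avoid. Whether it would hold up on real mouth footage is a separate question.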

There's another lip-sync project that seems to suppress the original mouth movements much better here:

https://github.com/natlamir/DINet

I may look into making this into an Easy-DINet project, then perhaps combining the two into one GUI where you can choose between them or even layer them together, using DINet to suppress the original movements and then Wav2Lip to apply an accurate lipsync. But this will take a long time.

NewNoviceChen commented 3 months ago

Thank you for your response. My English isn't very good, but I feel like you've understood my meaning. May I ask another question? I'm considering training DINet and using it to generate a video without mouth movement, then using that video for synthesis with Wav2Lip. Would this approach be putting the cart before the horse?

skeletonNN commented 2 months ago

Can you solve it?