numz / sd-wav2lip-uhq

Wav2Lip UHQ extension for Automatic1111
Apache License 2.0

Mouth movement unnatural, dark patches/shadows appear around lips #56

Open trxsudo opened 10 months ago

trxsudo commented 10 months ago

image

The dark spots/shadows are more noticeable when the video is in motion.

What parameters should I adjust to improve this? I'm using the default settings:

image

numz commented 10 months ago

Try setting face erode to something like 40 and mask blur to 30, and use GFPGAN. Let me know how it goes.

trxsudo commented 10 months ago

I just applied your suggested parameters, and the result is a bit off: the teeth seem crooked and overcrowded. I'm not sure whether that is because the mask isn't covering the teeth in the original video. Here is the result video: https://github.com/numz/sd-wav2lip-uhq/assets/76946041/0a87d151-20e6-4ace-a68d-1bb20b7da032

numz commented 10 months ago

Activate debug and look in the debug folder to check the mask composition. See whether the mask missed something, or whether the erode or mask blur values are not good enough.

trxsudo commented 10 months ago

e944c89cb509df6e627d0a27c1f8f97

a7a7ca413fd41b7c5ced07fd0fc4dff

The upper one is the "restored face video" and the bottom one is the "generated video". The "restored face video" has better-looking eyes overall; its only problem is the rectangular shape that appears around the mouth, which also cuts off the tip of the chin a bit. The "generated video" looks closer to the original video, but the dark patches and shadows around the mouth make it unpleasant and unnatural. I would prefer the "restored face video" without the rectangular shape.

Here are the "restored face video" and the "generated video" for a clearer view of what I mean:

https://github.com/numz/sd-wav2lip-uhq/assets/76946041/132b08c1-184f-41aa-beae-36cb29e34f50

https://github.com/numz/sd-wav2lip-uhq/assets/76946041/9136d7b9-1c07-46b8-a403-fc3be14bcf91

Here are the settings I'm using now: image

numz commented 10 months ago

You can try the "only mouth" option, with "mouth mask dilate" at something like 30 and "mask blur" between 30 and 60. Let me know how it goes.
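
For readers wondering why dilating and blurring the mask can reduce a visible rectangle or dark edge, here is a rough 1-D sketch of the general feathering idea. It is not the extension's actual code: the function names, pixel values, and radii below are all hypothetical, and the extension's own "mask blur" and "mouth mask dilate" settings may be implemented differently.

```python
# Illustrative 1-D sketch of mask feathering. NOT the extension's code;
# all names and numbers here are made up for the example.

def dilate(mask, radius):
    """Grow the mask: each sample becomes the max of its neighbourhood."""
    n = len(mask)
    return [max(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]

def box_blur(mask, radius):
    """Soften the mask edges: each sample becomes the mean of its neighbourhood."""
    n = len(mask)
    out = []
    for i in range(n):
        window = mask[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out

def blend(original, generated, mask):
    """Per-sample alpha blend: mask=1 takes the generated value, mask=0 the original."""
    return [m * g + (1 - m) * o for o, g, m in zip(original, generated, mask)]

def max_step(row):
    """Largest brightness jump between neighbouring samples (a proxy for a visible seam)."""
    return max(abs(a - b) for a, b in zip(row, row[1:]))

# Hypothetical row of pixel brightness values crossing the mouth region.
original = [200.0] * 40
generated = [120.0] * 40  # assume the generated patch is darker than the source frame
hard_mask = [1.0 if 15 <= i < 25 else 0.0 for i in range(40)]

# Hard-edged mask: the brightness difference shows up as an abrupt seam.
hard = blend(original, generated, hard_mask)

# Dilate first so the blur does not eat into the mouth area, then feather the edge.
soft_mask = box_blur(dilate(hard_mask, 2), 3)
soft = blend(original, generated, soft_mask)

print("largest jump with hard mask:", max_step(hard))
print("largest jump with feathered mask:", max_step(soft))
```

With the hard mask, the whole brightness difference lands on a single pixel boundary. With the dilated and blurred mask, the same difference is spread over several samples, which is the effect that larger blur values aim for in the composited frame.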