Sxjdwang / TalkLip


Severe Blur in the mouth area #36

Open Nyquist0 opened 7 months ago

Nyquist0 commented 7 months ago

Dear Sir or Madam,

Thanks for making this project open source. I appreciate it.

However, I cannot get a reasonable result. Most of the time there is severe blur in the mouth area, as the following video shows.

https://github.com/Sxjdwang/TalkLip/assets/43435441/455d800b-31b2-40d5-9570-7e1793e7f101

My guess is that this happens because only one reference identity frame is given as input. That frame shows either an open mouth or a closed mouth, never both. Within a single generation pass, the network therefore cannot see the identity features of the face in both states, which leads to heavy blur.

Please correct me if I am wrong.

Sxjdwang commented 7 months ago

Could you provide the video and audio you employed to generate this video?

Nyquist0 commented 7 months ago

Sure. Attached zip includes source video, source audio, and generated video.

Thanks for helping to check this.

Archive.zip

Nyquist0 commented 6 months ago

Hi @Sxjdwang, I have tested more samples, but the results were poor as well. I suspect this is caused by the gap between the training dataset and my test data, which is in-the-wild.

Would you mind giving me some advice on reducing that gap? For example, does the resolution of the face area matter (although I assume you resize the cropped, detected face region anyway)? For reference, my test video is at 25 fps and the audio sample rate is 16 kHz.
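To rule out a plain format mismatch, the inputs can be re-encoded to the 25 fps and 16 kHz values mentioned above before running inference. Below is a minimal sketch that only builds the `ffmpeg` commands. It assumes `ffmpeg` is installed and on the PATH; the helper names (`video_cmd`, `audio_cmd`, `run`) and the file names are hypothetical and not part of TalkLip, and the mono downmix is my own assumption rather than a documented requirement.

```python
import subprocess

TARGET_FPS = 25              # frame rate mentioned in this thread
TARGET_SAMPLE_RATE = 16000   # audio sample rate mentioned in this thread (Hz)


def video_cmd(src, dst, fps=TARGET_FPS):
    """Build an ffmpeg command that re-encodes `src` to `fps` and drops the audio track."""
    return ["ffmpeg", "-y", "-i", src, "-r", str(fps), "-an", dst]


def audio_cmd(src, dst, sample_rate=TARGET_SAMPLE_RATE):
    """Build an ffmpeg command that extracts audio from `src` at `sample_rate`.

    The "-ac 1" mono downmix is an assumption, not a confirmed requirement.
    """
    return ["ffmpeg", "-y", "-i", src, "-vn",
            "-ar", str(sample_rate), "-ac", "1", dst]


def run(cmd):
    """Execute a command list and raise if ffmpeg exits with an error."""
    subprocess.run(cmd, check=True)


# Example usage (hypothetical file names):
# run(video_cmd("source.mp4", "source_25fps.mp4"))
# run(audio_cmd("speech.mp3", "speech_16k.wav"))
```

This only normalizes the container-level properties. It does not address differences in face resolution, pose, or lighting between in-the-wild footage and the training data.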