yerfor / GeneFace

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
MIT License

Renderer Issue #96

Closed G-1nOnly closed 1 year ago

G-1nOnly commented 1 year ago

Hello! Thanks for such great work, and congratulations on being accepted to ICLR 2023.

I ran your code on a custom dataset and found that the lip motion does not match the audio as well as it does in the demo video. I visualized the landmarks and found that they do match the audio, so I think this is a renderer issue. I'm currently using the radnerf renderer. Do you have any suggestions for solving this?

Thanks in advance!

yerfor commented 1 year ago

Hi, how long is the video? It may take 3-5 minutes of video to learn a good renderer. By the way, a learning curve for your head_nerf may help me find the problem.

G-1nOnly commented 1 year ago

Thanks, the video length is over 4 minutes, and the learning curve is provided as follows.

[Three learning-curve images were attached here.]

Thanks in advance!

yerfor commented 1 year ago

Hi, it seems that the adversarial training of the postnet has failed (the discriminator's confidence on the generated samples is only around 0.3, whereas ideally it should be 0.5). I have two suggestions:

1) Try lm3d_vae_sync_pitch and lm3d_postnet_sync_pitch, because we found that pitch is a useful hint for the postnet to successfully perform the domain adaptation.

2) Reduce the number of layers in the discriminator, which may ease the adversarial training.
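
As a rough illustration of the 0.5 target mentioned above, the check below averages a discriminator's sigmoid outputs on generated samples and reports whether the adversarial game looks balanced. This is only a sketch and not GeneFace's own logging code: the function names, the assumption that the discriminator emits raw logits, and the `tolerance` of 0.1 are all hypothetical choices made for this example.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mean_confidence(logits):
    """Average probability of 'real' that the discriminator assigns.

    `logits` is assumed to be a non-empty list of raw discriminator
    outputs for generated (fake) samples.
    """
    probs = [_sigmoid(z) for z in logits]
    return sum(probs) / len(probs)


def adversarial_balance(fake_logits, target=0.5, tolerance=0.1):
    """Classify the state of adversarial training from fake-sample confidence.

    `target` and `tolerance` are illustrative values, not tuned thresholds.
    """
    conf = mean_confidence(fake_logits)
    if conf < target - tolerance:
        # The discriminator rejects generated samples too easily.
        status = "discriminator_winning"
    elif conf > target + tolerance:
        # The discriminator is being fooled most of the time.
        status = "generator_winning"
    else:
        status = "balanced"
    return conf, status
```

With a mean confidence around 0.3, as in the curves discussed here, this check would report `discriminator_winning`, which is the situation that a weaker discriminator (fewer layers) or a stronger conditioning signal (pitch) is meant to address.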

G-1nOnly commented 1 year ago

Thanks, I will try these modifications and see if they work.

yerfor commented 1 year ago

Hi, I noticed that you closed this issue as completed. Could you let me know whether your problem was solved, and which method helped?

G-1nOnly commented 1 year ago

Yes, the results improved after applying the first method. I'm still trying the second one to see if it improves things further. Thanks for the advice.