yerfor / GeneFace

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
MIT License

Audio - Lip Misalignment #234

Open Qifeng-Wu99 opened 7 months ago

Qifeng-Wu99 commented 7 months ago

Appreciate the fantastic job done by the authors.

To reproduce the result in the demo released by the authors, I trained the sync net, audio-to-motion generator, postnet, and RAD-NeRF from scratch, using the hyperparameters released with the code.

However, the lip alignment with the audio in my result is not satisfactory compared to the authors' result.

Is there a trick for tuning or refining the hyperparameters to achieve better results?

Thanks in advance.


yerfor commented 7 months ago

Hi Qifeng, I suspect the problem is the choice of postnet checkpoint. You may want to refer to this doc and this figure. Also, we plan to release GeneFace++ in Feb. 2024, which should handle the challenge of hand-picking the postnet checkpoint well.

Theweekfoolish229 commented 6 months ago

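Hello, have you found a solution to this alignment problem? When I trained the syncnet, audio2motion generator, postnet, and RAD-NeRF on other videos, I also found that the lip shapes produced by the trained model are misaligned. I am wondering whether, when extracting features with HuBERT, a different HuBERT model should be used for each language.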
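
One way to try the per-language idea is sketched below. It is only an illustration: the helper `pick_hubert_checkpoint` and the language table are hypothetical, and the Hugging Face checkpoint names are assumptions, not settings confirmed by the GeneFace authors. Any model that consumes the audio features (syncnet, audio2motion, postnet) would presumably need to be retrained on features from the new extractor.

```python
# Hypothetical lookup from a language code to a HuBERT checkpoint name.
# The checkpoint names are assumptions for illustration only; they are
# not confirmed as the ones GeneFace uses or recommends.
HUBERT_BY_LANG = {
    "en": "facebook/hubert-large-ls960-ft",
    "zh": "TencentGameMate/chinese-hubert-large",
}
DEFAULT_LANG = "en"


def pick_hubert_checkpoint(lang: str) -> str:
    """Return the checkpoint name for a language code such as 'zh' or 'zh-CN'.

    Unknown languages fall back to the English checkpoint.
    """
    # Normalize "zh_CN" / "ZH-cn" / " zh " to the primary subtag "zh".
    key = lang.strip().lower().replace("_", "-").split("-")[0]
    return HUBERT_BY_LANG.get(key, HUBERT_BY_LANG[DEFAULT_LANG])


if __name__ == "__main__":
    for code in ("en", "zh-CN", "fr"):
        print(code, "->", pick_hubert_checkpoint(code))
```

The returned name could then be passed to whatever feature-extraction script is in use. Note that different HuBERT checkpoints may differ in feature dimension, so that should be checked before retraining.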