ZiqiaoPeng / SyncTalk

[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
https://ziqiaopeng.github.io/synctalk/

--asr_model Hubert? #42

Closed kike-0304 closed 3 months ago

kike-0304 commented 3 months ago
  1. I am retraining on the May data with HuBERT, and the mouth of the resulting character jitters rapidly. Does training with HuBERT require changing any other settings?
  2. Why is self.audio_in_dim = 27 for hubert, when it is 1024 in ER-NeRF?

         if 'esperanto' in self.opt.asr_model:
             self.audio_in_dim = 44
         elif 'deepspeech' in self.opt.asr_model:
             self.audio_in_dim = 29
         elif 'hubert' in self.opt.asr_model:
             self.audio_in_dim = 27
         else:
             self.audio_in_dim = 32
ZiqiaoPeng commented 3 months ago
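  1. Training with HuBERT works without problems on my side; please try again.
  2. Please change 27 to 1024. I had been trying out some new approaches earlier, which is why this part of the code was changed.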
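
For reference, the change suggested in the reply can be sketched as below. This is a minimal standalone illustration, not the repository's actual code: the function name `audio_in_dim_for` is hypothetical, and in the snippet quoted in the question the same branch sets `self.audio_in_dim` from `self.opt.asr_model`. Only the `hubert` value differs from the quoted snippet.

```python
def audio_in_dim_for(asr_model: str) -> int:
    """Hypothetical helper mirroring the branch quoted in the question.

    Returns the audio feature dimension for a given --asr_model string.
    """
    if 'esperanto' in asr_model:
        return 44
    elif 'deepspeech' in asr_model:
        return 29
    elif 'hubert' in asr_model:
        # Was 27 in the quoted snippet; 1024 per the maintainer's reply.
        return 1024
    else:
        return 32


if __name__ == "__main__":
    print(audio_in_dim_for('hubert'))
```

If the HuBERT features were extracted at 1024 dimensions but the network still expects 27, the mismatch would most likely show up in the audio input layer, so it is worth applying this change before retraining.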