Training with longer videos

Fictionarry / TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

https://fictionarry.github.io/TalkingGaussian/


Training with longer videos #49

Closed Turlan closed 5 days ago

Turlan commented 6 days ago

Hi, thanks for releasing the code! I want to use it for an academic comparison, but I ran into a problem when following your scripts. Suppose I have 1~2 hours of video clips segmented from a long video. Most preprocessing steps are easy to handle, but the last step, 3DMM tracking, is too time-consuming to be practical at this scale. Any suggestions?

Turlan commented 6 days ago

BTW, if the training frames are not consecutive (i.e., they come from several different clips), that would not be a problem, right?

Fictionarry commented 6 days ago

Hi, I think hours of training video would not bring obvious benefits over tens of minutes, as the latter is already sufficient to cover most phonemes. In our private tests, 20 minutes is fully enough for most portraits. It is OK to use multiple clips.
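
For readers who want to act on this advice, one way to trim a long recording down to roughly 20 minutes before running 3DMM tracking is to keep clips until a duration budget is reached. The sketch below is only an illustration and is not part of the TalkingGaussian repository: the function name `select_clips`, the clip file names, and the assumption that clip durations (in seconds) are already known are all hypothetical.

```python
from typing import List, Tuple


def select_clips(clips: List[Tuple[str, float]],
                 budget_s: float = 20 * 60) -> List[str]:
    """Keep clips in their given order until the total duration reaches budget_s.

    `clips` is a list of (name, duration_in_seconds) pairs. The names and
    durations are assumed inputs; this helper does not read any video files.
    """
    selected: List[str] = []
    total = 0.0
    for name, duration in clips:
        if total >= budget_s:
            break  # budget already covered, skip the remaining clips
        selected.append(name)
        total += duration
    return selected


# Hypothetical example: three 10-minute clips, 20-minute budget.
example = [("clip_a.mp4", 600.0), ("clip_b.mp4", 600.0), ("clip_c.mp4", 600.0)]
chosen = select_clips(example)
```

Only the selected clips would then need to go through the slow tracking step. The last clip kept may push the total slightly over the budget, which should be harmless given that the 20-minute figure is a rough guideline.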