YuanxunLu / LiveSpeechPortraits

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
MIT License

For specific training data, are head motion and candidate images necessary? #11

Closed forest520 closed 2 years ago

forest520 commented 2 years ago

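Thanks for the excellent work! I'd like to ask: if the training data is a specially recorded video in which, apart from the face, the rest of the subject (neck, shoulders, etc.) barely moves, and the background is something like a green screen, are the upper body motion synthesis and the candidate image set still necessary? Thanks!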

YuanxunLu commented 2 years ago

If your data doesn't include head poses or upper body motions (e.g., MEAD), then of course predicting head pose is meaningless, because no head poses actually exist in the dataset (the heads in all training frames are fixed), right?

Meanwhile, if your dataset doesn't include shoulder motions (or only on a very small scale), removing the upper body motion features may work. But it also depends on how you crop the face region. For example, if all the training frames keep the shoulders in almost the same location, I think upper body motion is not needed, because there is no ambiguity about the bottom part of the images, i.e., the shoulders stay fixed. After all, the shoulder line is designed to remove ambiguity in the training set (one could move their shoulders while keeping their head fixed).
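One rough way to check whether the shoulders stay "almost in the same location" after cropping is to measure how far a few shoulder points move across frames. The sketch below is only an illustration and is not part of the LiveSpeechPortraits code: the function names, the input layout (per-frame lists of `(x, y)` pixel coordinates obtained from any detector), and the 2-pixel tolerance `tol_px` are all assumptions.

```python
from statistics import pstdev


def max_landmark_spread(frames):
    """Largest per-coordinate standard deviation (in pixels) across frames.

    frames: non-empty list of frames; each frame is a list of (x, y)
    shoulder points, with the same number of points in every frame.
    """
    n_points = len(frames[0])
    spread = 0.0
    for i in range(n_points):
        xs = [frame[i][0] for frame in frames]
        ys = [frame[i][1] for frame in frames]
        spread = max(spread, pstdev(xs), pstdev(ys))
    return spread


def shoulders_are_static(frames, tol_px=2.0):
    """True if no shoulder coordinate varies by more than tol_px (hypothetical threshold)."""
    return max_landmark_spread(frames) <= tol_px
```

If this returns `True` for a cropped training video, that is a hint (not a guarantee) that dropping the upper body motion features is worth trying in an experiment.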

Also, if the background of your dataset is something like a green screen and your camera is fixed, i.e., camera parameters are fixed for all training frames, I think the candidate image set could also be removed.

Finally, I think every design should be considered and checked in the experiments, right?

forest520 commented 2 years ago

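Thank you, Dr. Lu, for the detailed answer! One more question: during training, for each subject's 3-minute video, do you use 4 fixed candidate images, or do the 4 candidate images change with the time window?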

YuanxunLu commented 2 years ago

The number of candidate image sets for each video is not fixed. As mentioned above, if the camera parameters in your data don't change, the influence of the candidate image set is not obvious. For example, if one video contains 3 different camera parameters, I will select 3 different candidate image sets during training. During testing, you should keep the candidate images fixed.
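The "one candidate set per camera setup" idea can be sketched as follows. This is a hypothetical illustration, not the repository's implementation: the function name, the per-frame camera labels, and the random choice of frames are assumptions (the thread does not say how the images within a set are chosen), and the set size of 4 is taken from the question above.

```python
import random
from collections import defaultdict


def build_candidate_sets(frame_camera_ids, set_size=4, seed=0):
    """Pick one candidate image set per camera setup.

    frame_camera_ids: list where entry i is the camera-setup label of frame i.
    Returns {camera_id: sorted list of frame indices used as candidates}.
    Frames are sampled at random here purely for illustration.
    """
    groups = defaultdict(list)
    for idx, cam in enumerate(frame_camera_ids):
        groups[cam].append(idx)

    rng = random.Random(seed)  # fixed seed so the sets are reproducible
    return {
        cam: sorted(rng.sample(idxs, min(set_size, len(idxs))))
        for cam, idxs in groups.items()
    }


# A video with 3 camera setups yields 3 candidate sets.
labels = ["cam_a"] * 10 + ["cam_b"] * 10 + ["cam_c"] * 3
candidate_sets = build_candidate_sets(labels)
```

At test time, one of these sets would be chosen once and kept fixed for the whole sequence, in line with the advice above.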