KU-CVLAB / GaussianTalker

Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn and Seungryong Kim

Question: Paste inferred video back onto the original source video #2


schxnhxlz commented 2 months ago

Hey there,

First of all, great project! I would love to try it out. Does it work seamlessly to paste the generated video back on top of the raw video?

Like this:

[image: generated face composited back onto the raw video]

To retain the real eye and forehead movement, I want to mask out just the nose, mouth, and jaw (a sketch of building such a mask follows the image below):

[image: mask covering only the nose, mouth, and jaw]
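
Here is a minimal sketch of how such a lower-face mask could be built from 2D facial landmarks. The landmark source and index ranges are assumptions (dlib's 68-point convention), not something GaussianTalker's pipeline prescribes:

```python
import cv2
import numpy as np

def lower_face_mask(crop_shape, landmarks):
    """Binary mask covering only the nose, mouth, and jaw of a face crop.

    crop_shape: (H, W) of the cropped face image
    landmarks:  (68, 2) array of 2D landmarks in crop coordinates,
                assumed to follow dlib's 68-point convention:
                jawline 0-16, nose 27-35, mouth 48-67
    """
    h, w = crop_shape
    mask = np.zeros((h, w), dtype=np.float32)
    # Convex hull over the lower-face landmark groups.
    lower = np.concatenate([landmarks[0:17], landmarks[27:36], landmarks[48:68]])
    hull = cv2.convexHull(lower.astype(np.int32))
    cv2.fillConvexPoly(mask, hull, 1.0)  # 1 inside the region to replace
    return mask
```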

It would be awesome if someone could answer this before I train on my own video :) Thanks in advance!

joungbinlee commented 2 months ago

Thank you very much for using my project!

At inference time, we use the ground-truth torso image and generate only the face. So if you render with the camera poses and eye vectors identical to the training set, and swap only the audio for your desired data, the result will blend well with the region outside the previously cropped bbox image. Additionally, you could also stitch in only the specific masked region you want.
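
For the stitching step, here is a minimal compositing sketch, assuming you have the per-frame crop bbox from preprocessing and a lower-face mask like the one above; the function and argument names are illustrative, not part of the GaussianTalker codebase:

```python
import cv2
import numpy as np

def paste_face(source_frame, rendered_face, bbox, mask, feather=15.0):
    """Alpha-blend a rendered face crop back into the original frame.

    source_frame:  full-resolution original video frame, uint8 (H, W, 3)
    rendered_face: GaussianTalker output, same size as the crop bbox
    bbox:          (x0, y0, x1, y1) of the face crop used at training time
    mask:          float mask in [0, 1] over the crop; 1 where the
                   rendered pixels should replace the original ones
    feather:       Gaussian sigma used to soften the mask boundary
    """
    x0, y0, x1, y1 = bbox
    # Feather the mask so the seam between rendered and real pixels is soft.
    soft = cv2.GaussianBlur(mask, (0, 0), feather)[..., None]
    out = source_frame.astype(np.float32)
    crop = out[y0:y1, x0:x1]
    out[y0:y1, x0:x1] = soft * rendered_face.astype(np.float32) + (1.0 - soft) * crop
    return out.astype(np.uint8)
```

If a visible color seam remains, cv2.seamlessClone with the same mask is a common drop-in alternative to the plain alpha blend.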