YuelangX / Gaussian-Head-Avatar

[CVPR 2024] Official repository for "Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians"

Inquiry Regarding Reenactment Script Behavior Without Expression Smoothing #17

Open MaxtirError opened 2 months ago

MaxtirError commented 2 months ago

Dear Author,

Firstly, I would like to express my gratitude for sharing the code and insights through your paper. Your work has sparked great interest in my research, and I appreciate the opportunity to explore it further.

I have been experimenting with the method on a particular identity and have adhered closely to the instructions provided in the repository. For my current application, I elected to render images at a resolution of 512, opting not to train or utilize the super-resolution.

During the course of my implementation, I observed that the reenactment.py script applies a smoothing strategy to the expression codes during inference.

if not self.freeview:
    if idx > 0:
        data['pose'] = pose_last * 0.5 + data['pose'] * 0.5
        data['exp_coeff'] = exp_last * 0.5 + data['exp_coeff'] * 0.5
    pose_last = data['pose']
    exp_last = data['exp_coeff']
else:
    data['pose'] *= 0
    if idx > 0:
        data['exp_coeff'] = exp_last * 0.5 + data['exp_coeff'] * 0.5
    exp_last = data['exp_coeff']

https://github.com/YuelangX/Gaussian-Head-Avatar/assets/75169739/0a0b6098-2cfd-488c-a6a8-efc8516b013d
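
For readers following along: because each blended value is stored back into `exp_last`, the quoted logic behaves like an exponential moving average with weight 0.5 on the current frame. Below is a minimal standalone sketch of that recurrence in NumPy. The function name `ema_smooth` and the `alpha` parameter are illustrative assumptions, not part of the repository.

```python
import numpy as np

def ema_smooth(coeffs, alpha=0.5):
    """Exponentially smooth a (num_frames, dim) array along the frame axis.

    alpha is the weight on the current frame. alpha=0.5 mirrors the
    0.5/0.5 blend quoted above; alpha=1.0 disables smoothing.
    (Hypothetical helper, not repository code.)
    """
    coeffs = np.asarray(coeffs, dtype=np.float64)
    out = np.empty_like(coeffs)
    for idx, frame in enumerate(coeffs):
        if idx == 0:
            # The first frame has no predecessor, so it passes through unchanged.
            out[idx] = frame
        else:
            # Blend with the previous *smoothed* frame, as the quoted code does.
            out[idx] = (1.0 - alpha) * out[idx - 1] + alpha * frame
    return out

# A single coefficient that jitters between 0 and 1 on alternate frames.
jittery = np.array([[0.0], [1.0], [0.0], [1.0]])
smoothed = ema_smooth(jittery)            # values: 0, 0.5, 0.25, 0.625
unsmoothed = ema_smooth(jittery, alpha=1.0)
```

Setting `alpha=1.0` corresponds to disabling the smoothing, which is the configuration tested next.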

Out of curiosity, I disabled this feature to examine its impact on the output.

https://github.com/YuelangX/Gaussian-Head-Avatar/assets/75169739/c48eb9d3-f99d-430d-b3be-c390ed8dc677

As a result, I noticed some flickering in the reenacted video, which was not present when the smoothing strategy was enabled.

Could you please confirm if such flickering is an expected behavior when the expression smoothing is not employed? Any insights or suggestions you could offer regarding this issue would be immensely valued.

YuelangX commented 2 months ago

Yes, the flickering is expected, because the expression coefficients obtained by fitting the BFM model are themselves flickering. Possible solutions: 1) Generate smooth expression coefficients. 2) Add noise to the expression coefficients before feeding them to the network.
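
A rough sketch of what the two suggestions could look like on a tracked coefficient sequence. Everything here is an assumption for illustration: the helper names `moving_average` and `add_noise`, the window size, the noise scale, and the synthetic `tracked` data are not from the repository, and the reply does not say at which stage the noise should be added.

```python
import numpy as np

def moving_average(coeffs, window=5):
    """Centered moving average over the frame axis of a (num_frames, dim) array."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the filter stays centered")
    coeffs = np.asarray(coeffs, dtype=np.float64)
    pad = window // 2
    # Repeat the first and last frames so the output keeps num_frames rows.
    padded = np.pad(coeffs, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(coeffs.shape[1])]
    return np.stack(columns, axis=1)

def add_noise(coeffs, sigma=0.01, rng=None):
    """Return coeffs plus zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng() if rng is None else rng
    coeffs = np.asarray(coeffs, dtype=np.float64)
    return coeffs + rng.normal(0.0, sigma, size=coeffs.shape)

# Synthetic stand-in for per-frame tracked coefficients: a slow motion plus jitter.
rng = np.random.default_rng(0)
tracked = np.sin(np.linspace(0.0, 3.0, 50))[:, None] + rng.normal(0.0, 0.05, (50, 1))

smooth = moving_average(tracked, window=5)      # suggestion 1
noisy = add_noise(tracked, sigma=0.01, rng=rng)  # suggestion 2
```

A centered window needs future frames, so it only suits offline processing of a recorded sequence; for frame-by-frame inference, a causal blend like the one in reenactment.py is the simpler fit.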

MaxtirError commented 2 months ago

Thank you very much for your kind reply. I have another question regarding the rendering results. I noticed in the video that there are expressions which are not exaggerated and seem easily transferable, as illustrated in the pictures below.

image

However, the model performs poorly in such cases. Is this outcome to be expected with your method? If so, could you please explain the reason behind this unexpected behavior of the model? Any advice you could offer would be greatly appreciated.

jeb0813 commented 1 month ago

Hi @YuelangX, I checked the dataset code, and it seems that ReenactmentDataset sets the pose to the default specified in cfg.

if os.path.exists(cfg.pose_code_path):
    self.pose_code = torch.from_numpy(np.load(cfg.pose_code_path)['pose'][0]).float()
else:
    self.pose_code = None
if self.pose_code is not None:
    pose_code = self.pose_code
else:
    pose_code = pose

Does that mean all reenactment frames share the same pose code if one is specified?

YuelangX commented 1 month ago

@jeb0813, specifying a fixed pose_code keeps the body (neck) fixed relative to the head. Otherwise, the head and body might mismatch in cross-identity reenactment.
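
To make the effect concrete, here is a small standalone mimic of the branching quoted earlier in the thread. It is only a sketch: `select_pose_code`, the 6-D pose shape, and the sample values are assumptions for illustration, not repository code.

```python
import numpy as np

def select_pose_code(frame_pose, fixed_pose_code=None):
    """Use the fixed pose code when one is provided, else the frame's own pose.

    Hypothetical helper mirroring the `self.pose_code is not None` branch.
    """
    return frame_pose if fixed_pose_code is None else fixed_pose_code

# Three hypothetical driving frames, each with a different 6-D pose (assumed shape).
driving_poses = np.arange(18, dtype=np.float64).reshape(3, 6)
fixed = np.zeros(6)

# No fixed code: every frame keeps its own tracked pose.
per_frame = [select_pose_code(p) for p in driving_poses]
# Fixed code given: all frames share it, regardless of the driving pose.
shared = [select_pose_code(p, fixed) for p in driving_poses]
```

With a fixed code, every frame receives the same pose code, which matches the reading that all reenactment frames share it when it is specified.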