Difference between mean_pts3d.npy and std_mean_pts3d

YuanxunLu / LiveSpeechPortraits

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)

MIT License


Difference between mean_pts3d.npy and std_mean_pts3d #61

Open T5eng opened 2 years ago

T5eng commented 2 years ago

Hi, Lu. I am very grateful for your paper and eval code. I ran into a question while trying to implement the training code.

https://github.com/YuanxunLu/LiveSpeechPortraits/blob/1529d9a2a1475ca918a75b7a2030ec7db6185ccd/demo.py#L81

https://github.com/YuanxunLu/LiveSpeechPortraits/blob/1529d9a2a1475ca918a75b7a2030ec7db6185ccd/demo.py#L87

In line 81, the variable mean_pts3d is loaded from an .npy file, and in line 87 another variable, std_mean_pts3d, is computed as the mean of all detected pts3d landmarks. I visualized these two variables and found them almost identical except for the first 16 dimensions.

Question: what is the difference between these two? Are they supposed to be the same? If not, how do I obtain mean_pts3d.npy?
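The comparison described above can also be done numerically rather than by eye. The following is a minimal sketch, not code from the repository: it assumes both mean shapes are arrays of shape (num_points, 3), and the helper names `per_point_gap` and `differing_points`, the tolerance, and the synthetic data (73 points, 200 frames) are all hypothetical stand-ins for the real files.

```python
import numpy as np


def per_point_gap(mean_a, mean_b):
    """Euclidean distance between corresponding landmarks of two mean shapes."""
    mean_a = np.asarray(mean_a, dtype=np.float64)
    mean_b = np.asarray(mean_b, dtype=np.float64)
    if mean_a.shape != mean_b.shape:
        raise ValueError(f"shape mismatch: {mean_a.shape} vs {mean_b.shape}")
    return np.linalg.norm(mean_a - mean_b, axis=-1)


def differing_points(mean_a, mean_b, tol=1e-6):
    """Indices of landmarks whose positions differ by more than tol."""
    return np.flatnonzero(per_point_gap(mean_a, mean_b) > tol)


# Synthetic stand-ins for the real data (assumption: landmarks are stored
# as an array of shape (num_frames, num_points, 3)).
rng = np.random.default_rng(0)
tracked = rng.normal(size=(200, 73, 3))

# Mean over all frames, analogous to std_mean_pts3d in the question.
std_mean = tracked.mean(axis=0)

# Pretend the stored mean differs only in the first 16 points.
saved_mean = std_mean.copy()
saved_mean[:16] += 0.05

print(differing_points(saved_mean, std_mean))
```

With real data, `saved_mean` would be the array loaded from mean_pts3d.npy and `std_mean` the mean computed from the tracked landmarks; the printed indices show exactly which landmarks disagree.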

foocker commented 2 years ago

Hi, which 3D face tracking model did you choose?

T5eng commented 2 years ago

Hi, which 3D face tracking model did you choose?

I think any 3D landmark detector will work.

YuanxunLu commented 2 years ago

mean_pts3d and std_mean_pts3d may differ in the first 16 dimensions, which are the contour points. The 3D tracking algorithm I used applies sliding contour points for higher tracking accuracy, a common trick in 3D face reconstruction. As a result, the contour points change from frame to frame as the head pose and camera pose vary.

In my experiments, I replaced the sliding contours with fixed contours to train the Renderer, so I need to use fixed contours at inference as well, in line with the training setup.

For anyone whose tracker does not use sliding contours, these two should be exactly the same.
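To illustrate why the two means diverge only in the contour points, here is a rough sketch of swapping sliding contours for fixed ones. It is not the author's actual preprocessing. The thread does not say how the fixed contour is chosen, so `fixed_contour` is a hypothetical placeholder, as are the function name `fix_contours`, the array layout (num_frames, num_points, 3), and the synthetic data.

```python
import numpy as np

NUM_CONTOUR = 16  # per the thread, the first 16 points are contour points


def fix_contours(pts3d, fixed_contour, num_contour=NUM_CONTOUR):
    """Return a copy of pts3d whose first num_contour points in every frame
    are replaced by the same fixed contour."""
    pts3d = np.asarray(pts3d, dtype=np.float64)
    fixed_contour = np.asarray(fixed_contour, dtype=np.float64)
    if fixed_contour.shape != (num_contour, pts3d.shape[-1]):
        raise ValueError("fixed_contour must have shape (num_contour, 3)")
    out = pts3d.copy()
    out[:, :num_contour] = fixed_contour  # broadcast over all frames
    return out


# Synthetic stand-ins (assumption: 50 frames of 73 tracked 3D landmarks).
rng = np.random.default_rng(0)
tracked = rng.normal(size=(50, 73, 3))               # sliding contours
fixed_contour = rng.normal(size=(NUM_CONTOUR, 3))    # hypothetical fixed contour

fixed = fix_contours(tracked, fixed_contour)

mean_sliding = tracked.mean(axis=0)  # mean over frames with sliding contours
mean_fixed = fixed.mean(axis=0)      # mean over frames with fixed contours

# Only the contour points differ between the two means.
gap = np.linalg.norm(mean_fixed - mean_sliding, axis=-1)
print(np.flatnonzero(gap > 1e-9))
```

Under these assumptions, the mean of the fixed-contour data equals the fixed contour itself in the first 16 points and matches the sliding-contour mean everywhere else, which is consistent with the difference observed in the question.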

Sorry, it's hard for me to remember all the details of the code, so I checked the code and data again and tried my best to answer your questions. Hope the above helps!