Weizhi-Zhong / IP_LAP

CVPR2023 talking face implementation for Identity-Preserving Talking Face Generation With Landmark and Appearance Priors
Apache License 2.0
679 stars, 76 forks

lip_embedding and jaw_embedding #54

Open Aditya870 opened 7 months ago

Aditya870 commented 7 months ago

How do you know that output_tokens[:, N_l:N_l+T] is the lip embedding and output_tokens[:, N_l+T:] is the jaw embedding, as in the code below? I am using a larger number of landmark points, so I need to understand how this split is determined. The relevant code is:

#3. fuse embedding
output_tokens=self.fusion_transformer(ref_embedding,mel_embedding,pose_embedding)

#4.output  landmark
lip_embedding=output_tokens[:,N_l:N_l+T,:] #(B,T,dim)
jaw_embedding=output_tokens[:,N_l+T:,:] #(B,T,dim)
output_mouse_landmark=self.mouse_keypoint_map(lip_embedding)  ##(B,T,40*2)
output_jaw_landmark=self.jaw_keypoint_map(jaw_embedding)   ##(B,T,17*2)
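
One way to reason about the slicing, as a minimal sketch rather than the repository's actual code: assume the fusion transformer concatenates its inputs along the token dimension in argument order, so the sequence is laid out as N_l reference tokens, then T audio (mel) tokens, then T pose tokens, and that the transformer keeps the sequence length unchanged. Under that assumption, N_l:N_l+T selects the outputs at the audio positions and N_l+T: selects the outputs at the pose positions. The helper `segment_slices` and the sizes below are hypothetical, and plain Python lists stand in for tensors.

```python
def segment_slices(n_ref, n_frames):
    """Return slices for the three token groups.

    Assumes (not confirmed by the source) that the token sequence is laid
    out as [reference | audio | pose] along the sequence dimension.
    """
    ref = slice(0, n_ref)
    audio = slice(n_ref, n_ref + n_frames)
    pose = slice(n_ref + n_frames, n_ref + 2 * n_frames)
    return ref, audio, pose


# Hypothetical sizes: 15 reference tokens, 5 target frames.
N_l, T = 15, 5

# Label each position by the input it came from instead of using real tensors.
tokens = ["ref"] * N_l + ["mel"] * T + ["pose"] * T

ref_s, lip_s, jaw_s = segment_slices(N_l, T)
lip_tokens = tokens[lip_s]  # positions used for lip_embedding
jaw_tokens = tokens[jaw_s]  # positions used for jaw_embedding
```

If this layout holds, the slice boundaries depend only on how many tokens each input contributes, not on how many landmark points each token encodes. Using more landmark points would then presumably change the output sizes of the final maps (the 40*2 and 17*2 above) rather than the slice indices, but that should be checked against the model definition.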