Meta-Portrait / MetaPortrait

[CVPR 2023] MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
https://meta-portrait.github.io/
MIT License

How do I try it on a custom test dataset? #4

Closed jinwonkim93 closed 1 year ago

jinwonkim93 commented 1 year ago

Thanks for the great work! It works perfectly. The model seems to need some additional inputs (ldmk, theta); how can I generate them for a custom test dataset?

ForeverFancy commented 1 year ago

Thank you for your kind words. I'm glad to hear that the work meets your expectations and works perfectly. Regarding your question about the ldmks for a custom test dataset: the ldmks we used are predicted by a pre-trained face tracker from https://microsoft.github.io/DenseLandmarks/, which is maintained by another group in MSR and is not publicly available at the moment. An alternative way to run our model on a custom dataset is therefore to use public sparse ldmks: you can use any face landmark detector and connect the predicted ldmks using color lines, as in https://arxiv.org/abs/2011.04439. We will consider releasing a publicly available version if possible. I apologize for the inconvenience and hope you understand.
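For readers unsure what "connecting ldmks with color lines" looks like in practice, here is a minimal sketch assuming a standard 68-point landmark layout (the repository's own drawing code may differ); `draw_segment` and `draw_landmark_lines` are hypothetical helper names, and only NumPy is used, so the rasterization is deliberately naive:

```python
import numpy as np

def draw_segment(img, p0, p1, color):
    """Rasterize a line segment from p0 to p1 (x, y) into img with the given RGB color."""
    n = int(max(abs(p1[0] - p0[0]), abs(p1[1] - p0[1]))) + 1
    xs = np.linspace(p0[0], p1[0], n).round().astype(int)
    ys = np.linspace(p0[1], p1[1], n).round().astype(int)
    valid = (xs >= 0) & (xs < img.shape[1]) & (ys >= 0) & (ys < img.shape[0])
    img[ys[valid], xs[valid]] = color

def draw_landmark_lines(ldmks, size=256):
    """Render 68 sparse landmarks as colored polylines, one color per facial part."""
    # Index ranges follow the common 68-point annotation; colors are arbitrary choices.
    parts = {
        (0, 17): (255, 0, 0),     # jaw
        (17, 22): (0, 255, 0),    # left eyebrow
        (22, 27): (0, 0, 255),    # right eyebrow
        (27, 36): (255, 255, 0),  # nose
        (36, 42): (255, 0, 255),  # left eye
        (42, 48): (0, 255, 255),  # right eye
        (48, 68): (255, 128, 0),  # mouth
    }
    img = np.zeros((size, size, 3), dtype=np.uint8)
    for (lo, hi), color in parts.items():
        for i in range(lo, hi - 1):
            draw_segment(img, ldmks[i], ldmks[i + 1], color)
    return img
```

The resulting image would then be fed to the model in place of the dense-landmark rendering.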

jinwonkim93 commented 1 year ago

Thank you!

m-pektas commented 1 year ago

Thanks for the great work @ForeverFancy!! You explained the "ldmks" in your comment above, but what about the thetas? How can I obtain them?

ForeverFancy commented 1 year ago

It's actually the transformation matrix used to align the face to the center. For example, you could refer to this blog; we use 5 keypoints instead of the 2 used in the blog to align the face.
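As a sketch of how such an alignment matrix could be computed, one common approach is a least-squares similarity fit (the Umeyama method) from 5 detected keypoints (eye centers, nose tip, mouth corners) to a fixed template; this is an assumption about the method, and the template coordinates below are hypothetical, not the ones used in this repo:

```python
import numpy as np

def estimate_theta(src_pts, dst_pts):
    """Least-squares similarity transform (scale, rotation, translation) mapping
    src_pts onto dst_pts, via the Umeyama method. Returns a 2x3 matrix theta
    such that dst ~= theta @ [x, y, 1]^T."""
    src_mean = src_pts.mean(axis=0)
    dst_mean = dst_pts.mean(axis=0)
    src_c = src_pts - src_mean
    dst_c = dst_pts - dst_mean
    cov = dst_c.T @ src_c / len(src_pts)          # cross-covariance matrix
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))            # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt                                # optimal rotation
    scale = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = dst_mean - scale * R @ src_mean
    return np.hstack([scale * R, t[:, None]])

# Hypothetical 5-point template in a 256x256 aligned crop:
# left eye, right eye, nose tip, left mouth corner, right mouth corner.
TEMPLATE = np.array([[89, 110], [167, 110], [128, 152],
                     [98, 198], [158, 198]], dtype=float)
```

A detected 5-point set would then give `theta = estimate_theta(detected, TEMPLATE)`, and warping the frame with `theta` centers the face.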

psurya1994 commented 1 year ago

@ForeverFancy Can you describe in more detail what you mean by "connect the predicted ldmks using color lines"? I wasn't able to find anything related to "color lines" in the paper.

It seems to me like you're suggesting we train the landmark transformer from the paper, did I get that right?

Thraick commented 1 year ago

I followed the instructions and generated the imgs, ldmks, and thetas. What is src_0_id.npy, and how can I generate it?

ForeverFancy commented 1 year ago

> I followed the instructions and generated the imgs, ldmks, and thetas. What is src_0_id.npy, and how can I generate it?

Hi, you could refer to #10 for more detail.

ForeverFancy commented 1 year ago

> @ForeverFancy Can you describe in more detail what you mean by "connect the predicted ldmks using color lines"? I wasn't able to find anything related to "color lines" in the paper.
>
> It seems to me like you're suggesting we train the landmark transformer from the paper, did I get that right?

Hi, the code for connecting ldmks with color lines is here.

qiuyuzhao commented 1 year ago

In dataset.py, src_ldmk_norm.shape is (58, 2). Are these the keypoints of the human face? In face_alignment, the facial landmarks have shape (68, 2). Does this difference affect the results?

liliya-imasheva commented 11 months ago

Did anyone manage to try it on a custom dataset (custom source image + custom driving video)? I have created the landmarks (ldmk), transformation matrices (theta), facial embeddings (id), and connectivity.tsv, and I've tried it in different ways, but it didn't produce results anywhere close to the ones published in the paper. The landmarks from the source suggested by @ForeverFancy are not as dense as the ones used in the paper, and I didn't find any open-source detector that produces similar landmarks. If someone knows how to make it work, please let me know; I would appreciate further directions on where to look for a solution.

Hujiazeng commented 8 months ago

@liliya-imasheva Did your solution solve the problem?

Hujiazeng commented 8 months ago

> I followed the instructions and generated the imgs, ldmks, and thetas. What is src_0_id.npy, and how can I generate it?

Hi, is the project working for you? Could you share your pipeline?

liliya-imasheva commented 8 months ago

> @liliya-imasheva Did your solution solve the problem?

No, I also tried denser landmarks, very similar to what they have in the paper, but it didn't help either. I ended up using another model, the Thin Plate Spline Motion Model (https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model?tab=readme-ov-file), which gave rather good results.

Hujiazeng commented 8 months ago

> > @liliya-imasheva Did your solution solve the problem?
>
> No, I also tried denser landmarks, very similar to what they have in the paper, but it didn't help either. I ended up using another model, the Thin Plate Spline Motion Model (https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model?tab=readme-ov-file), which gave rather good results.

Thank you! Did you use a pre-trained face landmark detector instead of the Keypoint Detector from this work?

liliya-imasheva commented 8 months ago

@Hujiazeng, for that one the Keypoint Detector was working well, so I didn't try any other landmarks.

alasokolova commented 7 months ago

Hi, @liliya-imasheva Could you please explain how you computed theta?

liliya-imasheva commented 7 months ago

@alasokolova Honestly, I don't remember and can't find it right now, but I followed information given here in the issues, something like this: https://github.com/Meta-Portrait/MetaPortrait/issues/4#issuecomment-1491164915, and I think there is more in other issues. But as I said, I wasn't able to get acceptable results with a custom dataset for this model, so the way I computed them may not actually be the best :D