MStypulkowski / diffused-heads

Official repository for Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation

about train script #25

Closed (zhang-haojie closed this issue 5 months ago)

zhang-haojie commented 5 months ago

Dear author, how should the landmarks and audio_emb_dir parameters be set, and what data does each of them expect? Could you provide more information about them?

johndpope commented 5 months ago

You may find AniPortrait more relevant: they enhanced MediaPipe lip detection, and their landmark detection is SOTA. https://github.com/Zejun-Yang/AniPortrait/blob/cb86caa741d6ab1e119ea7ac2554eb28aabc631b/src/utils/face_landmark.py#L123

MStypulkowski commented 5 months ago

For landmarks, we used face-alignment, but feel free to use any other landmark detector.

audio_emb_dir is a directory containing audio embeddings. Its folder structure should mirror that of the data directory (as indicated by the file list). The README explains how to compute the embeddings.
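
To illustrate the mirrored layout, here is a minimal sketch that maps each file-list entry to the path where its embedding would be expected under audio_emb_dir, and reports entries with no embedding on disk. The helper names, the assumption that the file list holds paths relative to the data directory, and the `.pt` extension are all hypothetical; check the README and the training script for the actual conventions.

```python
from pathlib import Path


def embedding_path(video_rel_path, audio_emb_dir, ext=".pt"):
    """Map a file-list entry to its mirrored location under audio_emb_dir.

    Assumption: the entry is a path relative to the data directory, and the
    embedding keeps the same relative path with a different extension.
    The default extension is a guess, not the repository's convention.
    """
    return Path(audio_emb_dir) / Path(video_rel_path).with_suffix(ext)


def missing_embeddings(file_list_lines, audio_emb_dir, ext=".pt"):
    """Return the file-list entries whose embedding file does not exist."""
    entries = [line.strip() for line in file_list_lines if line.strip()]
    return [
        entry
        for entry in entries
        if not embedding_path(entry, audio_emb_dir, ext).exists()
    ]
```

Running `missing_embeddings(open("file_list.txt"), "audio_embs")` before training (both names are placeholders) gives a quick check that the two directory trees line up.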

zhang-haojie commented 5 months ago

Are the landmarks 3D or 2D facial landmarks? Thank you.

MStypulkowski commented 5 months ago

They are 2D.
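
For reference, a rough sketch of extracting 2D landmarks for a single image with the face-alignment package mentioned above. The function names are hypothetical, and how the training script expects landmarks to be stored (file format, per-frame or per-video) is not specified in this thread, so the sketch only returns a list of (x, y) pairs for the first detected face. The 2D enum member has been named differently across face-alignment releases, so the code checks for both names.

```python
def first_face_2d(predictions):
    """Keep the first detected face and only the (x, y) coordinates.

    `predictions` is what the detector returns: None or an empty list when
    no face is found, otherwise one array of landmark points per face.
    """
    if predictions is None or len(predictions) == 0:
        return None
    return [[float(p[0]), float(p[1])] for p in predictions[0]]


def detect_landmarks_2d(image_path, device="cpu"):
    """Run face-alignment on one image and return 2D landmarks or None."""
    # Imported here so the helper above stays usable without the package.
    import face_alignment

    types = face_alignment.LandmarksType
    # Newer releases expose TWO_D; older ones used _2D.
    two_d = types.TWO_D if hasattr(types, "TWO_D") else types._2D
    detector = face_alignment.FaceAlignment(two_d, device=device)
    return first_face_2d(detector.get_landmarks_from_image(image_path))
```

When processing many frames, it is cheaper to build the FaceAlignment object once and reuse it rather than constructing it on every call as this sketch does.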