yuangan / EAT_code

Official code for ICCV 2023 paper: "Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation".

Inquiry Source Data Preparation #32

Closed Calmepro777 closed 3 months ago

Calmepro777 commented 3 months ago

Thanks for the authors' wonderful work.

I would be grateful if the authors could address my questions about preparing source data for inference.

Here is my use case:

  1. I noticed that the source image is cropped first during inference. Is the cropping really necessary if the source image is already 256×256 and the person's face occupies a relatively large portion of the image?
  2. If cropping is necessary, the default template in the codebase seems to produce a weird black area in the cropped image. Where can I find other templates, and how can I define a template myself?

I am enclosing the original and cropped images for reference: km, km_cropped

Thanks in advance.

yuangan commented 3 months ago

Hi, thank you for your attention.

Actually, the template is not necessary. It is used to achieve better driving results, because our model is trained on preprocessed data (though the alignment is not strict).

If you want to drive your image without cropping, you can put the source images directly into ./demo/imgs_cropped/ and then run inference as usual.
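For readers who want to script this step, here is a minimal sketch of copying already-framed source images into ./demo/imgs_cropped/ so the cropping stage can be skipped. The helper name `stage_uncropped_sources` and the set of accepted file extensions are assumptions made for illustration; they are not part of the EAT codebase, and the function only copies files without resizing or aligning them.

```python
import shutil
from pathlib import Path

# File extensions treated as images here; this set is an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def stage_uncropped_sources(src_dir, dest_dir="./demo/imgs_cropped"):
    """Copy images from src_dir into dest_dir unchanged (hypothetical helper).

    Returns the list of destination paths that were written.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    staged = []
    for path in sorted(src_dir.iterdir()):
        # Skip subdirectories and non-image files.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            staged.append(Path(shutil.copy2(path, dest_dir / path.name)))
    return staged
```

Calling `stage_uncropped_sources("my_images")` from the repository root would place the images where the reply above says the demo expects cropped sources. Whether uncropped inputs give good results still depends on how closely they resemble the preprocessed training data.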

You can refer to here for more logic details.

If you have any more questions, feel free to contact us.

Calmepro777 commented 3 months ago

Thanks for the author's quick response. I am going to close this issue, as my questions have been well addressed.