yuval-alaluf / SAM

Official Implementation for "Only a Matter of Style: Age Transformation Using a Style-Based Regression Model" (SIGGRAPH 2021) https://arxiv.org/abs/2102.02754
https://yuval-alaluf.github.io/SAM/
MIT License

About input images #45

Closed bo775 closed 2 years ago

bo775 commented 2 years ago

Thanks for your great work. I would like to ask some questions.

For training, do input facial images need to be aligned and cropped before they are fed into the network? For example, the FFHQ dataset comes in an in-the-wild images version (955GB) and an Aligned and Cropped Images 1024x1024 version (89.1GB). Which version did you download and resize to 256x256 for the input images in your paper?

As for testing and your demo, is alignment and cropping pre-processing also required? Would skipping it affect the final performance?

Thanks.

yuval-alaluf commented 2 years ago

do input facial images need to be aligned and cropped before they are fed into the network?

Correct. All images must be aligned before being passed to the network. We use the Aligned and Cropped Images 1024x1024 version, which is resized to 256x256 before being fed to the network. Note that the input images are resized to 256x256, but the output images are still at resolution 1024x1024. The resize is simply done to speed up training (among other reasons).

As for testing and your demo, is alignment and cropping pre-processing also required? Would skipping it affect the final performance?

Correct. Whether we are talking about training or inference, all input images must be aligned before being passed to the network.
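As a rough illustration of the resize step described above, here is a minimal sketch that downsizes an already-aligned face crop to 256x256. It is not the repository's preprocessing code: the helper names `looks_like_aligned_crop` and `prepare_input`, the use of Pillow, and the bicubic filter are all assumptions made for the example. The square-size check is only a cheap sanity check and cannot confirm that a face was actually aligned.

```python
from pathlib import Path

# Input resolution mentioned in this thread; outputs remain 1024x1024.
NETWORK_INPUT_SIZE = (256, 256)


def looks_like_aligned_crop(size):
    """Hypothetical sanity check: aligned crops are expected to be square.

    This only inspects the (width, height) tuple. It does NOT verify that
    the face itself was aligned.
    """
    width, height = size
    return width > 0 and width == height


def prepare_input(path, out_dir):
    """Resize one aligned crop to the network input size and save it.

    Assumes Pillow is installed; the bicubic filter is an assumption,
    not something stated in the thread.
    """
    from PIL import Image  # imported lazily so the helpers above need no extras

    with Image.open(path) as img:
        if not looks_like_aligned_crop(img.size):
            raise ValueError(f"{path} is not square; align and crop it first")
        resized = img.convert("RGB").resize(NETWORK_INPUT_SIZE, Image.BICUBIC)

    out_path = Path(out_dir) / Path(path).name
    resized.save(out_path)
    return out_path
```

Calling `prepare_input("face.png", "resized/")` on a square 1024x1024 crop would write a 256x256 copy into an existing `resized/` directory; both paths here are placeholders.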

bo775 commented 2 years ago

Thanks for your help.