yuval-alaluf / restyle-encoder

Official Implementation for "ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement" (ICCV 2021) https://arxiv.org/abs/2104.02699
https://yuval-alaluf.github.io/restyle-encoder/
MIT License
1.03k stars 154 forks

Is the training weight valid only for the training set? #37

Closed sky-fly97 closed 3 years ago

sky-fly97 commented 3 years ago

Hello, when I ran inference_iterative.py with restyle_e4e_ffhq_encode.pt, I found that it only works well on images from the training set (FFHQ), but not on some other faces.

sky-fly97 commented 3 years ago

[attached: example input images and their reconstructions]

sky-fly97 commented 3 years ago

Does this mean that I need to retrain the ReStyle model? In addition, I would like to know how much the resolution affects the results. For example, my data is between 256 and 512 pixels. Can I resize it to 1024 for both training and inference, so that there is no need to train a 256-resolution StyleGAN?

yuval-alaluf commented 3 years ago

To fix your problem, all you need to do is align your data before running inference. We provide a script for doing this. Regarding the image sizes: we trained our encoders using input images of size 256, even though the output resolution was 1024. So the input resolution does not need to match the output resolution, which can be higher.
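For a single image, the alignment can be run roughly like the sketch below (assuming the align_face helper from scripts/align_faces_parallel.py, as used in the inference notebook, and a local copy of dlib's shape_predictor_68_face_landmarks.dat; the file names and paths here are just illustrative):

```python
# Minimal sketch: align one image the FFHQ way before running inference.
# Assumes scripts/align_faces_parallel.py exposes align_face and that
# dlib's shape_predictor_68_face_landmarks.dat has been downloaded locally.
import dlib
from scripts.align_faces_parallel import align_face

def align_one(image_path, predictor_path="shape_predictor_68_face_landmarks.dat"):
    # dlib's 68-point landmark model drives the FFHQ-style crop and alignment
    predictor = dlib.shape_predictor(predictor_path)
    return align_face(filepath=image_path, predictor=predictor)

aligned = align_one("my_face.jpg")   # example input path
aligned.save("my_face_aligned.jpg")  # point --data_path at the aligned images
```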

sky-fly97 commented 3 years ago

> To fix your problem, all you need to do is align your data before running inference. We provide a script for doing this. Regarding the image sizes: we trained our encoders using input images of size 256, even though the output resolution was 1024. So the input resolution does not need to match the output resolution, which can be higher.

Following this method, I tried again and the results are better, but there are still some gaps compared with the original pictures. Maybe I need to train it on my own data. [attached: comparison images]

sky-fly97 commented 3 years ago

Hello, thanks for your help. When I retrain ReStyle with my own data, do I also need to align the data? I have about 3,000 images of size 256×256. How many steps do I need to train for? Is this training log normal? [attached: training log screenshot]

yuval-alaluf commented 3 years ago

I still believe you either have an issue with your data and/or the parameters you used for inference. If you can send over the command you used, I can take a look. However, if you want to train a new encoder (which I don't think you need), to answer your questions:

  • Yes. When training the encoder, you need to align your data.
  • You should train until convergence. I don't know how many steps it could take.
  • Yes. The training logs are normal.

sky-fly97 commented 3 years ago

> I still believe you either have an issue with your data and/or the parameters you used for inference. If you can send over the command you used, I can take a look. However, if you want to train a new encoder (which I don't think you need), to answer your questions:
>
>   • Yes. When training the encoder, you need to align your data.
>   • You should train until convergence. I don't know how many steps it could take.
>   • Yes. The training logs are normal.

Thanks, I just ran my data through align_faces_parallel.py and then ran:

python scripts/inference_iterative.py --exp_dir="results_output" --checkpoint_path="ckpt/restyle_e4e_ffhq_encode.pt" --data_path="images" --test_batch_size=4 --test_workers=4 --n_iters_per_batch=5

I followed the instructions. Here are the original images: [attached: 134212_2, 134212_1, 4]

yuval-alaluf commented 3 years ago

I just had a chance to run the images through the notebook and these are the results I got: [attached: test2, test1]

While I agree with you that the results aren't necessarily great, these are two very difficult images and I believe most inversion techniques will struggle with them. Therefore, I do not think retraining ReStyle will solve your issue here. You could instead try performing optimization to get the inversion, which will take several minutes per image but should get you better reconstructions.

As for the third image, for some reason, the alignment stage is failing here. Since this is a pre-trained model used for alignment, there is not much I can do to help here.

If you have any questions feel free to reopen the issue. To verify that the model does work, you could try other less challenging images. I have had good success with many images that were not in the training set.

sky-fly97 commented 3 years ago

> I just had a chance to run the images through the notebook and these are the results I got: [attached: test2, test1]
>
> While I agree with you that the results aren't necessarily great, these are two very difficult images and I believe most inversion techniques will struggle with them. Therefore, I do not think retraining ReStyle will solve your issue here. You could instead try performing optimization to get the inversion, which will take several minutes per image but should get you better reconstructions.
>
> As for the third image, for some reason, the alignment stage is failing here. Since this is a pre-trained model used for alignment, there is not much I can do to help here.
>
> If you have any questions feel free to reopen the issue. To verify that the model does work, you could try other less challenging images. I have had good success with many images that were not in the training set.

Thanks! I noticed that your results seem to be better than mine. Did you also use the command I mentioned, i.e., restyle_e4e_ffhq_encode.pt with 5 iterations per image?

Another question: does the optimization you mentioned mean increasing the number of iterations?

yuval-alaluf commented 3 years ago

I used the ReStyle-pSp encoder since that will more often than not get you better reconstructions. I performed inference using 5 iterations per image. If you perform optimization instead of using an encoder, you will need to run many iterations, probably around 500 to 1500, which should take a minute or two.
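To be clear, optimization is a separate per-image procedure rather than more encoder passes. Roughly, it looks like the sketch below (a generic StyleGAN2 latent-optimization loop, not the repo's exact script; G, latent_avg, and target stand in for whatever generator-loading code you use, and the perceptual loss uses the lpips package):

```python
# Rough sketch of optimization-based inversion for a single image (illustrative only).
# Assumptions: G maps a W+ code of shape (1, n_latents, 512) to an image in [-1, 1],
# latent_avg is the generator's average latent broadcast to that same shape, and
# target is the aligned input image as a (1, 3, H, W) tensor in [-1, 1].
import torch
import lpips  # pip install lpips

device = "cuda"
percept = lpips.LPIPS(net="vgg").to(device)

w = latent_avg.clone().detach().to(device).requires_grad_(True)
optimizer = torch.optim.Adam([w], lr=0.01)

for step in range(1000):  # roughly the 500 to 1500 steps mentioned above
    optimizer.zero_grad()
    img = G(w)  # assumed generator call: W+ code in, image out
    loss = percept(img, target).mean() + 0.1 * torch.nn.functional.mse_loss(img, target)
    loss.backward()
    optimizer.step()

# `w` now holds the optimized latent code; G(w) is the reconstruction.
```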