eladrich / pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework
https://eladrich.github.io/pixel2style2pixel/
MIT License

Asking for more details about training synthetic data #101

Closed markduon closed 3 years ago

markduon commented 3 years ago

Sorry, sir! I'm following up from #96 to clarify your advice. I have a few more questions (note that all of our synthetic data has a blue background):

1) You said in the final line that "Basically you can set your input and target into the synthetic data and using the FFHQ StyleGAN will try to encode the synthetic data into more realistic faces". As I understand it, you mean source_train = target_train = 'synthetic data', changing `transform_source` in `EncodeTransforms` to be similar to `transform_gt_train`, and then using the pretrained generator stylegan2-ffhq-config-f.pt. Am I right, sir? Or should I set source_train = 'synthetic' and target_train = 'ffhq'?

2) I am also confused about whether many duplicates and low-detail images (blur, censoring effects) in my data strongly affect the outputs. I am getting faces that are not identical to the originals.

3) If I want to preserve gender and age, should I play with the loss lambdas or with the latent vector? These are specific attributes that seem hard to handle. Or maybe we can add a loss function like an age_loss or gender_loss?

4) Should I fine-tune on my synthetic data in the StyleGAN2 repo, resuming from the pretrained stylegan2-ffhq-config-f.pt, and then use that fine-tuned model as the generator in pSp with source_train = 'synthetic_data' and target_train = 'ffhq'?

I know these are a lot of questions. I hope you can give me some feedback. Thank you, sir!

yuval-alaluf commented 3 years ago

Hi @duongquangvinh, regarding your questions:

  1. If I understood correctly, I think you will want to set both the source and the target to your synthetic data and use the FFHQ pre-trained model (see the first sketch below). As mentioned in your original post, I am not entirely sure what problem you're trying to solve, so you will most likely need to play around with a few things to get this to work.
  2. If the outputs are not identical to the originals, this could be because of the low-detail data you mentioned, or because of other things such as the balance of the loss weights. Again, it is difficult to say what the problem is.
  3. Regarding preserving age or gender: if you have a network that can be used as an age loss or gender loss, that would be an interesting direction to try (see the second sketch below). Playing with the latent vector may be a bit tricky since these attributes are typically entangled with other attributes.
  4. If I remember correctly, your goal is to translate synthetic data to real data, so you will want to keep the original FFHQ StyleGAN as the generator. At first glance, I don't think you'll want to fine-tune StyleGAN on your synthetic data: by doing so, the generator would start outputting synthetic-like images rather than the real-like images you want.

I hope I was able to answer your questions as best as possible.
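To make point 1 a bit more concrete, here is a minimal sketch of how you could register your synthetic set as both source and target in `configs/data_configs.py`. The `synthetic_encode` key and the `dataset_paths` entries are placeholders you would define yourself in `configs/paths_config.py`:

```python
# configs/data_configs.py -- sketch of an added entry; names are placeholders
from configs import transforms_config
from configs.paths_config import dataset_paths

DATASETS = {
    'synthetic_encode': {
        'transforms': transforms_config.EncodeTransforms,
        # source == target: pSp learns to reconstruct your synthetic faces, but
        # decodes through the FFHQ StyleGAN, pulling outputs toward realistic ones
        'train_source_root': dataset_paths['synthetic_train'],
        'train_target_root': dataset_paths['synthetic_train'],
        'test_source_root': dataset_paths['synthetic_test'],
        'test_target_root': dataset_paths['synthetic_test'],
    },
}
```

You would then train with `--dataset_type=synthetic_encode` and leave `--stylegan_weights` pointing at `stylegan2-ffhq-config-f.pt`. If your inputs need the same preprocessing as the targets, setting `transform_source` in `EncodeTransforms` to match `transform_gt_train`, as you suggested, sounds reasonable.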
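And for point 3, a rough sketch of what an age loss might look like, assuming you have some pretrained age estimator of your own (the `age_net` below is hypothetical, as is the `age_lambda` weight):

```python
import torch
from torch import nn


class AgeLoss(nn.Module):
    """L1 gap between the predicted ages of the pSp output and the target.
    `age_net` is a placeholder for whatever pretrained age estimator you have."""

    def __init__(self, age_net):
        super(AgeLoss, self).__init__()
        self.age_net = age_net.eval()
        for p in self.age_net.parameters():
            p.requires_grad = False  # fixed critic, never trained

    def forward(self, y_hat, y):
        age_hat = self.age_net(y_hat)  # predicted age of the generated image
        age_gt = self.age_net(y)       # predicted age of the ground-truth image
        return torch.abs(age_hat - age_gt).mean()


# in the training step, weighted like the existing losses, e.g.:
#   loss = loss + opts.age_lambda * self.age_loss(y_hat, y)
```

A gender loss could follow the same pattern with a pretrained gender classifier and a cross-entropy term instead of the L1.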
markduon commented 3 years ago

Thank you, sir! You are so kind to answer so many of my questions. Have a good day, sir!