williamyang1991 / StyleGANEX

[ICCV 2023] StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces

Model Inversion artifacts #30

Open meissnerA opened 2 days ago

meissnerA commented 2 days ago

First of all, thank you very much for your great work. While testing the different encoder models, I found it odd that styleganex_edit_age.pt and latent_vector_w_blond_hair rendered the input image almost perfectly, even when I used selfies. Since StyleGAN should not have the features to create human hands or other objects, it's strange that the produced image contains everything. Do you have any intuition for how StyleGAN learned the features to produce objects that are not included in the FFHQ dataset?

When I use styleganex_inversion.pt instead, the encoder is only able to create images that are similar to StyleGAN images (faces with blurry backgrounds). As soon as the clothing gets more complicated, the model does not seem to be able to encode it. Am I doing something wrong, or why is the encoding of edit_age and blond_hair so much better? Sadly, adding attribute vectors to the encoding provided by edit_age does not work that well. Is there a way to get the best of both worlds: encoding images with the quality of styleganex_edit_age (without the need for latent optimization) while still being able to add w+ vectors as with the inversion model? Here is an example of the inversion output:

![Screenshot from 2024-10-16 10-17-55](https://github.com/user-attachments/assets/73974adc-8920-4836-9f48-871c448a67fe)

(Two further screenshots were attached: "Screenshot from 2024-10-16 10-18-26" and "Screenshot from 2024-10-16 10-18-07".)

williamyang1991 commented 8 hours ago

Yes, you are right.

If you want to get the best of both worlds, you need to train your own styleganex_edit-X with your w+-Vectors.
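
For readers following the thread, "adding w+ vectors" refers to shifting an inverted latent code along an attribute direction before decoding. The snippet below is only a minimal sketch of that arithmetic, assuming an 18x512 W+ code (the usual layout for a 1024x1024 StyleGAN2) and a direction of shape (512,) or (18, 512). The function name `apply_w_plus_edit` and its parameters are hypothetical and are not part of the StyleGANEX code base.

```python
import numpy as np

def apply_w_plus_edit(w_plus, direction, strength=1.0, layers=None):
    """Shift a W+ code along an attribute direction (illustrative only).

    w_plus:    array of shape (num_layers, dim), e.g. (18, 512)
    direction: array of shape (dim,) or (num_layers, dim)
    strength:  scalar multiplier for the edit
    layers:    iterable of layer indices to edit; None edits every layer
    """
    w_plus = np.asarray(w_plus, dtype=np.float32)
    direction = np.asarray(direction, dtype=np.float32)

    # Broadcast a single (dim,) direction to every layer if needed.
    offset = np.broadcast_to(direction, w_plus.shape)

    # Work on a copy so the caller's latent code is left untouched.
    edited = w_plus.copy()
    idx = list(range(w_plus.shape[0])) if layers is None else list(layers)
    edited[idx] += strength * offset[idx]
    return edited

# Usage with placeholder data (random values, not real latents):
rng = np.random.default_rng(0)
w_plus = rng.standard_normal((18, 512)).astype(np.float32)
blond_direction = rng.standard_normal(512).astype(np.float32)
edited = apply_w_plus_edit(w_plus, blond_direction, strength=2.0, layers=range(8))
```

Restricting the edit to a subset of layers is a common way to limit an attribute change to coarse or fine features. Whether the result decodes well depends on how the encoder that produced the code was trained, which is the point of the reply above.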