Why can the latent code from GAN inversion methods be manipulated by the boundary?

genforce / interfacegan

[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

MIT License


Hi, thanks for sharing this great work!

I'm trying to edit a new face. In https://github.com/genforce/interfacegan/issues/30, it is suggested to first use https://github.com/Puzer/stylegan-encoder to obtain the latent code of the new face in W+ space. However, the shape of that latent code is (18, 512), and the 18 layers have different values.

What confuses me is:

The shape of "stylegan_ffhq_age_w_boundary.npy" is (1, 512), so if a (1, 512) boundary is used to edit an (18, 512) latent code, every layer is shifted by the same vector. But the 18 layers of the (18, 512) latent code do not carry the same meaning, since their values differ.

Why can we use a (1, 512) boundary to edit an (18, 512) latent code, and why does it still work?
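To make the question concrete, here is a minimal sketch of the edit as I understand it. It assumes NumPy and uses random stand-in arrays instead of the real encoder output and boundary file; `alpha` is a hypothetical edit strength, and this may not match exactly what the repository does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins (assumptions, not the real data): w_plus would come from the
# encoder, boundary from stylegan_ffhq_age_w_boundary.npy.
w_plus = rng.standard_normal((18, 512))
boundary = rng.standard_normal((1, 512))
boundary /= np.linalg.norm(boundary)  # normalize to a unit-length direction

alpha = 3.0  # hypothetical edit strength

# NumPy broadcasting repeats the single (1, 512) row across all 18 layers.
edited = w_plus + alpha * boundary

# The per-layer shift: every row is the same vector alpha * boundary.
shift = edited - w_plus
print(edited.shape)
print(np.allclose(shift, shift[0]))
```

So the layers keep their original differences from one another, and each one is moved by the identical offset.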

If the 18 layers of the (18, 512) latent code have different values, wouldn't it be more reasonable to train an (18, 512) boundary, which would also have different values in each of its 18 layers?
In your paper, you also run experiments on real images. Which latent space did you obtain from your StyleGAN encoder: Z, W, or W+? If the shape of your latent code is (18, 512), do the 18 layers have different values?
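For the per-layer boundary question above, this is roughly the alternative I have in mind. It is only a sketch under my own assumptions: the boundary here is a random unit vector over the flattened 18 * 512 space standing in for a trained one, and `boundary_wp` and `alpha` are hypothetical names, not anything from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an encoder output in W+ space (assumption, not real data).
w_plus = rng.standard_normal((18, 512))

# Hypothetical per-layer boundary: a single unit normal over the flattened
# 18 * 512 space, reshaped so each layer gets its own 512-d direction.
boundary_wp = rng.standard_normal(18 * 512)
boundary_wp /= np.linalg.norm(boundary_wp)
boundary_wp = boundary_wp.reshape(18, 512)

alpha = 3.0  # hypothetical edit strength
edited_wp = w_plus + alpha * boundary_wp

# Unlike the (1, 512) case, each layer is now shifted by a different vector.
layer_shift = edited_wp - w_plus
print(edited_wp.shape)
```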

Thank you!
