eladrich / pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework
https://eladrich.github.io/pixel2style2pixel/
MIT License

latent image editing #306

Closed federicoromeo closed 1 year ago

federicoromeo commented 1 year ago
  1. I can encode real images (3, 256, 256), obtaining a latent vector of shape (18, 512).
  2. Performing the encoding on many images, such as the labelled CelebA-HQ dataset, I want to find the latent editing directions for some of the annotated attributes. In particular, I'll use an SVM to extract a binary decision boundary. The shape of the decision boundary will correctly be the same as the encoding, (18, 512).
  3. In the end, my goal is to edit a sampled vector by applying edited_vector = sampled_vector + k * latent_direction, and later decode it to retrieve the final edited image.

The issue is that to generate (= decode) an image, I should pass the pSp network a vector of shape (1, 512), but the encoded latent vector has shape (18, 512). How can I decode an image starting from this shape?

PS: I think the problem concerns the various latent spaces of StyleGANs: Z, W, or W+. Is the (1, 512) space W and the (18, 512) space W+?
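For reference, steps 2–3 above can be sketched end to end on toy data. The tiny hinge-loss SVM below is a stand-in for a proper library implementation (e.g. scikit-learn's `LinearSVC`); all names and the synthetic "latents" are illustrative, not the actual pSp pipeline. The key point is that the normal of the decision boundary, reshaped back to (18, 512), is the edit direction.

```python
import numpy as np

def fit_linear_svm(X, y, lr=0.1, lam=1e-3, epochs=200):
    """Minimal linear SVM via subgradient descent on the hinge loss.
    X: (n_samples, n_features); y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1  # samples violating the margin
        if mask.any():
            grad_w = lam * w - (y[mask, None] * X[mask]).mean(axis=0)
            grad_b = -y[mask].mean()
        else:
            grad_w, grad_b = lam * w, 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(0)
# Toy stand-ins for encoded latents: n images, each a flattened (18, 512) code.
n, dim = 200, 18 * 512
X = rng.normal(size=(n, dim))
y = np.where(rng.random(n) < 0.5, 1.0, -1.0)
X += 0.5 * y[:, None]  # pretend the attribute shifts every coordinate slightly

w, b = fit_linear_svm(X, y)
direction = (w / np.linalg.norm(w)).reshape(18, 512)  # unit edit direction in W+

sampled = rng.normal(size=(18, 512))
k = 3.0
edited = sampled + k * direction  # step 3: move along the attribute direction
print(edited.shape)  # (18, 512)
```

On real latents you would of course train on the annotated CelebA-HQ attributes rather than synthetic labels, and tune `k` per attribute by inspecting the decoded images.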

yuval-alaluf commented 1 year ago

You can edit latents in W+ even if the learned direction was in W: the direction will simply be applied to all 18 vectors of W+. You can find some editing code / directions in the ReStyle repo: https://github.com/yuval-alaluf/restyle-encoder#editing. Integrating pSp with that code should be easy since it's built on the same code.
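Concretely, applying a (512,) direction learned in W to an (18, 512) W+ code is just a broadcast addition; NumPy (and PyTorch) replicate the direction across all 18 style vectors. A minimal sketch with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
w_plus = rng.normal(size=(18, 512))    # W+ code, e.g. from the pSp encoder
w_direction = rng.normal(size=(512,))  # edit direction learned in W
w_direction /= np.linalg.norm(w_direction)

k = 5.0
# Broadcasting adds the same W direction to each of the 18 style vectors.
edited = w_plus + k * w_direction
print(edited.shape)  # (18, 512)
```

To decode, add a batch dimension and feed the edited code to the StyleGAN2 generator that pSp wraps, with something like `net.decoder([edited_batch], input_is_latent=True, randomize_noise=False)`; check `models/psp.py` in the repo for the exact call it uses.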