lucidrains / lightweight-gan

Implementation of 'lightweight' GAN, proposed in ICLR 2021, in PyTorch. High-resolution image generation that can be trained within a day or two

Projecting generated Images to Latent Space #138

Open demiahmed opened 2 years ago

demiahmed commented 2 years ago

Is there any way to reverse-engineer generated images back into the latent space?

I am trying to embed fresh RGB images, as well as ones generated by the Generator, into the latent space so I can find their nearest neighbours, pretty much like AI image editing tools.

I plan to convert my RGB images into latent embeddings using my trained model and then tweak the feature vectors.

How can I achieve this with lightweight-gan?

mlerma54 commented 1 year ago

Your question seems to be about GAN inversion, i.e., mapping images to latent codes that the generator maps back to (approximately) those images. There are various ways to accomplish that, see e.g. this survey:

Xia et al. (2022): GAN Inversion: A Survey - https://arxiv.org/pdf/2101.05278.pdf

I have experimented briefly with two techniques:

  1. Running an optimization that modifies the latent vector using a loss function measuring how similar the generated image is to the target image (see the first sketch below).

  2. Training an encoder that learns to predict the latent vector that reproduces each image in a set of target images (see the second sketch below).

The former approach is time-consuming since the optimization must be repeated for each target image. The latter requires only one training run, but the reconstruction quality may be more limited since a single encoder has to work for many images rather than just one target. A hybrid approach trains an encoder that produces a first candidate latent vector and then refines it with further optimization.
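
For concreteness, here is a minimal sketch of the first approach in PyTorch. Everything named here is an assumption to adapt: `G` stands for the trained generator however you load it from your checkpoint, `latent_dim=256` matches this repo's default but should be verified against your model, and `target` is a single image tensor preprocessed to the generator's resolution and value range.

```python
import torch
import torch.nn.functional as F

# Sketch of approach 1: per-image latent optimization (GAN inversion).
# Assumptions: `G` maps latents of shape (N, latent_dim) to images, and
# `target` is a (1, 3, H, W) tensor in the generator's output range.
def invert(G, target, latent_dim=256, steps=1000, lr=5e-2):
    G.eval()
    for p in G.parameters():
        p.requires_grad_(False)          # freeze the generator; only z is optimized
    z = torch.randn(1, latent_dim, device=target.device, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(z), target)  # pixel-wise reconstruction loss
        loss.backward()
        opt.step()
    return z.detach()                    # latent code approximating the target
```

And a minimal sketch of the second approach, training an encoder on synthetic (image, latent) pairs drawn from the frozen generator; the conv stack below is an arbitrary placeholder, not an architecture from this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_encoder(G, latent_dim=256, steps=10000, batch_size=8, device="cpu"):
    # Placeholder encoder: any image-to-vector network of the right sizes works.
    E = nn.Sequential(
        nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(256, latent_dim),
    ).to(device)
    opt = torch.optim.Adam(E.parameters(), lr=1e-4)
    for _ in range(steps):
        z = torch.randn(batch_size, latent_dim, device=device)
        with torch.no_grad():
            imgs = G(z)               # frozen generator supplies (image, latent) pairs
        loss = F.mse_loss(E(imgs), z) # regress the latent back from the image
        opt.zero_grad()
        loss.backward()
        opt.step()
    return E
```

`E(x)` then gives a first-pass latent for a fresh RGB image `x`; the hybrid approach refines it by running the optimization loop above with `z` initialized from `E(target)` instead of random noise.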

There are several options for the reconstruction loss depending on the similarity measure selected. Pixel-wise reconstruction (e.g., mean squared error) is the simplest, but depending on your needs there are other choices, such as structural similarity (SSIM) and learned perceptual image patch similarity (LPIPS).
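
As an illustration, the pixel-wise term in the sketches above can be mixed with a perceptual LPIPS term via the third-party `lpips` package; the backbone choice and weights below are arbitrary placeholders to tune for your data:

```python
import torch.nn.functional as F
import lpips                              # pip install lpips

percep = lpips.LPIPS(net='alex')          # perceptual distance; expects inputs in [-1, 1]

def reconstruction_loss(recon, target, w_pix=1.0, w_lpips=1.0):
    # Weighted mix of a pixel-wise term and a perceptual term.
    pix = F.mse_loss(recon, target)
    perc = percep(recon, target).mean()   # average LPIPS distance over the batch
    return w_pix * pix + w_lpips * perc
```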