denabazazian opened this issue 3 years ago
Hi. You can try the optimization-based method proposed in the original StyleGAN2 paper; an unofficial implementation is here: https://github.com/rosinality/stylegan2-pytorch/blob/master/projector.py
@bryandlee Many thanks for your reply. I have tried to use the projector code from StyleGAN2, but the `latent_in` from that code is aligned with the generated projection of the input image. Does that mean I should modify lines #170 and #173 to get `latent_in` directly from the input image, regardless of `sample_noise` and `latent_mean`? Or am I missing something?
Hi, I don't quite get what you mean by "getting the `latent_in` directly from the input image regardless of `sample_noise` and `latent_mean`". The code finds the latent vectors and noises that can be fed into the generator to produce the closest projection of a given input image.
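To make the projection idea concrete, here is a minimal sketch of that optimization loop. It uses a tiny linear network as a stand-in for the frozen pretrained generator (the real code would use the StyleGAN2 generator and a perceptual loss instead of plain MSE); `latent_in` is the optimized variable, as in `projector.py`.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a frozen pretrained generator (latent -> "image").
g = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.Tanh())
for p in g.parameters():
    p.requires_grad_(False)

target = torch.randn(1, 16)  # stand-in for the input image to project
latent_in = torch.randn(1, 8, requires_grad=True)  # the optimized latent
opt = torch.optim.Adam([latent_in], lr=0.05)

first_loss = None
for step in range(200):
    opt.zero_grad()
    # Reconstruction loss between the generated projection and the target.
    loss = torch.nn.functional.mse_loss(g(latent_in), target)
    if first_loss is None:
        first_loss = loss.item()
    loss.backward()
    opt.step()

final_loss = torch.nn.functional.mse_loss(g(latent_in), target).item()
```

After the loop, `g(latent_in)` is the generator's closest reproduction of the target, which is exactly why the projection is only as faithful as the generator's latent space allows.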
Yes, the projector code generates the closest projection of a given input image, but the problem is that in most cases the viewpoint and some features of the input image are changed, so the semantic segmentation result does not correspond to the input image. In the Supplementary Material of the paper, it is written that the input image is fed into a Pix2Pix encoder to construct a pixel-wise representation. I am just wondering if there is any further implementation or explanation regarding that. Thanks.
I see. The "auto-shot segmentation" part of the paper is not implemented, but you can sample image-label pairs from the few-shot model and use them to train any semantic segmentation model. I'll let you know if I have a chance to do it.
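The suggestion above (sample image-label pairs from the few-shot model, then train any segmentation network on them) can be sketched as follows. `sample_pair` is a hypothetical stand-in: in practice it would run the GAN plus the few-shot label branch; here it returns a toy image with a trivially derived label map, and a 1x1-conv classifier stands in for the downstream segmentation model.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for sampling from the few-shot model: in practice
# this would return a generated image and its predicted label map.
def sample_pair():
    img = torch.rand(1, 3, 16, 16)
    label = (img[:, :1] > 0.5).long().squeeze(1)  # toy 2-class label map
    return img, label

# Any off-the-shelf segmentation model works; a 1x1-conv per-pixel
# classifier keeps this sketch self-contained.
seg = torch.nn.Conv2d(3, 2, kernel_size=1)
opt = torch.optim.Adam(seg.parameters(), lr=0.1)

for _ in range(200):
    img, label = sample_pair()
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(seg(img), label)
    loss.backward()
    opt.step()

# Evaluate on a fresh sampled pair.
img, label = sample_pair()
acc = (seg(img).argmax(1) == label).float().mean().item()
```

Because the trained segmentation network takes a raw image as input, it sidesteps the projection problem entirely: real test images never need to be inverted into the GAN's latent space.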
I am wondering how I can evaluate the model on a real image instead of an image generated by StyleGAN. The input image usually needs to be embedded into the latent space of the GAN by a latent optimizer, so that the generator reproduces the input image and a representation can be extracted from it. However, I cannot find this latent optimizer in the code. Did you feed an input image into Pix2Pix's encoder and use activation maps from all convolutional layers of the generator (decoder) to construct a pixel-wise representation? Would it be possible to release the code for testing input images?
Thanks for your great work!