Closed: PangziZhang523 closed this issue 2 years ago
Not sure if relevant, but you can find the following in the original StyleGAN paper:
A Style-Based Generator Architecture for Generative Adversarial Networks Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Timo Aila (NVIDIA) https://arxiv.org/abs/1812.04948 https://github.com/NVlabs/stylegan
Thank you for the answer. My understanding is that the 'style' in StyleGAN refers to the style parameters in AdaIN, which are not the same as the latent space in StyleSpace.
Thanks for the answer; I think I now understand the relationship between s and w. Then https://github.com/betterze/StyleSpace/blob/main/manipulate.py#L229 can also transfer w to s, right? Because they are all affine transformations, but w is a 1×512 vector while w+ is 18 copies of w. Another question: how does inversion obtain s in the comparison in Figure 18? If s is obtained from w, then the results for w and s should be the same, because there is no editing.
The function can transfer w+ to s. To get w+ from w, just repeat the w vector 18 times, so the shape goes from (1, 512) to (1, 18, 512). In this way, we can transfer w -> w+ -> s.
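A minimal sketch of that pipeline in NumPy. The random per-layer affine maps `affines` are hypothetical stand-ins for StyleGAN2's learned "A" blocks (in the real network each layer has its own learned weights and channel count), so only the shapes are meaningful here:

```python
import numpy as np

rng = np.random.default_rng(0)

w = rng.normal(size=(1, 512))                   # a single w code
w_plus = np.repeat(w[:, None, :], 18, axis=1)   # (1, 18, 512): same w copied to all 18 layers

# Hypothetical per-layer affine maps standing in for the learned
# affine ("A") blocks of StyleGAN2; real channel counts vary per layer.
affines = [(rng.normal(size=(512, 512)), rng.normal(size=512)) for _ in range(18)]

# s space: apply each layer's affine transformation to that layer's slice of w+
s = [w_plus[:, i, :] @ A + b for i, (A, b) in enumerate(affines)]

print(w_plus.shape)        # (1, 18, 512)
print(len(s), s[0].shape)  # 18 per-layer style vectors
```

Since w+ here is just w repeated, every layer slice is identical; editing individual layers (or individual s channels) is what makes w+ and s strictly more expressive than w.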
In Figure 18, we compare inversion in different spaces. Given the same input image, we invert it to the w, w+, or s space separately. The information flows from the image to the latent space (image->s->w+), rather than from the latent space to the image (w+->s->image). Since there are affine transformations from w+ to s, inversion to w+ is more constrained than inversion to s, so the s reconstruction is slightly better than the w+ reconstruction.
Thank you for the answer. How did you obtain s directly for the inversion in Figure 18? Something is still unclear to me here; could you explain it again? That is, image->s.
In Figure 18, the latent codes are obtained through latent optimization. The standard latent optimization code is from the StyleGAN2 paper (Section 5). It optimizes in the W space; we modified this code to optimize in the S or W+ space instead.
If you want to do latent optimization in the S space, we suggest using the PyTorch implementation, which is easier to play with.
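To illustrate the idea of latent optimization, here is a toy sketch using plain NumPy gradient descent. The fixed linear map `G` is a stand-in for the (frozen) generator and the loss is plain L2; the real setup optimizes through StyleGAN2 with a perceptual loss and an optimizer like Adam, so this only shows the structure of the loop:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "generator": a fixed linear map from a 512-dim latent to 64 "pixels".
G = rng.normal(size=(512, 64))
x_target = rng.normal(size=64)     # the "image" we want to reconstruct

s = np.zeros(512)                  # latent code being optimized
lr = 1e-4
for _ in range(2000):
    residual = s @ G - x_target    # reconstruction error
    grad = 2.0 * G @ residual      # gradient of ||s @ G - x_target||^2 w.r.t. s
    s -= lr * grad                 # gradient descent step

print(np.linalg.norm(s @ G - x_target))  # reconstruction error after optimization
```

Optimizing in S versus W+ versus W only changes which variable the gradient flows into; the generator weights stay frozen throughout.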
Thanks a lot for the answers. There is also a problem when looking for the channel of a specific attribute. I used only a male classifier to find the attribute channel, and the result differs from your gender channel (9, 6). I used 50 positive images, and the (layer, channel) I found was (8, 223). But this is not the channel that determines gender.
In the paper, we claim that using just 20-30 example images, we achieve a top-5 accuracy higher than 90%. Note that this is top-5 accuracy, rather than top-1 accuracy.
For the case you show above, the target channel 9_6 is at rank 5. So for top-5 accuracy it is a success; for top-1 accuracy it is a failure.
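The top-k check itself is simple. A sketch (the ranked list below is made up for illustration, with the paper's gender channel (9, 6) placed at rank 5 to match the case above):

```python
def topk_success(ranked_channels, target, k=5):
    """True if the target (layer, channel) appears among the first k ranked channels."""
    return target in ranked_channels[:k]

# Hypothetical ranking produced by a classifier-based channel search;
# (9, 6) is the gender channel reported in the paper.
ranked = [(8, 223), (11, 17), (6, 501), (14, 30), (9, 6)]

print(topk_success(ranked, (9, 6), k=5))  # True: rank 5 counts as a top-5 success
print(topk_success(ranked, (9, 6), k=1))  # False: (8, 223) is ranked first
```

With only 50 positive examples, the ranking is noisy, so checking the top few candidates (rather than just the first) is the intended usage.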
Thanks for the interesting work. I would like to ask how you obtained the S space. The paper says it is obtained via the In-domain GAN method, but when I read the In-domain GAN paper, it produces the W space. Is the S space derived from the W space, or is it a separate space?