Puzer / stylegan-encoder

StyleGAN Encoder - converts real images to latent space

Can we modify the code to generate a latent representation to create an array of different shape? #22

Open Uha661 opened 4 years ago

Uha661 commented 4 years ago

I'm trying to generate a latent representation with a different dimension. I've changed the shape in generate_model.py, but it still throws the error "ValueError: Dimension 1 in both shapes must be equal, but are 1 and 18. Shapes are [1,1,512] and [?,18,512]". Can anyone suggest where else I should change the shape?

Please let me know if any further details are required.

mr555ru commented 4 years ago

I have accomplished it in my fork: see commit diff. However, this change makes VGG16 fail to converge.

mr555ru commented 4 years ago

So, there is a quasi-solution that I found for this issue. The default network (Gs) generates an image from a [1,512] latent vector. It has two sub-networks: Gs.components.mapping and Gs.components.synthesis. The latter generates an image from an [18,512] array, and that array is what we learn with our VGG16. The mapping sub-network can translate [1,512] latents into [18,512] dlatents, but (seemingly) not vice versa.

So the quasi-solution would be to translate your [1,512] latents into [18,512] space using Gs.components.mapping.run(latents, None), work in that space, and output images with Gs.components.synthesis.run(dlatents). This obviously doesn't solve the issue itself, but it might answer some of the questions that lead to it: e.g. we can mix a [1,512] image with an [18,512] face found by VGG16, if that's what we wanted to do initially.
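To make the shapes concrete, here is a minimal NumPy sketch of that mixing idea. It does not load StyleGAN: `fake_mapping` is a hypothetical stand-in for Gs.components.mapping.run(latents, None) that reproduces only the [N,512] to [N,18,512] shape change by copying the vector across 18 layers (my assumption about how the real output is laid out), and `mix_dlatents` and the layer split at index 8 are likewise made up for illustration.

```python
import numpy as np

NUM_LAYERS = 18   # second dimension of the dlatents, as in the error message
LATENT_DIM = 512


def fake_mapping(latents):
    """Hypothetical stand-in for Gs.components.mapping.run(latents, None).

    The real sub-network applies a learned transformation; this placeholder
    only reproduces the shape change [N, 512] -> [N, 18, 512] by repeating
    each vector once per layer.
    """
    latents = np.asarray(latents, dtype=np.float32)
    return np.tile(latents[:, np.newaxis, :], (1, NUM_LAYERS, 1))


def mix_dlatents(base, other, layers_from_other):
    """Return a copy of `base` with the listed layers taken from `other`."""
    mixed = base.copy()
    mixed[:, layers_from_other, :] = other[:, layers_from_other, :]
    return mixed


rng = np.random.RandomState(0)

# A [1, 512] latent, as accepted by the full Gs network.
latents = rng.randn(1, LATENT_DIM)
dlatents_from_z = fake_mapping(latents)

# Random placeholder for an [18, 512] dlatent found by the VGG16 encoder.
encoded_dlatents = rng.randn(1, NUM_LAYERS, LATENT_DIM).astype(np.float32)

# Keep layers 0-7 from the encoded face, take layers 8-17 from the mapped latent.
mixed = mix_dlatents(encoded_dlatents, dlatents_from_z, list(range(8, NUM_LAYERS)))

print(dlatents_from_z.shape, mixed.shape)
```

With the real model, `mixed` is the kind of array you would pass to Gs.components.synthesis.run(dlatents); which layers to take from which source is a matter of experimentation.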