rosinality / stylegan2-pytorch

Implementation of Analyzing and Improving the Image Quality of StyleGAN (StyleGAN 2) in PyTorch
MIT License

Generating from 18x512 latents #41

Open jjennings955 opened 4 years ago

jjennings955 commented 4 years ago

First, thanks for making this library.

I have some 18x512 latents that I optimized using the official version. What parameters should I use to generate the same image from those latents as the official version? I'm using the same weights but getting different results.

rosinality commented 4 years ago

If you pass latent variables instead of noise inputs, set input_is_latent=True in the forward call of the generator.
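A minimal sketch of that suggestion. The 18x512 shape matches the official 1024px FFHQ model; prepare_wplus is a hypothetical helper, and the actual g_ema call is shown commented out since it needs a real checkpoint:

```python
import torch

def prepare_wplus(latent):
    """Add a batch dimension to a projected W+ latent if it lacks one.

    A single projected image gives an (n_latent, 512) tensor; the
    generator's forward expects a batched (batch, n_latent, 512) tensor.
    """
    if latent.dim() == 2:
        latent = latent.unsqueeze(0)
    return latent

w_plus = torch.randn(18, 512)    # stand-in for an 18x512 latent from the official projector
batched = prepare_wplus(w_plus)  # shape (1, 18, 512)

# With the real generator loaded, the call would look like:
# img, _ = g_ema([batched], input_is_latent=True)
```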

ElHacker commented 4 years ago

I'm having the same issue. I'm trying to use a couple of external images and do style mixing with interpolation between the external images.

The process I'm following is:

  1. Run this command for both external images: python projector.py --ckpt ../ffhq-dataset/550000.pt --step=1000 --w_plus --size 256 style_mix/external1.jpg
  2. This generates the correct latent space and noise values for both external images
  3. I load the .pt file into memory in the generator.py
  4. Here's my modified code in generator.py:
    data_1 = torch.load('external1.pt')
    data_2 = torch.load('external2.pt')
    latent_1 = data_1['style_mix/external1.jpg']['latent']
    latent_2 = data_2['style_mix/external2.jpg']['latent']
    sample, _ = g_ema([latent_1, latent_2], inject_index=i, input_is_latent=True)
  5. Run the generator with python generate.py --size 256 --pics 1 --ckpt ../ffhq-dataset/550000.pt

latent_1 and latent_2 have shape 14x512, which makes the generator create a strip of 14 images instead of the single original image I sent to the projector.

Is there any way to have the generator use the original external images I fed the projector instead of it thinking that there are 14 images?

Here are the images I'm feeding in and the sample output I get from the generator.

(attached images: the external image after the projector is run, and the generated output)

rosinality commented 4 years ago

@ElHacker You can concat & unsqueeze two latent codes like this: torch.cat((latent_1[:index], latent_2[index:]), 0).unsqueeze(0)
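A self-contained sketch of that suggestion with dummy tensors (mix_latents is a hypothetical helper; the 14x512 shape matches a 256px model, and the g_ema call is commented out since it needs the real checkpoint):

```python
import torch

def mix_latents(latent_1, latent_2, index):
    """Splice two (n_latent, 512) W+ codes at `index` and add a batch dim.

    The first `index` style layers come from latent_1, the rest from
    latent_2, giving a single (1, n_latent, 512) mixed latent.
    """
    return torch.cat((latent_1[:index], latent_2[index:]), 0).unsqueeze(0)

l1 = torch.zeros(14, 512)        # 14 style layers at 256px resolution
l2 = torch.ones(14, 512)
mixed = mix_latents(l1, l2, 7)   # first 7 layers from l1, remaining 7 from l2

# sample, _ = g_ema([mixed], input_is_latent=True)
```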

ElHacker commented 4 years ago

Thanks @rosinality, that worked.

Leaving my code here in case someone else runs into this:

latent_1 = data_1['style_mix/image1.png']['latent'].reshape(1, batch_size, 512)
latent_2 = data_2['style_mix/image2.png']['latent'].reshape(1, batch_size, 512)
latent = torch.cat((latent_1[:, :i], latent_2[:, i:]), 1)
sample, _ = g_ema([latent], inject_index=i, input_is_latent=True)

A slight adjustment I had to make to the tensors was to add an extra dimension at the beginning: for example, if the latent shape is 14x512, I had to reshape it to 1x14x512 so that the model could generate the correct image. No unsqueeze was needed in my case.

batch_size is the number of rows of the latent vector. For example, if the latent vector shape is 14x512, then batch_size will be 14.
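For what it's worth, this reshape-then-slice version produces the same tensor as the concat-and-unsqueeze one-liner suggested above; a quick check with dummy tensors (variable names are illustrative):

```python
import torch

n_layers = 14                 # rows in the projected latent (14 for a 256px model)
l1 = torch.randn(n_layers, 512)
l2 = torch.randn(n_layers, 512)
i = 5                         # inject index: layers [0, i) from l1, [i, 14) from l2

# Variant 1: reshape to (1, 14, 512) first, then slice along dim 1
a = torch.cat((l1.reshape(1, n_layers, 512)[:, :i],
               l2.reshape(1, n_layers, 512)[:, i:]), 1)

# Variant 2: slice along dim 0, then unsqueeze the batch dim
b = torch.cat((l1[:i], l2[i:]), 0).unsqueeze(0)

assert torch.equal(a, b)      # identical (1, 14, 512) mixed latents
```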