youngjae-git closed this issue 3 years ago
Hi Youngjae, I'm not fully sure I understand your question. The anycost generator can generate images at resolutions 128/256/512/1024, so you can use its intermediate output for resolution 256.
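For example, a minimal sketch of grabbing the 256x256 intermediate output (the `get_pretrained` helper, the `target_res` attribute, and the `(image, styles)` return value follow this repo's README; the latent shape is my assumption, so please check `Generator.forward` in `models/anycost_gan.py`):

```python
import torch
import models  # the models package at the root of this repo

# Load the released FFHQ generator (full resolution 1024x1024).
g = models.get_pretrained('generator', 'anycost-ffhq-config-f')
g.eval()

# Ask the anycost generator to stop at the 256x256 output branch.
g.target_res = 256

with torch.no_grad():
    z = torch.randn(1, 1, 512)  # assumed latent shape; check the repo's demo
    image, _ = g(z)             # image: (1, 3, 256, 256), values in [-1, 1]
```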
To project an image into the latent space, we downsample the target image to resolution 256 when computing LPIPS, following common practice. You can still project a higher-resolution image (e.g., 1024x1024).
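The downsampling step during projection looks roughly like this (a sketch only; I use the `lpips` pip package here as a stand-in for the repo's own LPIPS code):

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips; stand-in for the repo's LPIPS implementation

loss_fn = lpips.LPIPS(net='vgg')

def lpips_at_256(generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Resize both images to 256x256 before computing LPIPS.

    Both tensors are (N, 3, H, W) in [-1, 1]; the target may be a
    higher-resolution image such as 1024x1024.
    """
    generated = F.interpolate(generated, size=(256, 256), mode='bilinear', align_corners=False)
    target = F.interpolate(target, size=(256, 256), mode='bilinear', align_corners=False)
    return loss_fn(generated, target)
```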
Hi @tonylins, thank you for your fast reply.
(Left: the input FFHQ image; right: the image generated from the encoder's latent.)

What I am trying to do is check the "projection". However, after resizing the 1024x1024 FFHQ image to 256x256 and projecting it, a completely different person came out. Please let me know if I missed anything. The code below is what I used.
```python
from PIL import Image
import matplotlib.pyplot as plt
import torch
from torchvision import transforms

from models.anycost_gan import Generator
from models.encoder import ResNet50Encoder

URL_TEMPLATE = 'https://hanlab.mit.edu/projects/anycost-gan/files/{}_{}.pt'

# Generator (decoder)
g_url = URL_TEMPLATE.format('generator', 'anycost-ffhq-config-f')
resolution = 1024
channel_multiplier = 2

g_model = Generator(resolution, channel_multiplier=channel_multiplier)
sd = torch.hub.load_state_dict_from_url(g_url)
g_model.load_state_dict(sd['g_ema'])
g_model.eval()

# Encoder
e_url = URL_TEMPLATE.format('encoder', 'anycost-ffhq-config-f')
n_style = 18
style_dim = 512

e_model = ResNet50Encoder(n_style=n_style, style_dim=style_dim)
sd = torch.hub.load_state_dict_from_url(e_url)
e_model.load_state_dict(sd['state_dict'])
e_model.eval()

# Resize the 1024x1024 FFHQ image to 256x256 and encode it
img_1024 = Image.open('/nas/data/ffhq/images1024x1024/00000.png')
img_256 = img_1024.resize((256, 256))
trans_img = transforms.ToTensor()(img_256).unsqueeze(0)  # (1, 3, 256, 256)
latent = e_model(trans_img)  # (1, 18, 512)

# get_4x4_grid is my own helper that decodes the latent and tiles the outputs
img_out_np = get_4x4_grid(g_model, latent)
plt.figure(figsize=(8, 8))
plt.imshow(img_out_np)
plt.axis('off')
```
Hi Youngjae, to use the encoder, you need to normalize the image into the range [-1, 1] using `transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])`; in your code the image range is currently [0, 1].
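For example, the preprocessing in your code becomes:

```python
from PIL import Image
from torchvision import transforms

# Map [0, 1] tensors to [-1, 1], matching the encoder's expected input range.
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])

img_1024 = Image.open('/nas/data/ffhq/images1024x1024/00000.png')
trans_img = transform(img_1024).unsqueeze(0)  # (1, 3, 256, 256)
latent = e_model(trans_img)
```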
Hi Youngjae, I will close the issue due to inactivity. Feel free to reopen if the problem is not solved.
Hi @tonylins, thank you for your good paper.
In this GitHub repo, only 256x256 images can be encoded, but it seems that only the 1024x1024 and 512x512 decoder (generator) weights have been uploaded.

What I want to test is to encode and then decode a 256x256 image and check whether the output matches the original image.

Could you send me the 256x256 anycost-ffhq decoder weights?
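Concretely, the round trip I would like to run is something like this (`g_model_256` stands for the hypothetical 256x256 decoder weights I am asking about, and the forward call is a sketch; the exact kwargs depend on `Generator.forward`):

```python
import torch

with torch.no_grad():
    latent = e_model(trans_img)                 # (1, 18, 512), input in [-1, 1]
    recon, _ = g_model_256(latent)              # hypothetical 256x256 decoder
    mse = torch.mean((recon - trans_img) ** 2)  # how close is the reconstruction?
print(mse.item())
```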