huggingface / open-muse

Open reproduction of MUSE for fast text2image generation.
https://huggingface.co/openMUSE
Apache License 2.0

Bug in the readme #121

Open Msadat97 opened 1 year ago

Msadat97 commented 1 year ago

Hello,

Thanks for providing this repo.

I wanted to mention that there is a bug in the readme file for the VQGAN example. A working version would be this:


import torch
from torchvision import transforms
from PIL import Image
from muse import MaskGitVQGAN
import numpy as np # <--- added

torch.set_grad_enabled(False) # <--- added

# Load the pre-trained vq model from the hub
vq_model = MaskGitVQGAN.from_pretrained("openMUSE/maskgit-vqgan-imagenet-f16-256")

# Encode and decode images using the VQ model
encode_transform = transforms.Compose( # <--- fixed
    [
        transforms.Resize(256, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(256),
        transforms.ToTensor(),
    ]
)
image = Image.open("/content/ILSVRC2012_val_00000028.JPEG")
pixel_values = encode_transform(image).unsqueeze(0)
image_tokens, _ = vq_model.encode(pixel_values)
rec_image = vq_model.decode(image_tokens)

# Convert to PIL images
rec_image = 2.0 * rec_image - 1.0
rec_image = torch.clamp(rec_image, -1.0, 1.0)
rec_image = (rec_image + 1.0) / 2.0
rec_image *= 255.0
rec_image = rec_image.permute(0, 2, 3, 1).cpu().numpy().astype(np.uint8)
pil_images = [Image.fromarray(image) for image in rec_image]
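
As a side note on the "Convert to PIL images" step: the rescale to [-1, 1], clamp, and rescale back should amount (up to floating-point rounding) to clamping to [0, 1] before scaling to 0..255. Below is a minimal pure-Python sketch of that arithmetic on single pixel values. The helper names `to_uint8` and `to_uint8_short` are hypothetical and not part of `muse`; `int()` stands in for the truncation that `astype(np.uint8)` performs on non-negative floats.

```python
def to_uint8(value: float) -> int:
    """Mirror the post-processing steps above for one pixel value."""
    x = 2.0 * value - 1.0        # map [0, 1] to [-1, 1]
    x = max(-1.0, min(1.0, x))   # clamp to [-1, 1]
    x = (x + 1.0) / 2.0          # map back to [0, 1]
    return int(x * 255.0)        # truncate, as astype(np.uint8) does for values >= 0


def to_uint8_short(value: float) -> int:
    """Assumed-equivalent shortcut: clamp to [0, 1], then scale."""
    return int(max(0.0, min(1.0, value)) * 255.0)


if __name__ == "__main__":
    # Values chosen to be exactly representable, plus two out-of-range ones.
    for v in (-0.5, 0.0, 0.25, 0.5, 1.0, 1.5):
        print(v, to_uint8(v), to_uint8_short(v))
```

Because the final step truncates rather than rounds, a value of 0.5 maps to 127, not 128. If rounding is preferred, adding 0.5 before truncation would be one option.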

isamu-isozaki commented 1 year ago

@Msadat97 Sorry for the mistake. Can you try out this colab? I recently made it for testing purposes for LAION.

Msadat97 commented 1 year ago

Yes, the colab works fine.