microsoft / VQ-Diffusion

Official implementation of VQ-Diffusion
MIT License

Hugging Face API example in the README needs to be updated. #30

Open zzzc18 opened 1 year ago

zzzc18 commented 1 year ago

Running the old example from the README gives:

ValueError: Pipeline <class 'diffusers.pipelines.vq_diffusion.pipeline_vq_diffusion.VQDiffusionPipeline'> expected {'vqvae', 'transformer', 'scheduler', 'learned_classifier_free_sampling_embeddings', 'tokenizer', 'text_encoder'}, but only {'vqvae', 'tokenizer', 'transformer', 'text_encoder', 'scheduler'} were passed.

Calling it like this works fine:

import torch

# Old README call, which raises the ValueError shown above:
# from diffusers import VQDiffusionPipeline
# pipeline = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq", torch_dtype=torch.float16, revision="fp16")

# Loading through the generic DiffusionPipeline with default arguments works:
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq")

pipeline = pipeline.to("cuda")

image = pipeline("teddy bear playing in the pool").images[0]

# save image
image.save("./teddy_bear.png")

The problem also seems to come from specifying torch_dtype=torch.float16, revision="fp16".
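
To see which component the ValueError above is complaining about, the two sets in the message can be compared directly. This is only a small sketch: the set contents are copied from the error text in this issue, and `missing_components` is a hypothetical helper name, not part of diffusers.

```python
# Component names copied verbatim from the ValueError message in this issue.
expected = {
    "vqvae",
    "transformer",
    "scheduler",
    "learned_classifier_free_sampling_embeddings",
    "tokenizer",
    "text_encoder",
}
passed = {"vqvae", "tokenizer", "transformer", "text_encoder", "scheduler"}


def missing_components(expected, passed):
    """Hypothetical helper: components the pipeline expected but did not receive."""
    return sorted(expected - passed)


missing = missing_components(expected, passed)
print(missing)  # ['learned_classifier_free_sampling_embeddings']
```

The only missing entry is learned_classifier_free_sampling_embeddings. One plausible reading, not confirmed in this thread, is that the files under revision="fp16" do not include that component while the default revision does.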

jS5t3r commented 1 year ago

I agree.

It is better to follow the usage shown on the model page: https://huggingface.co/microsoft/vq-diffusion-ithq

Try this (the same call without revision="fp16"): pipeline = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq", torch_dtype=torch.float16)