openai / consistency_models

Official repo for consistency models.
MIT License

RuntimeError in the sample `diffusers` code in README.md #44

take2rohit closed this issue 10 months ago

take2rohit commented 11 months ago

Running the sample `diffusers` code from README.md (reproduced below) raises "RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.IntTensor instead (while checking arguments for embedding)":

import torch
from diffusers import ConsistencyModelPipeline

device = "cuda"
# Load the cd_imagenet64_l2 checkpoint.
model_id_or_path = "openai/diffusers-cd_imagenet64_l2"
pipe = ConsistencyModelPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
pipe.to(device)
# Onestep sampling, class-conditional image generation
# ImageNet-64 class label 145 corresponds to king penguins
image = pipe(num_inference_steps=1, class_labels=145).images[0]
image.save("cd_imagenet64_l2_onestep_sample_penguin.png")

# Multistep sampling, class-conditional image generation
# Timesteps can be explicitly specified; the particular timesteps below are from the original Github repo:
# https://github.com/openai/consistency_models/blob/main/scripts/launch.sh#L77
image = pipe(num_inference_steps=None, timesteps=[22, 0], class_labels=145).images[0]
image.save("cd_imagenet64_l2_multistep_sample_penguin.png")

This is corrected in pull request #43.
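
The error message says that an embedding lookup received 32-bit integer indices where 64-bit (Long) indices were expected. The snippet below is a minimal sketch of that dtype issue in plain PyTorch, not the change made in the pull request. The embedding size and variable names are made up for illustration, and newer PyTorch versions may accept int32 indices without raising.

```python
import torch

# Hypothetical stand-in for the model's class-embedding table
# (1000 ImageNet classes, arbitrary embedding width of 8).
embedding = torch.nn.Embedding(num_embeddings=1000, embedding_dim=8)

# Class label stored as a 32-bit integer tensor; this is the dtype the
# error message complains about (torch.IntTensor).
labels_int32 = torch.tensor([145], dtype=torch.int32)

# Casting to torch.long (int64) gives the dtype the embedding lookup expects.
labels_long = labels_int32.to(torch.long)

class_embedding = embedding(labels_long)
print(labels_long.dtype, tuple(class_embedding.shape))
```

If the installed `ConsistencyModelPipeline` accepts a tensor for `class_labels` (an assumption to check against your diffusers version), passing `torch.tensor([145], dtype=torch.long)` instead of the bare integer may serve as a workaround until the fix is available.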

RossoneriZhao commented 11 months ago

When I run the provided code, I do not get a RuntimeError. Instead, I get an image whose content I cannot even recognize. I also get a warning. Here is the log: "Some weights of the model checkpoint were not used when initializing UNet2DModel: ['mid_block.attentions.0.group_norm.weight, mid_block.attentions.0.group_norm.bias']"

RossoneriZhao commented 11 months ago

I have solved this problem. The diffusers source code is missing a group norm layer in its mid_block initialization, so the checkpoint's group norm weights are never loaded, which leads to the bad result.
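
A warning of this kind means that some parameter names in the checkpoint have no counterpart in the constructed model. As a rough sketch of how to spot such a mismatch, the helper below compares two collections of parameter names. The function `unused_checkpoint_keys` and the example key lists are hypothetical and written for illustration; they are not part of diffusers.

```python
def unused_checkpoint_keys(checkpoint_keys, model_keys):
    """Return checkpoint parameter names that the model does not define."""
    model_key_set = set(model_keys)
    return sorted(key for key in checkpoint_keys if key not in model_key_set)


# Illustrative key lists: the checkpoint carries group norm weights for the
# mid-block attention, but the (hypothetical) model definition lacks that layer.
checkpoint_keys = [
    "mid_block.attentions.0.group_norm.weight",
    "mid_block.attentions.0.group_norm.bias",
    "mid_block.attentions.0.to_q.weight",
]
model_keys = [
    "mid_block.attentions.0.to_q.weight",
]

missing_in_model = unused_checkpoint_keys(checkpoint_keys, model_keys)
print(missing_in_model)
```

With real models, the two lists would typically come from the keys of the loaded checkpoint and of the model's `state_dict()`. A non-empty result points at layers the model never constructs.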