segmind / distill-sd

Segmind Distilled diffusion

How to infer using the trained model? #18

Closed by littletomatodonkey 11 months ago

littletomatodonkey commented 11 months ago

Hi, thanks for your great work! I want to test the model trained with distill_train.py. My inference code is as follows.

import torch
from diffusers import DiffusionPipeline
from diffusers import DPMSolverMultistepScheduler
from torch import Generator

path = "sd-laion-art/"
# Insert your prompt below.
prompt = "Faceshot Portrait of pretty young (18-year-old) Caucasian wearing a high neck sweater, (masterpiece, extremely detailed skin, photorealistic, heavy shadow, dramatic and cinematic lighting, key light, fill light), sharp focus, BREAK epicrealism"
# Insert negative prompt below. We recommend using this negative prompt for best results.
negative_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck" 

torch.set_grad_enabled(False)
torch.backends.cudnn.benchmark = True

# The code below runs on GPU. For CPU inference, pass "cpu" as the device everywhere and set the dtype to torch.float32.
with torch.inference_mode():
    gen = Generator("cuda")
    gen.manual_seed(1674753452)
    pipe = DiffusionPipeline.from_pretrained(path, torch_dtype=torch.float16, safety_checker=None, requires_safety_checker=False)
    pipe.to('cuda')
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe.unet.to(device='cuda', dtype=torch.float16, memory_format=torch.channels_last)

    img = pipe(prompt=prompt, negative_prompt=negative_prompt, width=512, height=512, num_inference_steps=25, guidance_scale=7, num_images_per_prompt=1, generator=gen).images[0]
    img.save("image.png")

However, the following error occurs:

ValueError: Cannot load <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'> from sd-laion-art/unet because the following keys are missing: 
 down_blocks.2.resnets.1.conv2.bias, up_blocks.2.resnets.2.norm2.weight, up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias, up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight, down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight, down_blocks.t...

Could you please tell me how to run inference with the model trained by distill_train.py? Thanks!
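
A "keys are missing" error of this kind means the UNet built from the config on disk expects parameters that the saved weights do not contain. To see which blocks are affected, the missing keys can be grouped by block prefix. This is only a minimal sketch: `summarize_missing_keys` is a hypothetical helper (not part of diffusers), and the list contains just the complete keys visible in the truncated error message.

```python
from collections import Counter


def summarize_missing_keys(keys, depth=2):
    """Count missing parameter names per block prefix, e.g. 'up_blocks.2'."""
    counts = Counter()
    for key in keys:
        # Keep only the first `depth` dotted components of the parameter name.
        prefix = ".".join(key.split(".")[:depth])
        counts[prefix] += 1
    return dict(counts)


# Complete keys copied from the error message; the truncated tail is omitted.
missing = [
    "down_blocks.2.resnets.1.conv2.bias",
    "up_blocks.2.resnets.2.norm2.weight",
    "up_blocks.1.attentions.2.transformer_blocks.0.attn1.to_out.0.bias",
    "up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight",
    "down_blocks.0.attentions.1.transformer_blocks.0.attn1.to_k.weight",
]

summary = summarize_missing_keys(missing)
for prefix, count in sorted(summary.items()):
    print(f"{prefix}: {count} missing")
```

If the missing keys spread across many blocks, as they do here, the config and the weights most likely describe different architectures rather than a single corrupted tensor.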

littletomatodonkey commented 11 months ago

I solved this problem by replacing sd-laion-art/unet/model_config.json with this file, and it works now. Thanks!
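
Before overwriting a config, it can help to check which fields actually differ between the two files. The following is a minimal sketch using only the standard library. `load_config` and `diff_configs` are hypothetical helpers, the file names in the usage comment are placeholders, and the values in the demo dicts are illustrative rather than taken from the actual files.

```python
import json


def load_config(path):
    """Read a JSON config file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def diff_configs(old, new):
    """Return {field: (old_value, new_value)} for every field that differs.

    A field absent from one side is reported with None on that side.
    """
    changed = {}
    for field in sorted(set(old) | set(new)):
        if old.get(field) != new.get(field):
            changed[field] = (old.get(field), new.get(field))
    return changed


# Illustrative values only; real configs contain many more fields.
old = {"layers_per_block": 2, "block_out_channels": [320, 640, 1280, 1280]}
new = {"layers_per_block": 1, "block_out_channels": [320, 640, 1280, 1280]}

changes = diff_configs(old, new)
for field, (before, after) in changes.items():
    print(f"{field}: {before} -> {after}")

# With real files (placeholder paths):
# changes = diff_configs(load_config("old_config.json"), load_config("new_config.json"))
```

Fields that control the number or size of layers are the ones most likely to produce missing-key errors when they disagree with the saved weights.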