Closed nbardy closed 1 year ago
Thanks for the issue - we should indeed provide better docs here. cc @sayakpaul do we have updated unCLIP docs already?
Hi,
I looked into this a bit. https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip shouldn't actually be used with StableUnCLIPPipeline, as its details are not very clear to us as of now.
https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip-small, on the other hand, is fine, and we know that the Karlo model was used as one of the priors.
The following code works:
import torch
from diffusers import UnCLIPScheduler, DDPMScheduler, StableUnCLIPPipeline
from diffusers.models import PriorTransformer
from transformers import CLIPTokenizer, CLIPTextModelWithProjection, CLIPTextModel
prior_model_id = "kakaobrain/karlo-v1-alpha"
prior = PriorTransformer.from_pretrained(prior_model_id, subfolder="prior")
prior_text_model_id = "openai/clip-vit-large-patch14"
prior_tokenizer = CLIPTokenizer.from_pretrained(prior_text_model_id)
prior_text_model = CLIPTextModelWithProjection.from_pretrained(prior_text_model_id)
prior_scheduler = UnCLIPScheduler.from_pretrained(prior_model_id, subfolder="prior_scheduler")
prior_scheduler = DDPMScheduler.from_config(prior_scheduler.config)
stable_unclip_model_id = "stabilityai/stable-diffusion-2-1-unclip-small"
pipe = StableUnCLIPPipeline.from_pretrained(
    stable_unclip_model_id,
    prior_tokenizer=prior_tokenizer,
    prior_text_encoder=prior_text_model,
    prior=prior,
    prior_scheduler=prior_scheduler,
)
pipe = pipe.to("cuda")
wave_prompt = "dramatic wave, the Oceans roar, Strong wave spiral across the oceans as the waves unfurl into roaring crests; perfect wave form; perfect wave shape; dramatic wave shape; wave shape unbelievable; wave; wave shape spectacular"
images = pipe(prompt=wave_prompt).images
images[0].save("tarsila_variation.png")
But its fp16 variant is not working at the moment. @williamberman could you look into this?
After specifying torch_dtype=torch.float16 while initializing the StableUnCLIPPipeline, calling pipe(prompt=wave_prompt) leads to:
RuntimeError: mat1 and mat2 must have the same type
Here's a Colab Notebook that fully reproduces this error.
Cc: @patrickvonplaten. I would like the issue above to get resolved first, and then I will open a PR to update the docs for StableUnCLIPPipeline.
Also cc: @apolinario to note the isolation of components when initializing the pipeline.
@sayakpaul the components loaded separately from the pipeline need to be loaded in fp16 if the pipeline is loaded in fp16.
I think this is OK and is the expected API. We could use a heuristic that checks a parameter of each loaded pipeline and model component to see whether they all share the same dtype, and log a warning if they don't. However, I don't think that's super high priority. If you have time to add it, feel free.
import torch
from diffusers import UnCLIPScheduler, DDPMScheduler, StableUnCLIPPipeline
from diffusers.models import PriorTransformer
from transformers import CLIPTokenizer, CLIPTextModelWithProjection, CLIPTextModel
prior_model_id = "kakaobrain/karlo-v1-alpha"
prior = PriorTransformer.from_pretrained(prior_model_id, subfolder="prior", torch_dtype=torch.float16)
prior_text_model_id = "openai/clip-vit-large-patch14"
prior_tokenizer = CLIPTokenizer.from_pretrained(prior_text_model_id)
prior_text_model = CLIPTextModelWithProjection.from_pretrained(prior_text_model_id, torch_dtype=torch.float16)
prior_scheduler = UnCLIPScheduler.from_pretrained(prior_model_id, subfolder="prior_scheduler")
prior_scheduler = DDPMScheduler.from_config(prior_scheduler.config)
stable_unclip_model_id = "stabilityai/stable-diffusion-2-1-unclip-small"
pipe = StableUnCLIPPipeline.from_pretrained(
    stable_unclip_model_id,
    torch_dtype=torch.float16,
    variant="fp16",
    prior_tokenizer=prior_tokenizer,
    prior_text_encoder=prior_text_model,
    prior=prior,
    prior_scheduler=prior_scheduler,
)
pipe = pipe.to("cuda")
wave_prompt = "dramatic wave, the Oceans roar, Strong wave spiral across the oceans as the waves unfurl into roaring crests; perfect wave form; perfect wave shape; dramatic wave shape; wave shape unbelievable; wave; wave shape spectacular"
images = pipe(prompt=wave_prompt).images
images[0].save("tarsila_variation.png")
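For reference, the dtype-check heuristic suggested in this comment could look roughly like the sketch below. This is not part of diffusers: `first_param_dtype` and `find_dtype_mismatches` are hypothetical helper names, and the sketch assumes each weighted component exposes a `.parameters()` method (as `torch.nn.Module` does), while tokenizers and schedulers do not. It treats the most common dtype as the expected one, which is only a heuristic.

```python
import logging
from collections import Counter

logger = logging.getLogger(__name__)


def first_param_dtype(component):
    """Return the dtype of the component's first parameter, or None if it has none."""
    parameters = getattr(component, "parameters", None)
    if not callable(parameters):
        return None  # e.g. tokenizers and schedulers carry no weights
    first = next(iter(parameters()), None)
    return getattr(first, "dtype", None)


def find_dtype_mismatches(components):
    """Return {name: dtype} for components whose dtype differs from the majority dtype.

    `components` is a dict mapping a component name to the loaded object.
    A warning is logged for every mismatching component.
    """
    dtypes = {name: first_param_dtype(c) for name, c in components.items()}
    dtypes = {name: d for name, d in dtypes.items() if d is not None}
    if not dtypes:
        return {}
    # Heuristic: assume the dtype shared by most components is the intended one.
    majority, _ = Counter(dtypes.values()).most_common(1)[0]
    mismatched = {name: d for name, d in dtypes.items() if d != majority}
    for name, d in mismatched.items():
        logger.warning(
            "Component %r has dtype %s but most components use %s; "
            "mixed dtypes can raise errors at inference time.",
            name, d, majority,
        )
    return mismatched
```

Assuming the pipeline exposes a name-to-component dict (for example via `pipe.components`), calling `find_dtype_mismatches(pipe.components)` right after loading would flag a prior or prior text encoder left in fp32 while the rest of the pipeline is in fp16.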
@nbardy created a PR to address the doc problem https://github.com/huggingface/diffusers/pull/2897. I am closing this issue for now, but feel free to re-open it.
Describe the bug
StableUnCLIPPipeline doesn't work with default values. It's missing the prior models. I'm trying to update them here and even converted a checkpoint model, but can't seem to get it working yet.
Reproduction
!pip install git+https://github.com/huggingface/diffusers@main transformers accelerate scipy safetensors xformers
import requests
import torch
from PIL import Image
from io import BytesIO
from diffusers import UnCLIPScheduler, DDPMScheduler
from diffusers.models import PriorTransformer
from transformers import CLIPTokenizer, CLIPTextModelWithProjection
from diffusers import StableUnCLIPPipeline, UNet2DConditionModel
karlo_model = "kakaobrain/karlo-v1-alpha"
prior = PriorTransformer.from_pretrained(karlo_model, subfolder="prior")
clip_name = "openai/clip-vit-large-patch14"
clip_name = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
prior_tokenizer = CLIPTokenizer.from_pretrained(clip_name)
prior_text_model = CLIPTextModelWithProjection.from_pretrained(clip_name)
prior_scheduler = UnCLIPScheduler.from_pretrained(karlo_model, subfolder="prior_scheduler")
prior_scheduler = DDPMScheduler.from_config(prior_scheduler.config)
unet = UNet2DConditionModel.from_pretrained("Nbardy/stable-diffusion-unclip-diffusers", subfolder="unet")
# Start the StableUnCLIP Image variations pipeline
pipe = StableUnCLIPPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-unclip",
    revision="sd21-unclip-l.ckpt",
)
pipe = pipe.to('cuda')
wave_prompt = "dramatic wave, the Oceans roar, Strong wave spiral across the oceans as the waves unfurl into roaring crests; perfect wave form; perfect wave shape; dramatic wave shape; wave shape unbelievable; wave; wave shape spectacular"
negative_prompt = "((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), wierd colors, blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy, 3d render, bad-artist bad_prompt_version2"
# Pipe to make the variation
images = pipe(prompt=wave_prompt).images
images[0].save("tarsila_variation.png")
display(images[0])
Logs
No response
System Info
My setup is Colab, with the following install:
!pip install git+https://github.com/huggingface/diffusers@main transformers accelerate scipy safetensors xformers
https://colab.research.google.com/drive/1y7som7KnaTOWuXAWYDIkCTVSm2otX9_R?usp=sharing