tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0

How to convert a trained IP-Adapter of SDXL to fit ControlNet Canny #206

Open ZhouYaoxue opened 6 months ago

ZhouYaoxue commented 6 months ago

Thanks for the excellent work! I am struggling with a problem when using ControlNet Canny with my own trained IP-Adapter SDXL model, as shown below. Could you please help me?

import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetImg2ImgPipeline
from PIL import Image
from ip_adapter import IPAdapterXL

base_model_path = "stabilityai/stable-diffusion-xl-base-1.0"
image_encoder_path = "models/image_encoder"
# ip_ckpt = "sdxl_models/ip-adapter_sdxl_vit-h.bin"
ip_ckpt = "ouput_models/ip_adapter.bin"  
# obtained from my own trained SDXL IP-Adapter, then converted with the script from #168
device = "cuda:0"

controlnet_path = "diffusers/controlnet-canny-sdxl-1.0"
controlnet = ControlNetModel.from_pretrained(controlnet_path, variant="fp16", use_safetensors=True, torch_dtype=torch.float16).to(device)
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    base_model_path,
    controlnet=controlnet,
    use_safetensors=True,
    torch_dtype=torch.float16,
    add_watermarker=False,
).to(device)

ip_model = IPAdapterXL(pipe, image_encoder_path, ip_ckpt, device)

The conversion script is the one from https://github.com/tencent-ailab/IP-Adapter/issues/168#issue-2032046175

The error message:

RuntimeError: Error(s) in loading state_dict for ImageProjModel: size mismatch for proj.weight: copying a param with shape torch.Size([8192, 1280]) from checkpoint, the shape in current model is torch.Size([8192, 1024]).


xiaohu2015 commented 6 months ago

You should use ip_ckpt = "sdxl_models/ip-adapter_sdxl_vit-h.bin".

It seems that your IP-Adapter uses sdxl_models/image_encoder.
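For anyone hitting the same size mismatch, here is a minimal sketch of how the checkpoint shape points to the matching image encoder. The second dimension of proj.weight is the embedding size of the encoder the adapter was trained with: 1024 for the ViT-H encoder in models/image_encoder, 1280 for the ViT-bigG encoder in sdxl_models/image_encoder. The helper names are hypothetical, and the "image_proj" key is an assumption based on the layout of the released .bin files; a checkpoint converted another way may differ.

```python
# Projection dims of the two CLIP image encoders shipped with IP-Adapter
# (ViT-H/14 -> 1024, ViT-bigG/14 -> 1280), keyed to their folders in the repo.
ENCODER_BY_EMBED_DIM = {
    1024: "models/image_encoder",
    1280: "sdxl_models/image_encoder",
}


def encoder_for_proj_shape(proj_weight_shape):
    """Return the image encoder folder matching a proj.weight shape, or None."""
    # proj is a Linear layer, so its weight shape is (out_features, in_features).
    _out_features, in_features = proj_weight_shape
    return ENCODER_BY_EMBED_DIM.get(in_features)


def encoder_for_checkpoint(ip_ckpt):
    """Hypothetical helper: look up the encoder for a .bin checkpoint on disk."""
    import torch  # imported lazily so the shape lookup above works without torch

    state_dict = torch.load(ip_ckpt, map_location="cpu")
    # Assumption: the checkpoint stores the projection weights under "image_proj".
    shape = tuple(state_dict["image_proj"]["proj.weight"].shape)
    return encoder_for_proj_shape(shape)


# Shape reported in the error above for the self-trained checkpoint.
print(encoder_for_proj_shape((8192, 1280)))  # prints sdxl_models/image_encoder
```

With the checkpoint from the error message, this suggests either setting image_encoder_path to "sdxl_models/image_encoder" for the self-trained adapter, or keeping "models/image_encoder" and loading the vit-h checkpoint instead.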

ZhouYaoxue commented 6 months ago

Oh, I made a mistake. Thanks!