huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

load_ip_adapter for distilled sd models #9528

Open kmpartner opened 1 month ago

kmpartner commented 1 month ago

Is it possible to load IP-Adapter for distilled SD v1 or v2 based models such as nota-ai/bk-sdm-tiny or nota-ai/bk-sdm-v2-tiny?

When I tried to load the IP-Adapter with bk-sdm-tiny:

pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="models",
    weight_name="ip-adapter-plus_sd15.bin",
    low_cpu_mem_usage=False,
    ignore_mismatched_sizes=True
)

I got the following errors, probably because of differences in the UNet structure:

RuntimeError: Error(s) in loading state_dict for IPAdapterAttnProcessor2_0:
    size mismatch for to_k_ip.0.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
    size mismatch for to_v_ip.0.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).

How can I solve this problem?
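The size mismatch reported above can be seen before loading by comparing parameter shapes by name. The following is only a minimal sketch: `find_shape_mismatches` is a hypothetical helper, not part of diffusers, and the plain dicts stand in for real state dicts, using the shapes from the error message.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are skipped; only shape conflicts are reported.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the RuntimeError above. With real tensors, such a mapping
# could be built as {k: tuple(v.shape) for k, v in state_dict.items()}.
checkpoint_shapes = {
    "to_k_ip.0.weight": (320, 768),
    "to_v_ip.0.weight": (320, 768),
}
model_shapes = {
    "to_k_ip.0.weight": (640, 768),
    "to_v_ip.0.weight": (640, 768),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt, model in mismatches:
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Both projection weights are reported: the checkpoint expects a 320-wide attention layer where this model has a 640-wide one.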

asomoza commented 1 month ago

Hi, those models are distilled but also have blocks removed, so even if the IP Adapters load, they probably won't work or give good results.

Also, the IP Adapter you're using is for SD 1.5, and those models are SD 1.4 and 2.1 respectively, so it's not a matter of whether they're distilled or not; they're a different architecture altogether.

kmpartner commented 1 month ago

Thank you for your response. Do you think retraining or fine-tuning an IP-Adapter on the distilled model would work, after changing the IP-Adapter's structure to match the distilled model (an IP-Adapter version for bk-sdm)? Or is that the same as training an IP-Adapter from scratch for the distilled model?

asomoza commented 1 month ago

The problem here is not that it is a distilled model or even that it is missing blocks; they're different model architectures.

You can't fine-tune the ones we have to work with SD 1.4 or SD 2.1, so you'll need to train them from scratch. IP Adapters work out of the box for distilled models; for example, you can use the SDXL ones with the Lightning or Hyper models.

For the case where it is the same architecture but with missing blocks, I haven't tested whether they even work or whether they're good. You can try SSD-1B with the SDXL IP Adapters.
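A rough sketch of what trying SSD-1B with an SDXL IP Adapter could look like. This is untested, as said above, and it may fail with the same kind of size mismatch. The repo id `segmind/SSD-1B`, the `sdxl_models` subfolder, the `ip-adapter_sdxl.bin` weight name, and the scale of 0.6 are assumptions, not values confirmed in this thread. The function name is made up for illustration.

```python
# Assumed identifiers, not confirmed in this thread.
SSD_1B_REPO = "segmind/SSD-1B"
IP_ADAPTER_REPO = "h94/IP-Adapter"
IP_ADAPTER_SUBFOLDER = "sdxl_models"
IP_ADAPTER_WEIGHT = "ip-adapter_sdxl.bin"


def load_ssd1b_with_sdxl_ip_adapter(device="cuda", scale=0.6):
    """Build an SSD-1B pipeline and try to attach an SDXL IP Adapter to it."""
    # Local imports, so defining this function does not require torch or diffusers.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        SSD_1B_REPO, torch_dtype=torch.float16
    ).to(device)
    # May raise a RuntimeError if the removed blocks change the attention layer sizes.
    pipe.load_ip_adapter(
        IP_ADAPTER_REPO,
        subfolder=IP_ADAPTER_SUBFOLDER,
        weight_name=IP_ADAPTER_WEIGHT,
    )
    pipe.set_ip_adapter_scale(scale)
    return pipe
```

If loading succeeds, the reference image would be passed as `ip_adapter_image=...` when calling the pipeline, and the output quality would still need to be checked by eye.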

kmpartner commented 1 month ago

Thank you for your response. Now I'm starting to understand the differences between the bk-sdm and SSD-1B models, and the importance of the parent model. By the way, are there any SD v1.5-based distilled models with blocks removed, like SSD-1B for SDXL?

asomoza commented 1 month ago

Not that I know of. SD 1.5 was a small model that runs on most machines, so there wasn't really a need to do that when it got popular.

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.