huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Flux LoRA loading failure #9803

Open jonluca opened 2 days ago

jonluca commented 2 days ago

Describe the bug

Loading this model as a LoRA raises ValueError(f"Incompatible keys detected: \n\n {', '.join(remaining_keys)}").

The LoRA appears to use a key format that none of the existing conversion script paths handle.

It might be a Flux diffusers checkpoint merged with an XLabs LoRA, but I can't really tell.

There aren't a lot of scripts out there to convert this format, but it does seem to get processed properly by ComfyUI.

https://huggingface.co/city96/FLUX.1-dev-gguf/discussions/13

If this is a good starting point, note that the prefix is slightly different too: existing scripts look for "model.diffusion_model." but not the bare "diffusion_model." prefix.


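def convert_flux_transformer_checkpoint_to_diffusers(checkpoint, **kwargs):
    converted_state_dict = {}
    keys = list(checkpoint.keys())
    for k in keys:
        # Strip whichever prefix is present. The checks must be exclusive:
        # "model.diffusion_model." also contains "diffusion_model.", so a second
        # plain `if` would pop the same key twice and raise KeyError.
        if "model.diffusion_model." in k:
            checkpoint[k.replace("model.diffusion_model.", "")] = checkpoint.pop(k)
        elif "diffusion_model." in k:
            checkpoint[k.replace("diffusion_model.", "")] = checkpoint.pop(k)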
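The prefix handling described above can also be sketched as a standalone helper, separate from the diffusers conversion function. This is only an illustration: `PREFIXES`, `strip_known_prefix`, and `normalize_keys` are hypothetical names, not part of diffusers, and the sketch only renames keys. It does not convert the LoRA layout itself.

```python
# Hypothetical helper, not part of diffusers: normalize checkpoint key prefixes.
# The longer prefix is listed first so it is matched before the shorter one.
PREFIXES = ("model.diffusion_model.", "diffusion_model.")


def strip_known_prefix(key: str) -> str:
    """Return `key` without a leading known prefix, or unchanged if none matches."""
    for prefix in PREFIXES:
        if key.startswith(prefix):
            return key[len(prefix):]
    return key


def normalize_keys(state_dict: dict) -> dict:
    """Build a new dict with prefixes stripped; values are left untouched."""
    return {strip_known_prefix(k): v for k, v in state_dict.items()}


# Placeholder values stand in for tensors.
example = {
    "diffusion_model.double_blocks.0.img_attn.qkv.lora_down.weight": 1,
    "model.diffusion_model.double_blocks.0.img_attn.qkv.lora_up.weight": 2,
}
normalized = normalize_keys(example)
```

Building a new dict avoids mutating the checkpoint while iterating over it, and `startswith` only touches a prefix at the start of the key rather than anywhere inside it.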

Reproduction

Download the safetensors file and load it as a LoRA in a Flux pipeline:

import torch
from diffusers import FluxPipeline

downloaded_file = "pytorch.safetensors"
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights(downloaded_file)

Logs

ValueError: Incompatible keys detected: 

 diffusion_model.double_blocks.0.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.0.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.0.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.0.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.0.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.0.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.0.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.0.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.1.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.1.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.1.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.1.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.1.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.1.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.1.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.1.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.10.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.10.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.10.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.10.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.10.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.10.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.10.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.10.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.11.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.11.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.11.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.11.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.11.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.11.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.11.txt_attn.qkv.lora_down.weight, 
diffusion_model.double_blocks.11.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.12.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.12.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.12.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.12.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.12.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.12.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.12.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.12.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.13.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.13.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.13.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.13.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.13.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.13.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.13.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.13.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.14.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.14.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.14.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.14.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.14.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.14.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.14.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.14.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.15.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.15.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.15.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.15.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.15.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.15.txt_attn.proj.lora_up.weight, 
diffusion_model.double_blocks.15.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.15.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.16.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.16.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.16.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.16.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.16.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.16.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.16.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.16.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.17.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.17.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.17.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.17.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.17.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.17.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.17.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.17.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.18.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.18.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.18.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.18.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.18.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.18.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.18.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.18.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.2.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.2.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.2.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.2.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.2.txt_attn.proj.lora_down.weight, 
diffusion_model.double_blocks.2.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.2.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.2.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.3.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.3.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.3.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.3.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.3.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.3.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.3.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.3.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.4.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.4.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.4.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.4.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.4.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.4.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.4.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.4.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.5.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.5.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.5.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.5.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.5.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.5.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.5.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.5.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.6.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.6.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.6.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.6.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.6.txt_attn.proj.lora_down.weight, 
diffusion_model.double_blocks.6.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.6.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.6.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.7.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.7.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.7.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.7.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.7.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.7.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.7.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.7.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.8.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.8.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.8.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.8.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.8.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.8.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.8.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.8.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.9.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.9.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.9.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.9.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.9.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.9.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.9.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.9.txt_attn.qkv.lora_up.weight
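The key list in the error is long, so it may help to summarize it before deciding how to convert it. Below is a minimal sketch, assuming the keys follow the `diffusion_model.double_blocks.<index>.<module>.lora_down|lora_up.weight` pattern visible in the log. `KEY_RE` and `summarize` are hypothetical names, and the sample string is a short excerpt of the log rather than the full list.

```python
import re

# Pattern inferred from the keys in the log above (an assumption, not a spec).
KEY_RE = re.compile(
    r"^diffusion_model\.double_blocks\.(\d+)\.(.+)\.(lora_down|lora_up)\.weight$"
)


def summarize(keys):
    """Return (sorted block indices, sorted module names, unmatched keys)."""
    blocks, modules, unmatched = set(), set(), []
    for raw in keys:
        key = raw.strip()
        if not key:
            continue  # skip empty fragments left over from splitting
        match = KEY_RE.match(key)
        if match is None:
            unmatched.append(key)
            continue
        blocks.add(int(match.group(1)))
        modules.add(match.group(2))
    return sorted(blocks), sorted(modules), unmatched


# Short excerpt of the error message, split on commas.
sample = (
    "diffusion_model.double_blocks.0.img_attn.proj.lora_down.weight, "
    "diffusion_model.double_blocks.0.img_attn.proj.lora_up.weight, "
    "diffusion_model.double_blocks.10.txt_attn.qkv.lora_down.weight"
)
blocks, modules, unmatched = summarize(sample.split(","))
```

Any key that lands in `unmatched` would indicate that the checkpoint contains layers outside the assumed pattern.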

System Info

Who can help?

@sayakpaul

jonluca commented 2 days ago

This is the model https://civitai.com/models/631986

sayakpaul commented 2 days ago

Seems like it's GGUF? If so, we currently don't support it but @DN6 is working on it.

It also seems like you've figured out a fix?

If this is a good starting point, the prefix is slightly different too, since there are scripts that look for "model.diffusion_model." but not just raw "diffusion_model.".

If so, would you maybe like to open a PR?
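On the GGUF question above, one way to check a downloaded file is to look at its first bytes. This is a rough sketch, not something diffusers provides: `sniff_checkpoint_format` is a hypothetical name, and the 100 MB header cap is an arbitrary safety limit chosen for this example. It relies on GGUF files starting with the magic bytes `GGUF`, and on safetensors files starting with an 8-byte little-endian header length followed by a JSON header.

```python
import json
import struct

# Arbitrary upper bound on the safetensors JSON header size (an assumption).
MAX_HEADER_BYTES = 100 * 1024 * 1024


def sniff_checkpoint_format(path):
    """Guess whether `path` is 'gguf', 'safetensors', or 'unknown'."""
    with open(path, "rb") as f:
        head = f.read(8)
        if head[:4] == b"GGUF":
            return "gguf"
        if len(head) == 8:
            # safetensors: unsigned 64-bit little-endian header length, then JSON.
            (header_len,) = struct.unpack("<Q", head)
            if 0 < header_len <= MAX_HEADER_BYTES:
                try:
                    header = json.loads(f.read(header_len))
                except ValueError:  # covers JSON and UTF-8 decoding errors
                    return "unknown"
                if isinstance(header, dict):
                    return "safetensors"
    return "unknown"
```

Calling `sniff_checkpoint_format("pytorch.safetensors")` on the file from the reproduction would show whether it is a plain safetensors file rather than GGUF.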