huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Flux LoRA loading failure #9803

Open jonluca opened 1 month ago

jonluca commented 1 month ago

Describe the bug

Loading this model as a LoRA fails with ValueError(f"Incompatible keys detected: \n\n {', '.join(remaining_keys)}").

The LoRA seems to use a key format that none of the existing conversion script paths handle.

It might be a Flux diffusers checkpoint merged with an XLabs LoRA, but I can't really tell.

There aren't many scripts out there that convert this format, but ComfyUI does seem to process it properly.

https://huggingface.co/city96/FLUX.1-dev-gguf/discussions/13

If this is a good starting point, the prefix is slightly different too, since there are scripts that look for "model.diffusion_model." but not just raw "diffusion_model.".


def convert_flux_transformer_checkpoint_to_diffusers(checkpoint, **kwargs):
    converted_state_dict = {}
    keys = list(checkpoint.keys())
    for k in keys:
        # Strip the prefix in place. Check the longer prefix first: it also
        # contains "diffusion_model.", so two separate ifs would pop k twice.
        if "model.diffusion_model." in k:
            checkpoint[k.replace("model.diffusion_model.", "")] = checkpoint.pop(k)
        elif "diffusion_model." in k:
            checkpoint[k.replace("diffusion_model.", "")] = checkpoint.pop(k)
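The prefix handling above can also be isolated into a small, non-mutating helper, which makes it easier to test on its own. This is only a sketch: strip_known_prefix, strip_prefixes and KNOWN_PREFIXES are hypothetical names, not part of diffusers, and stripping the prefix alone is not confirmed to make this LoRA load, since the remaining key names may still need converting.

```python
# Hypothetical helper names; nothing here is part of the diffusers API.
# Longer prefix first, because "model.diffusion_model." also starts a key
# that would otherwise only be checked against "diffusion_model.".
KNOWN_PREFIXES = ("model.diffusion_model.", "diffusion_model.")


def strip_known_prefix(key, prefixes=KNOWN_PREFIXES):
    """Return key without its leading prefix, or unchanged if none matches."""
    for prefix in prefixes:
        if key.startswith(prefix):
            return key[len(prefix):]
    return key


def strip_prefixes(state_dict, prefixes=KNOWN_PREFIXES):
    """Build a new dict with prefixes removed; the input dict is not modified."""
    stripped = {}
    for key, value in state_dict.items():
        new_key = strip_known_prefix(key, prefixes)
        if new_key in stripped:
            # Two source keys mapping to one name would silently drop a tensor.
            raise ValueError(f"Key collision after stripping prefix: {new_key}")
        stripped[new_key] = value
    return stripped
```

Unlike the in-place version, this only strips a prefix at the start of a key (startswith rather than a substring match) and returns a new mapping, so the original checkpoint stays available for comparison.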

Reproduction

Download the safetensors file and load it as a LoRA in a Flux pipeline:

import torch
from diffusers import FluxPipeline

downloaded_file = "pytorch.safetensors"
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights(downloaded_file)

Logs

ValueError: Incompatible keys detected: 

 diffusion_model.double_blocks.0.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.0.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.0.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.0.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.0.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.0.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.0.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.0.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.1.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.1.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.1.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.1.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.1.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.1.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.1.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.1.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.10.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.10.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.10.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.10.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.10.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.10.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.10.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.10.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.11.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.11.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.11.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.11.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.11.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.11.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.11.txt_attn.qkv.lora_down.weight, 
diffusion_model.double_blocks.11.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.12.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.12.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.12.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.12.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.12.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.12.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.12.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.12.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.13.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.13.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.13.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.13.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.13.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.13.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.13.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.13.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.14.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.14.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.14.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.14.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.14.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.14.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.14.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.14.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.15.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.15.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.15.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.15.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.15.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.15.txt_attn.proj.lora_up.weight, 
diffusion_model.double_blocks.15.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.15.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.16.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.16.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.16.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.16.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.16.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.16.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.16.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.16.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.17.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.17.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.17.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.17.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.17.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.17.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.17.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.17.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.18.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.18.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.18.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.18.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.18.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.18.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.18.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.18.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.2.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.2.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.2.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.2.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.2.txt_attn.proj.lora_down.weight, 
diffusion_model.double_blocks.2.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.2.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.2.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.3.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.3.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.3.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.3.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.3.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.3.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.3.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.3.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.4.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.4.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.4.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.4.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.4.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.4.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.4.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.4.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.5.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.5.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.5.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.5.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.5.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.5.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.5.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.5.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.6.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.6.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.6.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.6.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.6.txt_attn.proj.lora_down.weight, 
diffusion_model.double_blocks.6.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.6.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.6.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.7.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.7.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.7.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.7.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.7.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.7.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.7.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.7.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.8.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.8.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.8.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.8.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.8.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.8.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.8.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.8.txt_attn.qkv.lora_up.weight, diffusion_model.double_blocks.9.img_attn.proj.lora_down.weight, diffusion_model.double_blocks.9.img_attn.proj.lora_up.weight, diffusion_model.double_blocks.9.img_attn.qkv.lora_down.weight, diffusion_model.double_blocks.9.img_attn.qkv.lora_up.weight, diffusion_model.double_blocks.9.txt_attn.proj.lora_down.weight, diffusion_model.double_blocks.9.txt_attn.proj.lora_up.weight, diffusion_model.double_blocks.9.txt_attn.qkv.lora_down.weight, diffusion_model.double_blocks.9.txt_attn.qkv.lora_up.weight
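A key list this long is hard to read by eye, so a short summary of which prefixes, block groups, block indices and submodules appear can help when comparing against the formats the conversion code expects. The sketch below is an illustration only: summarize_lora_keys and KEY_PATTERN are hypothetical names, and the pattern assumes keys shaped like the ones in the log (an optional prefix, double_blocks or single_blocks, an index, a module path, then lora_down.weight or lora_up.weight).

```python
import re
from collections import defaultdict

# Assumed key shape, based only on the keys printed in the error above.
KEY_PATTERN = re.compile(
    r"^(?P<prefix>.*?)(?P<group>double_blocks|single_blocks)\.(?P<index>\d+)\."
    r"(?P<module>.+)\.lora_(?:down|up)\.weight$"
)


def summarize_lora_keys(keys):
    """Group keys by (prefix, block group) and collect indices and modules.

    Returns (summary, unmatched), where unmatched lists keys that do not
    fit the assumed pattern.
    """
    summary = defaultdict(lambda: {"indices": set(), "modules": set()})
    unmatched = []
    for key in keys:
        match = KEY_PATTERN.match(key)
        if match is None:
            unmatched.append(key)
            continue
        entry = summary[(match["prefix"], match["group"])]
        entry["indices"].add(int(match["index"]))
        entry["modules"].add(match["module"])
    return dict(summary), unmatched
```

The keys can be taken either from the error text (split on commas and strip whitespace) or from the state dict of the file itself. For the keys listed in the log, every entry uses the diffusion_model. prefix and belongs to double_blocks, with the modules img_attn.proj, img_attn.qkv, txt_attn.proj and txt_attn.qkv.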

System Info

Who can help?

@sayakpaul

jonluca commented 1 month ago

This is the model https://civitai.com/models/631986

sayakpaul commented 1 month ago

Seems like it's GGUF? If so, we currently don't support it but @DN6 is working on it.

It also seems like you've figured out a fix?

If this is a good starting point, the prefix is slightly different too, since there are scripts that look for "model.diffusion_model." but not just raw "diffusion_model.".

If so, would you maybe like to open a PR?

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul commented 2 days ago

@jonluca a gentle ping