leejet / stable-diffusion.cpp

Stable Diffusion and Flux in pure C/C++
MIT License

Some Flux loras can't be applied #370

Open stduhpf opened 2 weeks ago

stduhpf commented 2 weeks ago

Some Flux loras are working, but others refuse to load properly (even though they work fine in ComfyUI). Around 50% of the loras I've tested were failing.

There is no crash, but I get lots of [WARN ] lora.hpp:176 - unused lora tensor transformer.xxx, and it ends with [WARN ] lora.hpp:186 - Only (0 / N) LoRA tensors have been applied.

I'm not 100% sure what's going on, but a pattern I think I've noticed is that the LoRAs that load correctly all have 912 tensors, while the ones that fail each have some different number of tensors.

Example of lora-related logs:

```
[WARN ] stable-diffusion.cpp:617 - In quantized models when applying LoRA, the images have poor quality.
[INFO ] stable-diffusion.cpp:635 - Attempting to apply 1 LoRAs
[INFO ] model.cpp:789 - load ..\ComfyUI\models\loras\ana_de_armas_flux_lora_v1_000002000.safetensors using safetensors format
[DEBUG] model.cpp:857 - init from '..\ComfyUI\models\loras\ana_de_armas_flux_lora_v1_000002000.safetensors'
[INFO ] lora.hpp:33 - loading LoRA from '..\ComfyUI\models\loras\ana_de_armas_flux_lora_v1_000002000.safetensors'
[DEBUG] model.cpp:1526 - loading tensors from ..\ComfyUI\models\loras\ana_de_armas_flux_lora_v1_000002000.safetensors
[DEBUG] ggml_extend.hpp:1029 - lora params backend buffer size = 327.75 MB(RAM) (988 tensors)
[DEBUG] model.cpp:1526 - loading tensors from ..\ComfyUI\models\loras\ana_de_armas_flux_lora_v1_000002000.safetensors
[DEBUG] lora.hpp:69 - finished loaded lora
[WARN ] lora.hpp:176 - unused lora tensor transformer.single_transformer_blocks.0.attn.to_k.lora_A.weight
[WARN ] lora.hpp:176 - unused lora tensor transformer.single_transformer_blocks.0.attn.to_k.lora_B.weight
[...] (This goes on for all the tensors in the model)
[WARN ] lora.hpp:176 - unused lora tensor transformer.transformer_blocks.9.norm1_context.linear.lora_A.weight
[WARN ] lora.hpp:176 - unused lora tensor transformer.transformer_blocks.9.norm1_context.linear.lora_B.weight
[WARN ] lora.hpp:186 - Only (0 / 988) LoRA tensors have been applied
```
Green-Sky commented 2 weeks ago

@stduhpf please don't copy links to likely malicious binaries into your replies!

stduhpf commented 2 weeks ago

@stduhpf please don't copy links to likely malicious binaries into your replies!

Oops, edited out

grauho commented 2 weeks ago

Based on the log you posted, my initial thought is that some of the LoRAs you're using have a non-standard naming convention, so when the loader looks for the corresponding tensors in the model it can't find them. It might be that they're using a _{A,B} convention instead of _{up,down}, but that's just a hunch at the moment.
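For illustration, here is a minimal sketch of the kind of suffix remap that hunch implies, assuming the only difference really is the lora_A / lora_B suffix versus lora_up / lora_down (the function name and placement are hypothetical, not the actual lora.hpp code):

```cpp
#include <string>

// Hypothetical helper: map diffusers-style "lora_A"/"lora_B" suffixes onto the
// "lora_up"/"lora_down" suffixes the loader already understands.
// In LoRA terms, B is the up-projection and A is the down-projection.
static std::string remap_lora_suffix(std::string name) {
    auto replace_once = [](std::string& s, const std::string& from, const std::string& to) {
        size_t pos = s.find(from);
        if (pos != std::string::npos) {
            s.replace(pos, from.size(), to);
        }
    };
    replace_once(name, ".lora_B.", ".lora_up.");
    replace_once(name, ".lora_A.", ".lora_down.");
    return name;
}
```

As discussed further down, this alone wouldn't be enough if the rest of the tensor name (the `transformer.*` prefix) also differs from the conventions the loader knows about.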

stduhpf commented 2 weeks ago

Hmm could be... I'll try replacing "up" and "down" in the code with "A" and "B" and see how it goes.

grauho commented 2 weeks ago

Hmm could be... I'll try replacing "up" and "down" in the code with "A" and "B" and see how it goes.

Another option is just to run a quick grep for that pattern on the LoRAs that do load without an issue, and then on the ones that don't, and see if it's a straight one-to-one correlation.

Because of how the name preprocessing works I'm not sure that just replacing the up / down bit in model.cpp will be enough unless the convention in the non-functional LoRAs happens to be identical to a supported convention except for that one suffix.

stduhpf commented 2 weeks ago

Hmm could be... I'll try replacing "up" and "down" in the code with "A" and "B" and see how it goes.

Another option is just to run a quick grep for that pattern on the LoRAs that do load without an issue, and then on the ones that don't, and see if it's a straight one-to-one correlation.

Because of how the name preprocessing works I'm not sure that just replacing the up / down bit in model.cpp will be enough unless the convention in the non-functional LoRAs happens to be identical to a supported convention except for that one suffix.

Yep, this doesn't seem to work.

grauho commented 2 weeks ago

Hmm could be... I'll try replacing "up" and "down" in the code with "A" and "B" and see how it goes.

Another option is just to run a quick grep for that pattern on the LoRAs that do load without an issue, and then on the ones that don't, and see if it's a straight one-to-one correlation.

Because of how the name preprocessing works I'm not sure that just replacing the up / down bit in model.cpp will be enough unless the convention in the non-functional LoRAs happens to be identical to a supported convention except for that one suffix.

Yep, this doesn't seem to work.

And is it the case that all the non-loading LoRAs have this *_{A,B} pattern while all the working ones do not?

grauho commented 2 weeks ago

To test this, try using: `grep -l "lora_[AB]" <FILES>`

on the LoRAs you used for the initial run and see if the names printed correspond to the ones that didn't load. Alternatively, you could run the following to try to match the files that did load: `grep -E -l 'lora_up|lora_down' <FILES>`
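The grep works because a .safetensors file begins with a plain-text JSON header that lists every tensor name, so the pattern can be found even in the "binary" file. If grep isn't handy, a small sketch like the following (an illustrative stand-alone tool, not part of the repo) dumps that header so the naming convention can be checked directly:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Print the JSON header of a .safetensors file.
// Layout: 8-byte little-endian header length, then that many bytes of JSON
// containing every tensor name (e.g. "lora_up"/"lora_down" or "lora_A"/"lora_B").
// Assumes a little-endian host for the length read.
int main(int argc, char** argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <file.safetensors>\n", argv[0]);
        return 1;
    }
    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) {
        std::fprintf(stderr, "failed to open %s\n", argv[1]);
        return 1;
    }
    uint64_t header_len = 0;
    if (std::fread(&header_len, sizeof(header_len), 1, f) != 1) {
        std::fclose(f);
        return 1;
    }
    std::vector<char> header(header_len);
    if (std::fread(header.data(), 1, header.size(), f) != header.size()) {
        std::fclose(f);
        return 1;
    }
    std::fclose(f);
    std::fwrite(header.data(), 1, header.size(), stdout);
    std::printf("\n");
    return 0;
}
```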

stduhpf commented 2 weeks ago

To test this, try using: `grep -l "lora_[AB]" <FILES>`

on the LoRAs you used for the initial run and see if the names printed correspond to the ones that didn't load. Alternatively, you could run the following to try to match the files that did load: `grep -E -l 'lora_up|lora_down' <FILES>`

Yes, it matches perfectly. All the ones that match "lora_up|lora_down" work, and all the others match "lora_[AB]".

stduhpf commented 2 weeks ago

https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/lora.py#L53 Maybe we could find a solution by looking at this implementation.
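Roughly, that ComfyUI code keeps a lookup from the diffusers-style `transformer.*` key prefixes to the original Flux block names. A tiny sketch of the idea; the specific entries below are illustrative guesses at the pattern, not a verified or complete table:

```cpp
#include <map>
#include <string>

// Illustrative (unverified, non-exhaustive) mapping from diffusers-style LoRA
// prefixes to Flux checkpoint naming; a real table would cover every block type
// and index rather than hard-coding block 0.
static const std::map<std::string, std::string> diffusers_to_flux = {
    {"transformer.transformer_blocks.0.attn.to_out.0",        "double_blocks.0.img_attn.proj"},
    {"transformer.transformer_blocks.0.norm1.linear",         "double_blocks.0.img_mod.lin"},
    {"transformer.transformer_blocks.0.norm1_context.linear", "double_blocks.0.txt_mod.lin"},
    // to_q / to_k / to_v have no single counterpart: they feed the fused
    // double_blocks.N.img_attn.qkv weight, which needs special handling.
};
```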

grauho commented 2 weeks ago

Yeah, it looks like some of the new Flux LoRA training scripts have decided to use a different variant of the diffusers naming convention. It probably won't be too bad to fix and will just require another conversion function during the tensor name pre-processing, like the SDXL conversion.

Edit: This might be more complex than it seemed at first blush, as the Flux model does some strange bundling of the q, k, and v attention projections while the LoRA handles them independently. The rest will hopefully be a simple mapping.
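To make the bundling issue concrete, here is a rough sketch of the extra step a fused projection would need: the separate to_q / to_k / to_v LoRA deltas each cover only one slice of the fused qkv weight, so each B×A product has to be added into the right row range instead of a whole tensor. Plain float arrays are used for illustration; the real code would work on ggml tensors and quantized weights:

```cpp
#include <cstddef>
#include <vector>

// Sketch: apply one LoRA delta (scale * B x A) to a slice of a fused qkv weight.
// W_qkv is [3 * dim, dim], row-major: q occupies rows [0, dim),
// k rows [dim, 2*dim), v rows [2*dim, 3*dim).
// A (down) is [rank, dim], B (up) is [dim, rank], both row-major.
static void apply_lora_slice(std::vector<float>& W_qkv, size_t dim,
                             const std::vector<float>& A, const std::vector<float>& B,
                             size_t rank, float scale, size_t row_offset) {
    for (size_t i = 0; i < dim; ++i) {       // output row within the q/k/v slice
        for (size_t j = 0; j < dim; ++j) {   // input column
            float delta = 0.0f;
            for (size_t r = 0; r < rank; ++r) {
                delta += B[i * rank + r] * A[r * dim + j];
            }
            W_qkv[(row_offset + i) * dim + j] += scale * delta;
        }
    }
}

// Usage idea: q at row offset 0, k at offset dim, v at offset 2 * dim:
//   apply_lora_slice(W_qkv, dim, A_q, B_q, rank, scale, 0);
//   apply_lora_slice(W_qkv, dim, A_k, B_k, rank, scale, dim);
//   apply_lora_slice(W_qkv, dim, A_v, B_v, rank, scale, 2 * dim);
```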

taotaow commented 2 weeks ago

https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SD3-8steps-CFG-lora.safetensors can't be applied. Its tensor names are all like transformer.*.lora_A/B.weight.