leejet / stable-diffusion.cpp

Stable Diffusion and Flux in pure C/C++
MIT License

SDXL : LoRa problem #203

Closed piallai closed 6 months ago

piallai commented 6 months ago

When I use the following command:

sd.exe -m ..\models\sd_xl_base_1.0.safetensors --vae ..\models\sdxl_vae.safetensors --lora-model-dir ..\models -H 1024 -W 1024 -p "a lovely cat<lora:sd_xl_offset_example-lora_1.0:1>" -v

The LoRA model apparently cannot be used:

[INFO ] model.cpp:705  - load ..\models/sd_xl_offset_example-lora_1.0.safetensors using safetensors format
[DEBUG] model.cpp:771  - init from '..\models/sd_xl_offset_example-lora_1.0.safetensors'
[INFO ] lora.hpp:38   - loading LoRA from '..\models/sd_xl_offset_example-lora_1.0.safetensors'
[DEBUG] model.cpp:1343 - loading tensors from ..\models/sd_xl_offset_example-lora_1.0.safetensors
[DEBUG] ggml_extend.hpp:884  - lora params backend buffer size =  47.01 MB(VRAM) (10240 tensors)
[DEBUG] model.cpp:1343 - loading tensors from ..\models/sd_xl_offset_example-lora_1.0.safetensors
[DEBUG] lora.hpp:74   - finished loaded lora
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_emb_layers_1.alpha
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_emb_layers_1.lora_down.weight
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_emb_layers_1.lora_up.weight
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_in_layers_2.alpha
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_in_layers_2.lora_down.weight
[WARN ] lora.hpp:160  - unused lora tensor lora.unet_input_blocks_1_0_in_layers_2.lora_up.weight
...
(hundreds of same warnings)

It's the same problem as https://github.com/leejet/stable-diffusion.cpp/pull/117#issuecomment-1856518605. The LoRA model ends up not being used at all.

I'm using the latest master-a469688 release.

grauho commented 6 months ago

I had the same issue and addressed it in my pending pull request #200

From what I can tell, it is because SDXL LoRAs use a slightly different naming convention that the current code isn't set up to convert to the internally used convention. Also, it seems the existing memory allocated for the GGML graph is insufficient to accommodate adding an SDXL LoRA, so I had to bump that up as well.
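To illustrate the kind of renaming described above, here is a minimal Python sketch. It is not the code from pull request #200: the prefix table is an assumption inferred only from the tensor names visible in the logs in this thread (`lora.unet_...` in the original report, `lora.model_diffusion_model_...` in a later one), and `normalize_lora_name` is a hypothetical helper.

```python
# Hypothetical prefix table: inferred from the log output in this thread,
# not taken from the stable-diffusion.cpp source.
PREFIX_MAP = {
    "lora.unet_": "lora.model_diffusion_model_",
}


def normalize_lora_name(name: str) -> str:
    """Rewrite a known alternative prefix to the assumed internal one."""
    for old, new in PREFIX_MAP.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name  # already in the expected form, or an unknown prefix


if __name__ == "__main__":
    print(normalize_lora_name("lora.unet_input_blocks_1_0_emb_layers_1.alpha"))
```

A tensor whose name matches no entry is returned unchanged, which is roughly the situation that would surface as an "unused lora tensor" warning.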

Green-Sky commented 6 months ago

Now it is crashing with

ggml_cuda_op_bin_bcast: unsupported types: dst: q8_0, src0: q8_0, src1: f32

Probably because LoRAs contain some f32 tensors and q8_0 + f32 -> q8_0 is not supported. Conversion seems to work, but the converted file is not loaded (the loader only looks for safetensors and ckpt files).
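As a rough illustration of why adding an f32 update to a q8_0 weight is not a plain element-wise add, here is a pure-Python sketch of one q8_0 block (32 values sharing a single scale). The thread does not confirm the root cause; this only shows the dequantize, add, requantize round trip that such an operation would need. The function names are hypothetical, and the real format stores the scale as fp16, which is ignored here.

```python
BLOCK = 32  # q8_0 groups weights into blocks of 32 values


def quantize_q8_0(values):
    """Return (scale, list of ints in [-127, 127]) for one block of floats."""
    assert len(values) == BLOCK
    amax = max(abs(v) for v in values)
    scale = amax / 127.0
    if scale == 0.0:
        return 0.0, [0] * BLOCK
    return scale, [round(v / scale) for v in values]


def dequantize_q8_0(scale, q):
    """Recover approximate floats from one quantized block."""
    return [scale * x for x in q]


def add_delta_q8_0(scale, q, delta):
    """Apply an f32 update to a quantized block via a float round trip."""
    merged = [w + d for w, d in zip(dequantize_q8_0(scale, q), delta)]
    return quantize_q8_0(merged)  # the block scale may change here
```

Because the scale is shared across the block, the update cannot be added to the stored integers directly; the block has to be rebuilt, and each round trip adds a small rounding error.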

grauho commented 6 months ago

Interesting, I'm not having that problem, what invocation are you using?

Green-Sky commented 6 months ago

Forgot to mention the obvious: I am using a model that has been converted to q8_0. Maybe you can reproduce it using --type q8_0. I did not test, but I think this is not SDXL-specific and has always been like this.

grauho commented 6 months ago

I assumed as much. Does this only happen when you use a quantized LoRA and not just the quantized model, or both?

grauho commented 6 months ago

I am able to use --type q8_0 on an SDXL model and SDXL LoRA without incident.

Green-Sky commented 6 months ago

The model is always quantized. The LoRA can't be quantized right now.

Sad, I thought I could get memory savings and deleted the .safetensors models <.<

grauho commented 6 months ago

Alright, in that case it sounds like a separate quantization issue distinct from this one. I propose that this issue be marked as resolved.

piallai commented 6 months ago

I tried with the new release master-48bcce4. The program now crashes when loading the LoRA. Here is the log, with the same command as in the OP.

[INFO ] stable-diffusion.cpp:553  - Attempting to apply 1 LoRAs
[INFO ] model.cpp:726  - load ..\models\sd_xl_offset_example-lora_1.0.safetensors using safetensors format
[DEBUG] model.cpp:792  - init from '..\models\sd_xl_offset_example-lora_1.0.safetensors'
[INFO ] lora.hpp:38   - loading LoRA from '..\models\sd_xl_offset_example-lora_1.0.safetensors'
[DEBUG] model.cpp:1364 - loading tensors from ..\models\sd_xl_offset_example-lora_1.0.safetensors
Assertion failed: n_dims >= 1 && n_dims <= GGML_MAX_DIMS, file C:\***\ggml\src\ggml.c, line 2745

This happens with n_dims equal to 0. Not exactly the same problem, but still related to LoRA loading, so I suppose it's worth leaving the issue open.
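One way to check whether the LoRA file itself contains zero-dimensional tensors is to read the safetensors header, which is an 8-byte little-endian length followed by a JSON table of tensor names, dtypes and shapes. The sketch below lists every tensor stored with an empty shape. Whether such scalars are what trips the assertion is an assumption, not something confirmed in this thread, and `zero_dim_tensors` is a hypothetical helper.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_len))


def zero_dim_tensors(path):
    """Names of tensors whose shape is [] (scalars)."""
    header = read_safetensors_header(path)
    return sorted(
        name
        for name, info in header.items()
        if name != "__metadata__" and info.get("shape") == []
    )
```

For example, `zero_dim_tensors(r"..\models\sd_xl_offset_example-lora_1.0.safetensors")` would list any scalar entries in the file used above, such as `.alpha` values if they are stored without a dimension.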

grauho commented 6 months ago

In my opinion, it might be better to close this issue and make that new problem its own issue with a more descriptive name, so that other people having the same problem, or those with a solution, can find it more easily, as it does not seem to be related to the issue in the original post. That avoids readers of this issue never scrolling down far enough to see that someone is in fact having the same problem they are.

bssrdf commented 6 months ago

I am still seeing some LoRAs not being applied even with the fix from https://github.com/leejet/stable-diffusion.cpp/pull/200. The LoRA weight file is xl_more_art-full_v1.safetensors, a very popular one.

bin/sd -m ../models/sdxlUnstableDiffusers_v11.safetensors --vae ../models/sdxl_vae.safetensors --lora-model-dir ../models --stacked-id-embd-dir ../models/photomaker-v1.safetensors --input-id-images-dir examples/scarletthead_woman -p "a girl img, retro futurism, retro game art style but extremely beautiful, intricate details, masterpiece, best quality, space-themed, cosmic, celestial, stars, galaxies, nebulas, planets, science fiction, highly detailed <lora:xl_more_art-full_v1:0.5>" -n "realistic, photo-realistic, worst quality, greyscale, bad anatomy, bad hands, error, text" --cfg-scale 5.0 --sampling-method euler -H 1024 -W 1024 --normalize-input -o scarrlett11.png
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
[INFO ] stable-diffusion.cpp:171  - loading model from '../models/sdxlUnstableDiffusers_v11.safetensors'
[INFO ] model.cpp:726  - load ../models/sdxlUnstableDiffusers_v11.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:182  - loading vae from '../models/sdxl_vae.safetensors'
[INFO ] model.cpp:726  - load ../models/sdxl_vae.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:194  - Stable Diffusion XL
[INFO ] stable-diffusion.cpp:200  - Stable Diffusion weight type: f16
[INFO ] model.cpp:726  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] lora.hpp:39   - loading LoRA from '../models/photomaker-v1.safetensors'
[INFO ] stable-diffusion.cpp:281  - loading stacked ID embedding (PHOTOMAKER) model file from '../models/photomaker-v1.safetensors'
[INFO ] model.cpp:726  - load ../models/photomaker-v1.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:406  - total params memory size = 7182.38MB (VRAM 7182.38MB, RAM 0.00MB): clip 1564.36MB(VRAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 623.48MB(VRAM)
[INFO ] stable-diffusion.cpp:425  - loading model from '../models/sdxlUnstableDiffusers_v11.safetensors' completed, taking 1.42s
[INFO ] stable-diffusion.cpp:442  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:1578 - PhotoMaker loaded image from 'examples/scarletthead_woman/scarlett_0.jpg'
[INFO ] stable-diffusion.cpp:1578 - PhotoMaker loaded image from 'examples/scarletthead_woman/scarlett_1.jpg'
[INFO ] stable-diffusion.cpp:1578 - PhotoMaker loaded image from 'examples/scarletthead_woman/scarlett_2.jpg'
[INFO ] stable-diffusion.cpp:1578 - PhotoMaker loaded image from 'examples/scarletthead_woman/scarlett_3.jpg'
[INFO ] stable-diffusion.cpp:553  - Attempting to apply 1 LoRAs
[INFO ] model.cpp:726  - load ../models/xl_more_art-full_v1.safetensors using safetensors format
[INFO ] lora.hpp:39   - loading LoRA from '../models/xl_more_art-full_v1.safetensors'
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.alpha
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.lora_down.weight
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.lora_up.weight
[WARN ] lora.hpp:174  - Only (2361 / 2364) LoRA tensors have been applied
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.alpha
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.lora_down.weight
[WARN ] lora.hpp:165  - unused lora tensor lora.model_diffusion_model_output_blocks_2_2_conv.lora_up.weight
[WARN ] lora.hpp:174  - Only (2361 / 2364) LoRA tensors have been applied
[INFO ] stable-diffusion.cpp:530  - lora 'xl_more_art-full_v1' applied, taking 1.22s
[INFO ] stable-diffusion.cpp:1608 - apply_loras completed, taking 1.22s
[INFO ] stable-diffusion.cpp:1615 - pmid_lora apply completed, taking 0.05s
[INFO ] stable-diffusion.cpp:1679 - Photomaker ID Stacking, taking 129 ms
[INFO ] stable-diffusion.cpp:1688 - sampling steps increases from 20 to 50 for PHOTOMAKER
[INFO ] stable-diffusion.cpp:1719 - get_learned_condition completed, taking 100 ms
[INFO ] stable-diffusion.cpp:1735 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1739 - generating image: 1/1 - seed 42
[INFO ] stable-diffusion.cpp:1752 - PHOTOMAKER: start_merge_step: 10
  |==================================================| 50/50 - 2.09it/s
[INFO ] stable-diffusion.cpp:1776 - sampling completed, taking 24.25s
[INFO ] stable-diffusion.cpp:1784 - generating 1 latent images completed, taking 24.27s
[INFO ] stable-diffusion.cpp:1786 - decoding 1 latents
[INFO ] stable-diffusion.cpp:1796 - latent 1 decoded, taking 0.99s
[INFO ] stable-diffusion.cpp:1800 - decode_first_stage completed, taking 0.99s
[INFO ] stable-diffusion.cpp:1817 - txt2img completed in 25.36s
save result image to 'scarrlett11.png'

grauho commented 6 months ago

Have you verified that the corresponding tensor exists in the model you are using?

bssrdf commented 6 months ago

Have you verified that the corresponding tensor exists in the model you are using?

I used this model file a while ago and it had no issue, so unless the UNet has changed since then (unlikely), the tensor should be there. I am wondering if this is due to the change introduced with the PhotoMaker PR https://github.com/leejet/stable-diffusion.cpp/pull/179. @leejet did a nice job of consolidating vanilla LoRA and PhotoMaker LoRA.

grauho commented 6 months ago

I'm not familiar with anything to do with PhotoMaker, but I would recommend checking the model to make sure that the corresponding tensor is in fact present; just because it didn't warn you of this before doesn't mean it wasn't an issue.
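A check like the one suggested above can be done without loading any weights, by reading only the header of the checkpoint. The following is a minimal sketch: `find_module` is a hypothetical helper, and it assumes the checkpoint uses dot-separated tensor names (for example `model.diffusion_model.output_blocks...`), which it compares against the underscore-separated module name from the warning.

```python
import json
import struct


def tensor_names(path):
    """Tensor names listed in the header of a .safetensors file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len))
    return [name for name in header if name != "__metadata__"]


def find_module(path, lora_module):
    """Checkpoint tensors matching an underscore-style LoRA module name."""
    return sorted(
        name
        for name in tensor_names(path)
        if lora_module in name.replace(".", "_")
    )
```

Calling `find_module("../models/sdxlUnstableDiffusers_v11.safetensors", "output_blocks_2_2_conv")` and getting an empty list back would suggest the checkpoint has no layer under that name; a non-empty list would point back at the name matching in the loader.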

bssrdf commented 6 months ago

I found that commenting out these lines fixes my issue, but I assume it will not work for other models. https://github.com/leejet/stable-diffusion.cpp/blob/48bcce493f45a11d9d5a4c69943d03ff919d748f/lora.hpp#L93-L95

I am using:

grauho commented 6 months ago

That particular addition was not from my pull request and I would be curious to know the rationale behind it.

piallai commented 6 months ago

Fixed in the new release master-90e9178. Thanks!