leejet / stable-diffusion.cpp

Stable Diffusion and Flux in pure C/C++
MIT License

Getting always black image using XL models #167

Open FrankEscobar opened 7 months ago

FrankEscobar commented 7 months ago

I've tried with sd_xl_base_1.0.safetensors and sd_xl_turbo_1.0_fp16.safetensors, using CPU and GPU, trying f32, f16...

But I always get black images, even with a command line as simple as ./sd.exe --model sd_xl_turbo_1.0_fp16.safetensors --prompt "a dog" --width 512 --height 512 --steps 1 --output output0.png

I have no issues with other models like 1.5 or 2.1, thank you!

FSSRepo commented 7 months ago

Download this VAE, sdxl-vae-fix, and add the argument --vae sdxl_vae.safetensors
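Combined with the command from the original report, the suggestion might look like the sketch below. It only assembles and prints the argument list. Every flag comes from this thread; the helper name build_sd_command is hypothetical, and the sketch assumes the VAE file sits in the working directory under the name given above.

```python
import shlex

def build_sd_command(model, prompt, vae=None, width=512, height=512,
                     steps=1, output="output0.png", exe="./sd.exe"):
    """Assemble an sd command line using only flags quoted in this thread.

    build_sd_command is a hypothetical helper, not part of stable-diffusion.cpp.
    """
    cmd = [exe, "--model", model]
    if vae is not None:
        # Pass the external VAE file suggested above.
        cmd += ["--vae", vae]
    cmd += ["--prompt", prompt,
            "--width", str(width), "--height", str(height),
            "--steps", str(steps), "--output", output]
    return cmd

# Same settings as the original report, plus the external VAE.
cmd = build_sd_command("sd_xl_turbo_1.0_fp16.safetensors", "a dog",
                       vae="sdxl_vae.safetensors")
print(shlex.join(cmd))
```

To actually run it, the list can be handed to subprocess.run(cmd, check=True) from the directory that contains the executable and the model files.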

FrankEscobar commented 6 months ago

Great, thank you!

To be honest, I'm a bit lost about VAEs (emaonly, noema, and so on) and when you should use one or another.

And if you search for that on Google, most of what you find is confusing discussions on Reddit.

By the way, I expected a quality gain using XL, but the results are pretty unrealistic. I also tried a LoRA, but it didn't work for me (LoRAs do work for me with 1.5/2.1).

This is the warning shown:

STD OUT
[INFO ] stable-diffusion.cpp:141  - loading model from 'sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:645  - load sd_xl_base_1.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:152  - loading vae from 'sdxl_vae.safetensors'
[INFO ] model.cpp:645  - load sdxl_vae.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:169  - Stable Diffusion XL 
[INFO ] stable-diffusion.cpp:175  - Stable Diffusion weight type: f16
[INFO ] stable-diffusion.cpp:276  - total memory buffer size = 6570.67MB (clip 1568.77MB, unet 4903.43MB, vae 98.47MB)
[INFO ] stable-diffusion.cpp:278  - loading model from 'sd_xl_base_1.0.safetensors' completed, taking 6.21s
[INFO ] stable-diffusion.cpp:292  - running in eps-prediction mode
[INFO ] model.cpp:645  - load Dressed animals XL.safetensors using safetensors format
[INFO ] lora.hpp:35   - loading LoRA from 'Dressed animals XL.safetensors'
[INFO ] stable-diffusion.cpp:405  - lora 'Dressed animals XL' applied, taking 1.15s
[INFO ] stable-diffusion.cpp:1233 - apply_loras completed, taking 1.15s
[INFO ] stable-diffusion.cpp:1272 - get_learned_condition completed, taking 389 ms
[INFO ] stable-diffusion.cpp:1288 - sampling using modified DPM++ (2M) method
[INFO ] stable-diffusion.cpp:1292 - generating image: 1/1 - seed 870920193
  |==================================================| 50/50 - 1.93it/s
[INFO ] stable-diffusion.cpp:1304 - sampling completed, taking 26.97s
[INFO ] stable-diffusion.cpp:1312 - generating 1 latent images completed, taking 27.03s
[INFO ] stable-diffusion.cpp:1314 - decoding 1 latents
[INFO ] stable-diffusion.cpp:1324 - latent 1 decoded, taking 0.68s
[INFO ] stable-diffusion.cpp:1328 - decode_first_stage completed, taking 0.68s
[INFO ] stable-diffusion.cpp:1347 - txt2img completed in 28.10s

STD ERROR
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4070, compute capability 8.9, VMM: yes
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_down.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_up.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.lora_down.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.lora_up.weight
....

Regards.
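The "unused lora tensor" warnings in the STD ERROR output can be tallied to see which part of the LoRA was skipped. The sketch below is a hypothetical helper that parses lines in the format quoted in the log and groups tensor names by the prefix after "lora." (te1 in every line shown; presumably a text-encoder component judging by the name, which is an assumption). The elided lines ("....") are not known, so the sample uses only lines from the log.

```python
import re
from collections import Counter

# Matches warning lines in the format quoted in the log above.
UNUSED_RE = re.compile(
    r"\[WARN \] lora\.hpp:\d+\s+- unused lora tensor (\S+)")

def summarize_unused(stderr_text):
    """Count unused LoRA tensors by component prefix (e.g. 'te1').

    summarize_unused is a hypothetical helper, not part of stable-diffusion.cpp.
    """
    counts = Counter()
    for line in stderr_text.splitlines():
        m = UNUSED_RE.search(line)
        if m:
            name = m.group(1)  # e.g. lora.te1_text_model_encoder_...
            # Component = text between "lora." and the first underscore.
            component = name.partition(".")[2].split("_", 1)[0]
            counts[component] += 1
    return counts

# Sample lines copied from the log in this post.
sample = """\
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_down.weight
[INFO ] stable-diffusion.cpp:1347 - txt2img completed in 28.10s
"""
print(summarize_unused(sample))
```

Running this over the full stderr capture would show whether only te1 tensors went unused or whether other components were skipped as well.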