leejet / stable-diffusion.cpp

Stable Diffusion and Flux in pure C/C++
MIT License

T5xxl gguf support #369

Open KintCark opened 2 months ago

KintCark commented 2 months ago

I tried using a t5xxl q3 gguf but it's not supported. Please add t5xxl gguf support.

Also, does a flux gguf get quantized again? I tried loading one but it said it's an fp16 model, and I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep requantizing the models on every run, because flux takes 30 minutes to quantize.

SkutteOleg commented 2 months ago

I tried using a t5xxl q3 gguf but it's not supported. Please add t5xxl gguf support.

It is supported, try the quants made specifically for sd.cpp: https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/tree/main
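For reference, loading one of those prequantized files is just a matter of pointing the t5xxl flag at the gguf. A minimal sketch, assuming current flag names and hypothetical file names (check docs/flux.md in the repo for the exact invocation on your build):

    # hypothetical paths; substitute your own model files
    ./sd --diffusion-model flux1-schnell-q4_0.gguf \
         --t5xxl t5xxl_q3_k.gguf \
         --clip_l clip_l.safetensors \
         --vae ae.safetensors \
         --cfg-scale 1.0 --sampling-method euler \
         -p "a photo of a cat"

Note that no --type is needed here, since the weights are already quantized in the gguf files.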

Green-Sky commented 2 months ago

@pilot5657 what are those binaries supposed to "fix"?

phudtran commented 2 months ago

@pilot5657 what are those binaries supposed to "fix"?

They're bots. I saw the same message from another account in an unrelated thread.

offbeat-stuff commented 2 months ago

Can a moderator please remove these links? The programs are very likely viruses.

grauho commented 2 months ago

I tried using a t5xxl q3 gguf but it's not supported. Please add t5xxl gguf support.

Also, does a flux gguf get quantized again? I tried loading one but it said it's an fp16 model, and I couldn't load it until I added --type q2_k. Please also make it so we don't have to keep requantizing the models on every run, because flux takes 30 minutes to quantize.

You can use the 'convert' mode to quantize your model once and save it out as a .gguf file that you can then load directly, instead of passing --type to requantize at load time at the start of every run. The documentation has an example of how this works: https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/quantization_and_gguf.md
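As a rough sketch of that workflow, with hypothetical file names and quant type (see the linked doc for the exact syntax):

    # one-time conversion: quantize and save to gguf
    ./sd -M convert -m flux1-schnell.safetensors --type q4_0 -o flux1-schnell-q4_0.gguf

    # later runs load the already-quantized file; no --type, no 30-minute wait
    ./sd --diffusion-model flux1-schnell-q4_0.gguf -p "a photo of a cat"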