LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

A1111 local SD1.5 and SDXL fp16 compatible models #747

Closed CorentinWicht closed 6 months ago

CorentinWicht commented 6 months ago

Dear Developer,

I am trying to configure local image generation with the A1111-compatible endpoint, and I am having a really hard time finding compatible models. As specified in your wiki, any "[...] compatible SD1.5 or SDXL .safetensors fp16 model [...]" should work, but from what I have tested so far this does not seem to be the case.

Here are the models I have tested and which failed:

  1. sdxl-turbo/diffusion_pytorch_model.fp16.safetensors
  2. madebyollin/sdxl-vae-fp16-fix
  3. Stable Diffusion v1.5 [bf16/fp16] [no-ema/ema-only] [no-vae] [SafeTensors] [Checkpoint]

The command I am running is:

```
/usr/bin/python /opt/koboldcpp-latest/koboldcpp.py /home/frillm/Models/llama-2-70b-chat.Q5_K_M.gguf 8008 --multiuser 20 --highpriority --usecublas --gpulayers 60 --nommap --preloadstory /opt/settings.json --sdconfig /opt/A1111/sdXL_v10VAEFix.safetensors normal
```

The error returned is:

```
python[8787]: ImageGen Init - Load Model: /opt/A1111/diffusion_pytorch_model.fp16.safetensors
python[8787]: Error: KCPP SD Failed to create context!
```

The following model loads correctly but generates black images only: sdXL_v10VAEFix.safetensors

The only one that is currently working is hollowstrawberry/sd-v1-5-pruned-noema-fp16.safetensors.

I am pretty new to this, so I may have missed something?

Best wishes,

C.

LostRuins commented 6 months ago

Hey there, thanks for pointing it out.

Okay, I think I need to update the wiki. The models need to be standalone, with a VAE baked in. Unfortunately it's quite hard to tell which works just by looking at the file, so a bit of trial and error might be needed; the sketch after the list below shows one rough way to check.

  1. The first model you linked is an incomplete model - you'll notice it's only the unet part of the model and does not include the tokenizer or VAE.

  2. The second link is just a VAE only, with no model associated! You'll notice it's only ~300MB; a proper model should be at least 2GB.

  3. The third link I didn't try, but it appears not to have a VAE inside either.
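
If you want to sanity-check a file before trying it, here is one rough way - a minimal sketch assuming the `safetensors` Python package and the conventional single-file SD checkpoint key prefixes (this is not something koboldcpp itself does):

```python
# Minimal sketch: list tensor key prefixes in a .safetensors file to see
# whether it looks like a full standalone checkpoint or just one component.
# Requires `pip install safetensors`. The prefixes below are the usual ones
# for single-file SD1.5/SDXL checkpoints and may not cover every layout.
from safetensors import safe_open

path = "sdXL_v10VAEFix.safetensors"  # illustrative path

with safe_open(path, framework="pt", device="cpu") as f:
    keys = list(f.keys())

has_unet = any(k.startswith("model.diffusion_model.") for k in keys)
has_vae = any(k.startswith("first_stage_model.") for k in keys)
has_text = any(k.startswith(("cond_stage_model.", "conditioner.")) for k in keys)
print(f"unet={has_unet} vae={has_vae} text_encoder={has_text}")
# A standalone model should report True for all three; component-only or
# diffusers-layout files (like points 1 and 2 above) will not.
```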


I should have included good examples in the wiki, these should all work:

For the original Stable Diffusion 1.5 model, try https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors - I tested it and it works.

And here's a good anime-styled model, which I've also tested (and it works): https://huggingface.co/admruul/anything-v3.0/resolve/main/Anything-V3.0-pruned-fp16.safetensors

Could you try these and let me know if they work for you? I've updated my wiki for clarity too. Thanks!

LostRuins commented 6 months ago

Also, just a heads up: it's probably best to run --sdconfig with the clamped and quant flags to limit the max resolution if it's a shared server, since you don't want someone making a request for a massive image that makes the server run out of memory and crash. clamped will cap the max image resolution at 512x512.

CorentinWicht commented 6 months ago

> Okay, I think I need to update the wiki. The models need to be standalone, with a VAE baked in. […] Could you try these and let me know if they work for you? I've updated my wiki for clarity too. Thanks!

Thank you very much for your quick reply.

Part of the problem also came from my very limited understanding of these concepts (VAE, unet, etc.).

I can confirm that the provided original Stable Diffusion 1.5 model works perfectly, thanks.

What about SDXL models? Would you have a suggestion?

> Also, just a heads up: it's probably best to run --sdconfig with the clamped and quant flags to limit the max resolution if it's a shared server […]

Thanks for the tips! `--sdconfig /opt/A1111/v1-5-pruned-emaonly.safetensors clamped` and `--sdconfig /opt/A1111/v1-5-pruned-emaonly.safetensors quant` each work separately, but I ran into an error when I tried to combine them as `--sdconfig /opt/A1111/v1-5-pruned-emaonly.safetensors clamped quant`:

```
python[22286]: Traceback (most recent call last):
python[22286]:   File "/opt/koboldcpp-latest/koboldcpp.py", line 3070, in <module>
python[22286]:     main(parser.parse_args(),start_server=True)
python[22286]:   File "/opt/koboldcpp-latest/koboldcpp.py", line 2850, in main
python[22286]:     loadok = sd_load_model(imgmodel)
python[22286]:   File "/opt/koboldcpp-latest/koboldcpp.py", line 514, in sd_load_model
python[22286]:     sdt = int(args.sdconfig[2])
python[22286]: ValueError: invalid literal for int() with base 10: 'quant'
```

CorentinWicht commented 6 months ago

I actually just found a working SDXL model, but it's much more resource-heavy than the SD 1.5 one you provided: https://civitai.com/models/117188?modelVersionId=129581

Would it be possible to load two models and then switch from one to the other in Lite? [screenshot]

LostRuins commented 6 months ago

Nope, right now a single server instance only allows loading one image model - unlike the real A1111, it is not possible to swap them on the fly.

Did you load the models with the quant flag on? That helps save memory (uses 3x less!)

If you really need to host both, you will need to run two separate koboldcpp instances (on different ports).
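
For example, something along these lines (model paths and ports are illustrative, and the `--port` flag is assumed from koboldcpp's usual options):

```
python koboldcpp.py model-a.gguf --port 5001 --sdconfig sd15-model.safetensors clamped 4 quant
python koboldcpp.py model-b.gguf --port 5002 --sdconfig sdxl-model.safetensors clamped 4 quant
```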

Your command is missing a threads parameter - the sdconfig command takes positional arguments. The correct format should be something like `--sdconfig myfile.gguf clamped 4 quant`.
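
That is exactly what your traceback shows - here is a minimal reconstruction of the failure (illustrative, not the actual koboldcpp source):

```python
# --sdconfig values are positional, and slot 2 is expected to be an
# integer thread count, e.g. ["model.safetensors", "clamped", "4", "quant"].
sdconfig = ["v1-5-pruned-emaonly.safetensors", "clamped", "quant"]
sdt = int(sdconfig[2])  # int("quant") raises ValueError, as in the traceback
```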

With quant enabled correctly, the SDXL model should work better (it may take longer to load).
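
For instance, adapting your earlier command (the 4 is the threads value; adjust it to your setup):

```
--sdconfig /opt/A1111/v1-5-pruned-emaonly.safetensors clamped 4 quant
```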

CorentinWicht commented 6 months ago

> Did you load the models with the quant flag on? […] Your command is missing a threads parameter - the sdconfig command takes positional arguments.

Good to know, thanks for the tips!

Alright, sure, I forgot the threads argument; thanks for pointing it out.

Best wishes,

C.

maxwell-kalin commented 4 months ago

I seem to get black images with any model provided in the wiki.

maxwell-kalin commented 4 months ago

[screenshot: the generated image is completely black]

Amin456789 commented 4 months ago

Use SDXL Turbo, choose the fix VAE option, and compress with quant; it works this way.

maxwell-kalin commented 4 months ago

I want Anything 3.0, to be honest.

LostRuins commented 4 months ago

Yes, that will work. Just enable the "fix bad VAE" checkbox, or replace the VAE with a different one.

maxwell-kalin commented 4 months ago

Well, it's only giving me a black image (see above).

LostRuins commented 4 months ago

Ah, okay, I think that was a bug in 1.66; you can try 1.65 for now, which should work. It will be fixed in 1.67, which is going to be released by the end of this week.

Alternatively, you can try a pre-release build of 1.67 here: https://github.com/LostRuins/koboldcpp/actions/runs/9351708061/artifacts/1563052074

LostRuins commented 4 months ago

Let me know if that works

maxwell-kalin commented 4 months ago

Can you link me the Linux no-CUDA build? I don't use Windoze.

BTW, SDXL works way better in my case; I currently use this: https://civitai.com/models/9409/or-anything-xl

LostRuins commented 4 months ago

Hi, please try the latest version, which is now released:

https://github.com/LostRuins/koboldcpp/releases/latest

maxwell-kalin commented 4 months ago

Works far more consistently, thanks!