turboderp / exui

Web UI for ExLlamaV2

Model download URLs? #59

Open SoftologyPro opened 1 month ago

SoftologyPro commented 1 month ago

I see this in the readme: "Supports EXL2, GPTQ and FP16 models", but there are no links to the models themselves. Can you give me the HF URLs for the recommended models? Or the models you think are "best" for use with ExUI (with a 24 GB VRAM GPU)? Thanks.

turboderp commented 1 month ago

I have a bunch of EXL2 models on HF here. Other than that, you can simply search for "EXL2" or "GPTQ" on HF and you should get lots of results. The FP16 format is just the standard HF format; supported architectures should load without any quantization if you have enough VRAM.
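For anyone unsure how to fetch one of these, here is a minimal sketch using `huggingface_hub` (the repo id, branch name, and local directory below are hypothetical; EXL2 uploads often publish each bpw as a separate branch, so check the model card for what's actually available):

```python
# Minimal sketch, not from this thread: download an EXL2 quant from Hugging Face.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="turboderp/SomeModel-exl2",   # hypothetical EXL2 repo id
    revision="4.0bpw",                    # hypothetical per-bpw branch name
    local_dir="models/SomeModel-4.0bpw",  # directory to point ExUI's model loader at
)
print(local_path)
```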

Which model to choose depends on what you're trying to do. Gemma2-27B-it is a very strong model that should work well on a 24 GB GPU at about 4-5 bpw.
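As a rough sanity check (my own back-of-envelope sketch, assuming weights dominate VRAM and the KV cache and activations add a few GB on top), quantized weight size is roughly parameters × bits-per-weight / 8:

```python
# Rough weight-size estimate in GiB: params * bpw / 8 bytes.
def weight_gb(params_billions: float, bpw: float) -> float:
    return params_billions * 1e9 * bpw / 8 / 1024**3

# Gemma2-27B at ~4.5 bpw -> ~14 GiB of weights, leaving headroom
# on a 24 GB card for cache and activations.
print(f"{weight_gb(27, 4.5):.1f} GiB")  # ~14.1 GiB
```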