turboderp / exui

Web UI for ExLlamaV2
MIT License

Model download URLs? #59

Open SoftologyPro opened 4 months ago

SoftologyPro commented 4 months ago

I see this in the readme: "Supports EXL2, GPTQ and FP16 models", but there are no links to the models themselves. Can you give me the HF URLs for those recommended models? Or the models you think are "best" for use with ExUI (on a 24 GB VRAM GPU)? Thanks.

turboderp commented 4 months ago

I have a bunch of EXL2 models on HF here. Other than that, you can simply search for "EXL2" or "GPTQ" on HF and you should get lots of results. The FP16 format is just the standard HF format; supported architectures should load without any quantization if you have enough VRAM.

Which model to choose depends on what you're trying to do. Gemma2-27B-it is a very strong model that should work well on a 24 GB GPU at about 4-5 bpw.
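For anyone finding this later: once you've picked a repo on HF, a quick way to fetch it locally is `huggingface_hub`. A minimal sketch, assuming the package is installed; the `repo_id` and `revision` below are illustrative only (many EXL2 repos keep each bitrate on its own branch, e.g. `4.0bpw`, so check the model page for the actual branch names):

```python
from typing import Optional

from huggingface_hub import snapshot_download


def download_model(repo_id: str, revision: str = "main",
                   local_dir: Optional[str] = None) -> str:
    """Download a full model snapshot from the HF Hub and return the local path."""
    return snapshot_download(repo_id=repo_id, revision=revision,
                             local_dir=local_dir)


if __name__ == "__main__":
    # Hypothetical example -- substitute a real repo and branch from HF.
    path = download_model("turboderp/some-model-exl2", revision="4.0bpw")
    print(path)
```

The downloaded directory can then be pointed to from ExUI's model loader like any local model folder.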