TheBlokeAI / dockerLLM

TheBloke's Dockerfiles
MIT License

GPTQ-for-LLaMa support #4

Closed by virgulvirgul 1 year ago

virgulvirgul commented 1 year ago

Do you have any plans to add GPTQ-for-LLaMa?

TheBloke commented 1 year ago

I hadn't planned to, no. It'd be quite easy to do so - does it do something that AutoGPTQ or ExLlama doesn't? I haven't used GPTQ-for-LLaMa myself in months as I use ExLlama for 4-bit Llama models, and AutoGPTQ otherwise.

virgulvirgul commented 1 year ago

It provides a 4-bit quantization solution for the "FLAN-T5" series and UL2 as well. Do you know of any other solution that does that? I don't have much experience with ExLlama.
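
For readers unfamiliar with the "4-bit quantization" being discussed, here is a minimal sketch of plain round-to-nearest 4-bit weight quantization. This is not GPTQ's actual algorithm (GPTQ additionally compensates quantization error using approximate second-order information), only an illustration of the storage format such tools target: each weight becomes a 4-bit integer plus a shared scale and zero-point.

```python
# Hypothetical illustration, NOT the GPTQ-for-LLaMa implementation:
# round-to-nearest 4-bit quantization of a group of weights.

def quantize_4bit(weights):
    """Map floats to 4-bit integers (0..15) with one shared scale and zero-point."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15 or 1.0   # 4 bits -> 16 quantization levels
    zero = w_min
    q = [round((w - zero) / scale) for w in weights]
    return q, scale, zero

def dequantize_4bit(q, scale, zero):
    """Recover approximate float weights from the 4-bit codes."""
    return [qi * scale + zero for qi in q]

weights = [-0.51, -0.2, 0.0, 0.13, 0.48, 0.75]
q, scale, zero = quantize_4bit(weights)
approx = dequantize_4bit(q, scale, zero)
```

The reconstruction error of this naive scheme is bounded by half the scale per weight; GPTQ's contribution is reducing the impact of that error on model outputs, and the question above is which frameworks ship that machinery for encoder-decoder models like FLAN-T5.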