InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
https://xtuner.readthedocs.io/zh-cn/latest/
Apache License 2.0

llava-llama-3-8b-v1_1-hf: how to quantize with AWQ? #699

goodnight654 commented 5 months ago

I have never been able to correctly quantize llava-llama3 with AWQ in the official LLaVA format. Can anyone help me?

pppppM commented 5 months ago

AutoAWQ appears to support quantization for LLaVA models; have you tried it?
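
Untested on my side, but the standard AutoAWQ workflow would look roughly like the sketch below. The `quant_config` values are AutoAWQ's commonly used 4-bit settings, and whether AutoAWQ's LLaVA support handles this particular checkpoint end-to-end is an assumption, not something verified in this thread:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "xtuner/llava-llama-3-8b-v1_1-hf"   # HF-format LLaVA checkpoint
quant_path = "llava-llama-3-8b-v1_1-hf-awq"      # output directory (your choice)

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# AutoAWQ's commonly used 4-bit configuration.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Run AWQ calibration/quantization, then save the quantized weights.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```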

We are developing a quantized version of the VL model in lmdeploy, but it won't be released until next week.
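
For reference, lmdeploy's existing AWQ entry point for plain LLMs is the CLI command `lmdeploy lite auto_awq <model_path> --work-dir <output_dir>`; whether the upcoming VL support will reuse this command is an assumption, so watch the lmdeploy release notes.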