vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0
25.25k stars 3.64k forks

baichuan/qwen/chatglm with LoRA adaptation [feature] #3458

Open kexuedaishu opened 5 months ago

kexuedaishu commented 5 months ago

🚀 The feature, motivation and pitch

It would be helpful if popular Chinese LLMs such as Baichuan/Baichuan2/Qwen/ChatGLM could be supported with LoRA adaptation.

Alternatives

No response

Additional context

No response

jeejeelee commented 5 months ago

I have submitted a PR for this feature request; see #3382. I have tested it with the ChatGLM-3 and Baichuan-7B models. I hope it helps.
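For readers who want to try the LoRA support discussed above, the sketch below shows how a LoRA adapter is typically attached to a request with vLLM's offline `LLM` API. The base model name, adapter name, and adapter path are placeholders, not values from this thread; this assumes the multi-LoRA support from #3382 is available in your build. Running it requires a GPU and the model weights.

```python
# Sketch: serving a Chinese base model with a LoRA adapter in vLLM.
# Model name and adapter path are illustrative placeholders.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="THUDM/chatglm3-6b",  # base model; ChatGLM needs trust_remote_code
    trust_remote_code=True,
    enable_lora=True,           # enable LoRA adapter support in the engine
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

# A LoRARequest routes this generation through a specific adapter,
# identified by a name, an integer ID, and the adapter's local path.
outputs = llm.generate(
    ["你好，请介绍一下你自己。"],
    sampling_params,
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```

The same adapter can also be exposed through the OpenAI-compatible server by starting it with `--enable-lora` and a `--lora-modules name=path` mapping, so clients select the adapter by model name per request.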