
[Usage]: Model Qwen2ForCausalLM does not support LoRA, but LoRA is enabled. Support for this model may be added in the future. If this is important to you, please open an issue on github #3709

jcxcer opened this issue 3 months ago

jcxcer commented 3 months ago

Your current environment

The output of `python collect_env.py`

How would you like to use vllm

I want to run inference of a Qwen2 model (Qwen2ForCausalLM) together with a LoRA adapter, but when LoRA is enabled I get the error in the title. I don't know how to make this work with vllm.
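
(For context, this is roughly what enabling LoRA in vLLM's offline API looks like; on vllm 0.3.3 the engine construction below is where Qwen2 checkpoints hit the error in the title. The model name and adapter path are placeholders, not taken from this issue.)

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Constructing the engine with LoRA enabled is where vllm 0.3.3 raises
# "Model Qwen2ForCausalLM does not support LoRA, but LoRA is enabled."
# for Qwen2-architecture checkpoints (the model name here is an example).
llm = LLM(model="Qwen/Qwen1.5-7B-Chat", enable_lora=True)

sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

# "/path/to/lora/adapter" is a placeholder for a local LoRA checkpoint.
outputs = llm.generate(
    ["Give me a short introduction to large language models."],
    sampling_params,
    lora_request=LoRARequest("qwen-lora", 1, "/path/to/lora/adapter"),
)
print(outputs[0].outputs[0].text)
```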

jcxcer commented 3 months ago

vllm==0.3.3

jeejeelee commented 3 months ago

Refer to: https://github.com/vllm-project/vllm/issues/3543
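
For anyone landing here: issue #3543 discusses adding LoRA support for Qwen2. As I understand the 0.3.x codebase, the check behind this error looks for LoRA metadata on the model class, so the eventual fix amounts to the Qwen2 implementation opting in the way LlamaForCausalLM does. A rough sketch of that pattern (attribute names from the 0.3.x code, Qwen2 module names assumed, not the actual merged patch):

```python
import torch.nn as nn

# Rough sketch of the LoRA opt-in pattern, mirroring LlamaForCausalLM in
# the 0.3.x codebase; NOT the actual patch merged for Qwen2.
class Qwen2ForCausalLM(nn.Module):
    # LoRA adapters may address the fused projections by their original
    # (unfused) sub-module names.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    # Modules that vLLM is allowed to wrap with LoRA layers.
    supported_lora_modules = [
        "qkv_proj", "o_proj", "gate_up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ]
    embedding_modules = {
        "embed_tokens": "input_embeddings",
        "lm_head": "output_embeddings",
    }
    embedding_padding_modules = ["lm_head"]

    def __init__(self, config, linear_method=None, lora_config=None):
        super().__init__()
        # ... rest of the model definition unchanged ...
```

Once that support is merged, upgrading vLLM should make `enable_lora=True` work for Qwen2 checkpoints without patching 0.3.3 locally.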