vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

ValueError: Model QWenLMHeadModel does not support LoRA, but LoRA is enabled. Support for this model may be added in the future. If this is important to you, please open an issue on github. #3199

Open qingjiaozyn opened 4 months ago

qingjiaozyn commented 4 months ago

sql_lora_path = "/home/zyn/models/slot_lora_gd"

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="/home/models/dem_14b/base", enable_lora=True, trust_remote_code=True)

sampling_params = SamplingParams(temperature=0, max_tokens=256, stop=["[/assistant]"])

prompts = [
    "[user] Write a SQL query to answer the question based on the table schema.\n\n context: CREATE TABLE table_name_74 (icao VARCHAR, airport VARCHAR)\n\n question: Name the ICAO for lilongwe international airport [/user] [assistant]",
    "[user] Write a SQL query to answer the question based on the table schema.\n\n context: CREATE TABLE table_name_11 (nationality VARCHAR, elector VARCHAR)\n\n question: When Anchero Pantaleone was the elector what is under nationality? [/user] [assistant]",
]

outputs = llm.generate(prompts, sampling_params, lora_request=LoRARequest("sql_adapter", 1, sql_lora_path))

    llm = LLM(model="/home/models/dem_14b/base",
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 109, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 391, in from_engine_args
    engine = cls(*engine_configs,
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 128, in __init__
    self._init_workers()
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 181, in _init_workers
    self._run_workers("load_model")
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 1041, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/worker/worker.py", line 100, in load_model
    self.model_runner.load_model()
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 88, in load_model
    self.model = get_model(self.model_config,
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/model_executor/utils.py", line 52, in get_model
    return get_model_fn(model_config, device_config, **kwargs)
  File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/model_executor/model_loader.py", line 73, in get_model
    raise ValueError(
ValueError: Model QWenLMHeadModel does not support LoRA, but LoRA is enabled. Support for this model may be added in the future. If this is important to you, please open an issue on github.
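Editor's note: the script above follows the same call pattern as vLLM's documented LoRA example, so the failure appears to be specific to the QWenLMHeadModel architecture rather than to how the API is used. For comparison, a minimal sketch adapted from the vLLM LoRA documentation, using that example's Llama-2 base model and yard1/llama-2-7b-sql-lora-test adapter (not the reporter's setup), which loads because LlamaForCausalLM has LoRA support:

from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Download the example SQL adapter used in the vLLM docs.
sql_lora_path = snapshot_download(repo_id="yard1/llama-2-7b-sql-lora-test")

# LlamaForCausalLM is an architecture with LoRA support, so enable_lora works here.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)
sampling_params = SamplingParams(temperature=0, max_tokens=256, stop=["[/assistant]"])

# Same style of SQL prompts as in the script above.
prompts = ["[user] Write a SQL query to answer the question based on the table schema. [/user] [assistant]"]

outputs = llm.generate(prompts, sampling_params,
                       lora_request=LoRARequest("sql_adapter", 1, sql_lora_path))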

BillFang12 commented 4 months ago

I also ran into the same situation. When will Qwen get LoRA support?
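Editor's note: until QWenLMHeadModel gains LoRA support in vLLM, one common workaround (not proposed in this thread) is to merge the adapter into the base weights offline with PEFT and serve the merged checkpoint without enable_lora. A minimal sketch, reusing the reporter's paths; the merged output directory is hypothetical:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/home/models/dem_14b/base"          # Qwen base model from the report
adapter_path = "/home/zyn/models/slot_lora_gd"   # LoRA adapter from the report
merged_path = "/home/models/dem_14b/merged"      # hypothetical output directory

# Load the base model and apply the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(base_path, trust_remote_code=True)
model = PeftModel.from_pretrained(base, adapter_path)

# Fold the LoRA deltas into the base weights and save a plain checkpoint.
model = model.merge_and_unload()
model.save_pretrained(merged_path)
AutoTokenizer.from_pretrained(base_path, trust_remote_code=True).save_pretrained(merged_path)

# The merged checkpoint can then be served by vLLM without enable_lora:
# llm = LLM(model=merged_path, trust_remote_code=True)

The trade-off is that the merged weights are fixed to one adapter, so this does not replace true multi-LoRA serving, but it avoids the ValueError for architectures without LoRA support.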