Open qingjiaozyn opened 8 months ago
As shown in the error message, LoRA is not supported for the Qwen model. When will this be resolved, and is there a plan for it?
Hey, I have run into the same problem. When will loading Qwen LoRA models be supported? I would appreciate it if you could let me know once this is resolved; it is blocking me at the moment. Thanks.
I also encountered the same situation. When will Qwen support LoRA models?
Please enable support for this. @vllm team
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
I use multi-LoRA for offline inference:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

sql_lora_path = "/home/zyn/models/slot_lora_gd"

llm = LLM(model="/home/models/dem_14b/base", enable_lora=True, trust_remote_code=True)

sampling_params = SamplingParams(temperature=0, max_tokens=256, stop=["[/assistant]"])

prompts = [
    "[user] Write a SQL query to answer the question based on the table schema.\n\n context: CREATE TABLE table_name_74 (icao VARCHAR, airport VARCHAR)\n\n question: Name the ICAO for lilongwe international airport [/user] [assistant]",
    "[user] Write a SQL query to answer the question based on the table schema.\n\n context: CREATE TABLE table_name_11 (nationality VARCHAR, elector VARCHAR)\n\n question: When Anchero Pantaleone was the elector what is under nationality? [/user] [assistant]",
]

outputs = llm.generate(prompts, sampling_params, lora_request=LoRARequest("sql_adapter", 1, sql_lora_path))
```
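For completeness, a small hypothetical follow-up (not part of the original report) showing how the results would be read back once `generate` succeeds; it only uses the standard `RequestOutput` fields:

```python
# Hypothetical follow-up (not in the original report): inspect the results
# once generate() succeeds. Each RequestOutput carries the prompt and the
# generated completions.
for output in outputs:
    print("prompt:", output.prompt)
    print("completion:", output.outputs[0].text)
```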
```text
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 109, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 391, in from_engine_args
    engine = cls(*engine_configs,
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 128, in __init__
    self._init_workers()
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 181, in _init_workers
    self._run_workers("load_model")
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 1041, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/worker/worker.py", line 100, in load_model
    self.model_runner.load_model()
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 88, in load_model
    self.model = get_model(self.model_config,
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/model_executor/utils.py", line 52, in get_model
    return get_model_fn(model_config, device_config, **kwargs)
File "/root/miniconda3/envs/qwen/lib/python3.10/site-packages/vllm/model_executor/model_loader.py", line 73, in get_model
    raise ValueError(
ValueError: Model QWenLMHeadModel does not support LoRA, but LoRA is enabled. Support for this model may be added in the future. If this is important to you, please open an issue on github.
```
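As the traceback shows, the check fires while the engine is being constructed, so the failure can at least be detected cleanly. Below is a minimal sketch of a guard, assuming only the public `LLM` constructor; the fallback path is illustrative, not an official vLLM workaround, and it does not make the adapter usable:

```python
# Minimal sketch (assumption: LoRA support is validated at engine construction
# time, as in the traceback above). Catch the ValueError and rebuild the engine
# without LoRA so the rest of the pipeline can still run on the base model.
from vllm import LLM

model_path = "/home/models/dem_14b/base"
try:
    llm = LLM(model=model_path, enable_lora=True, trust_remote_code=True)
except ValueError as err:
    # Raised for architectures without LoRA support, e.g. QWenLMHeadModel here.
    print(f"LoRA is not supported for this model: {err}")
    llm = LLM(model=model_path, trust_remote_code=True)
```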