xiaobo-Chen opened this issue 6 months ago
The version of vLLM I used is the latest version, which is 0.3.3.
Qwen2 supports LoRA; this error is raised by lora_config. You can check the architecture of your local Qwen checkpoint.
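For example, a quick way to check which architecture a local checkpoint reports is to read its config.json (a minimal sketch, assuming a standard Hugging Face model directory; the path is a placeholder):

```python
import json
from pathlib import Path

# Placeholder path -- point this at your local Qwen1.5 checkpoint directory.
model_dir = Path("/home/T3090U1/CZ/model/Qwen1.5-7B-Chat")

config = json.loads((model_dir / "config.json").read_text())

# vLLM selects its model implementation (and decides whether LoRA is
# supported) from this architecture name, e.g. "Qwen2ForCausalLM".
print(config["architectures"])
```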
In version 0.3.3, Qwen2 indeed does not support LoRA.
The documentation by default points to the main branch. In the upcoming release, or if you build from source, you can use Qwen2 with LoRA.
Can anyone show an example of Qwen1.5 inference with LoRA using vLLM?
Please refer to the multilora_inference example.
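For reference, here is a minimal offline-inference sketch in the spirit of that example. The model and adapter paths and the adapter name are placeholders, and it assumes a vLLM build (main branch or a later release) whose Qwen2 implementation supports LoRA:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholder paths -- point these at your base model and LoRA adapter.
BASE_MODEL = "/path/to/Qwen1.5-7B-Chat"
LORA_PATH = "/path/to/sql-lora-adapter"

# enable_lora must be set so the engine reserves LoRA slots.
llm = LLM(model=BASE_MODEL, enable_lora=True)

sampling_params = SamplingParams(temperature=0.0, max_tokens=128)

# LoRARequest takes a human-readable adapter name, an integer adapter id,
# and the local path of the adapter weights.
outputs = llm.generate(
    ["Write a SQL query that counts users by country."],
    sampling_params,
    lora_request=LoRARequest("sql-lora", 1, LORA_PATH),
)

for output in outputs:
    print(output.outputs[0].text)
```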
@simon-mo Hi, in the main branch chatglm3 and baichuan now support LoRA, but the documentation doesn't show this feature yet. How can it be added to the documentation?
Please send a PR editing this file: https://github.com/vllm-project/vllm/blob/main/docs/source/models/supported_models.rst
Your current environment
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.27

Python version: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-150-generic-x86_64-with-glibc2.27
Is CUDA available: True
CUDA runtime version: 11.4.120
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 3090
GPU 1: NVIDIA GeForce RTX 3090
Nvidia driver version: 535.113.01
How would you like to use vllm
I tried to run the following command:
```
python -m vllm.entrypoints.openai.api_server \
    --model /home/T3090U1/CZ/model/Qwen1.5-7B-Chat/ \
    --enable-lora \
    --lora-modules sql-lora=/home/T3090U1/CZ/model/output_sft_qwen_0320/
```
Error message:
ValueError: Model Qwen2ForCausalLM does not support LoRA, but LoRA is enabled. Support for this model may be added in the future.
However, the vLLM documentation at https://docs.vllm.ai/en/latest/models/supported_models.html says that Qwen2ForCausalLM supports LoRA.
Did I use the wrong command, or is LoRA for Qwen2ForCausalLM not supported?
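For what it's worth, once a vLLM version whose Qwen2 implementation supports LoRA is installed (main branch or a later release), the adapter registered with --lora-modules above should be addressable as its own model name through the OpenAI-compatible API. A minimal sketch, assuming the openai v1 Python client and the server's default port 8000:

```python
from openai import OpenAI

# The api_key is unused by the local server but required by the client.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="sql-lora",  # the adapter name registered via --lora-modules
    prompt="Write a SQL query that counts users by country.",
    max_tokens=128,
    temperature=0.0,
)

print(completion.choices[0].text)
```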