Before submitting a new issue...
[X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Your current environment

GPU: NVIDIA T4 × 4
vllm version: 0.4.3 (also reproduced with 0.5.5)
🐛 Describe the bug
On T4 GPUs with vllm==0.4.3, deploying multiple LoRA adapters through vLLM fails with the following error:

```
RuntimeError: CUDA error: no kernel image is available for execution on the device.
```

My deploy command:

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 swift deploy --tensor_parallel_size 4 --dtype fp16 \
    --model_type qwen1half-7b-chat \
    --model_id_or_path /cloud/user/data/data0806/llm/M2/Chat_New \
    --ckpt_dir /cloud/user/data/data0806/llm/M2/checkpoint-200/ \
    --infer_backend vllm --vllm_enable_lora true --max_model_len 512 --enforce_eager
```

I also upgraded vllm to 0.5.5, but the same error still occurs.
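For context on why this may fail: the "no kernel image is available" error usually means the CUDA kernels were not compiled for the GPU's compute capability. The T4 is sm_75, and vLLM's multi-LoRA (Punica) kernels in the 0.4.x/0.5.x series are, to my understanding, built only for compute capability >= 8.0. A minimal sketch of that check (the `lora_kernels_supported` helper is hypothetical, not a vLLM API):

```python
def lora_kernels_supported(capability: tuple) -> bool:
    # Hypothetical helper: vLLM's Punica multi-LoRA kernels in the 0.4.x/0.5.x
    # series are, to my understanding, compiled only for compute capability
    # >= 8.0 (Ampere and newer); the T4 is sm_75, so enabling multi-LoRA there
    # raises "no kernel image is available for execution on the device".
    return tuple(capability) >= (8, 0)

# The actual capability can be confirmed at runtime with
# torch.cuda.get_device_capability(), which returns (7, 5) on a T4.
print(lora_kernels_supported((7, 5)))  # T4 -> False
print(lora_kernels_supported((8, 0)))  # A100 -> True
```

If this is the cause, upgrading vllm alone would not help; the fix would require a GPU of compute capability 8.0 or newer, or a vLLM build whose LoRA kernels target sm_75.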