InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Deployed Qwen-VL and its LoRA via lmdeploy, but the LoRA was not actually loaded #2297

Closed NONGFUYULANG closed 2 months ago

NONGFUYULANG commented 2 months ago

Describe the bug

I deployed Qwen-VL together with its LoRA via lmdeploy, but on inspection the LoRA was not loaded successfully. Both the base model and the LoRA weights are stored locally, yet the model cannot be used.

Reproduction

lmdeploy serve api_server F:\kelperliu\docker_data\models--Qwen--Qwen-VL-Chat\snapshots\f57cfbd358cb56b710d963669ad1bcfb44cdcdd8 --adapters mylora=F:\kelperliu\docker_data\ft_local\Qwen-VL-master\Qwen-VL-master\first_type_building another=F:\kelperliu\docker_data\ft_local\Qwen-VL-master\Qwen-VL-master\forth_type_building

Environment

torch
lmdeploy-0.5.3

Error traceback

No response

irexyc commented 2 months ago

lmdeploy has two inference backends: pytorch and turbomind.

Qwen-VL is supported by the turbomind backend, and that backend cannot load the base model and LoRA weights at the same time. You have to merge them yourself first.
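For context, "merging" here means folding the adapter's low-rank update into the base weight matrix, W_merged = W + (alpha/r)·B·A, so the result is a single plain checkpoint with no adapter attached. In practice you would typically do this with peft's `merge_and_unload()` on the real checkpoint; the toy numpy sketch below only illustrates the arithmetic, with made-up shapes and values:

```python
# Minimal sketch of folding a LoRA update into a base weight matrix.
# All dimensions and values are illustrative, not from Qwen-VL.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4                     # hidden size, LoRA rank, LoRA alpha

W = rng.standard_normal((d, d))           # frozen base weight
A = rng.standard_normal((r, d)) * 0.01    # LoRA down-projection
B = rng.standard_normal((d, r)) * 0.01    # LoRA up-projection

# Fold the low-rank delta into the base weight once, offline.
W_merged = W + (alpha / r) * (B @ A)

# Sanity check: the merged weight reproduces base-plus-adapter inference.
x = rng.standard_normal(d)
y_split = W @ x + (alpha / r) * (B @ (A @ x))
y_merged = W_merged @ x
assert np.allclose(y_split, y_merged)
```

After merging, the saved checkpoint behaves like an ordinary fine-tuned model, which is why a backend with no runtime LoRA support can still serve it.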

NONGFUYULANG commented 2 months ago

So at the moment LoRA models are only supported on the pytorch backend?

irexyc commented 2 months ago

So at the moment LoRA models are only supported on the pytorch backend?

Correct. For the turbomind backend you need to merge the weights yourself first.
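Once the LoRA weights have been merged into the base model and saved as a standalone checkpoint, the server can be launched on that directory alone, without any --adapters flag. A sketch of the resulting command, with a hypothetical local path:

```shell
# Serve the merged checkpoint directly; no adapter flags are needed,
# since the LoRA deltas are already baked into the weights.
lmdeploy serve api_server /path/to/qwen-vl-merged
```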