Checklist
[X] 1. I have searched related issues but cannot get the expected help.
[X] 2. The bug has not been fixed in the latest version.
Describe the bug
Hi! When using the CLI command:
lmdeploy serve api_server OpenGVLab/InternVL-Chat-V1-5-AWQ --backend turbomind --model-format awq
the model name returned by the API (v1/models) is internvl-internlm2. Shouldn't it be OpenGVLab/InternVL-Chat-V1-5-AWQ?
When I run the same service via Docker with the following setup, the returned ID gets even stranger.
Dockerfile:
FROM openmmlab/lmdeploy:latest
RUN apt-get update && apt-get install -y python3 python3-pip git
WORKDIR /app
RUN pip3 install --upgrade pip
RUN pip3 install timm
RUN pip3 install flash-attn --no-build-isolation
CMD ["lmdeploy", "serve", "api_server", "OpenGVLab/InternVL-Chat-V1-5-AWQ", "--backend", "turbomind", "--model-format", "awq"]
docker build --tag 'lmdeploy' .
docker run --privileged --runtime nvidia --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=" -p 23333:23333 --ipc=host lmdeploy lmdeploy serve api_server OpenGVLab/InternVL-Chat-V1-5-AWQ
Then when I query the API, the returned ID is:
/root/.cache/huggingface/hub/models--OpenGVLab--InternVL-Chat-V1-5-AWQ/snapshots/5ce4e49fe4e5d960b62a619b268113e40943c57f
Since this ID is what my web UI displays as the model name, I'd rather get the normal short name while running under Docker.
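As a client-side workaround (a sketch of my own, not lmdeploy behavior), the short "org/name" form can be recovered from the Hugging Face hub cache path, which encodes the "/" in the repo id as "--" inside a "models--..." directory:

```python
# Hypothetical workaround: map a hub cache snapshot path like
# .../hub/models--Org--Name/snapshots/<rev> back to "Org/Name".
# Not part of lmdeploy; just a client-side normalization sketch.
import re

def short_model_name(model_id: str) -> str:
    """Return "Org/Name" if model_id looks like a hub cache path,
    otherwise return the input unchanged."""
    m = re.search(r"models--([^/]+)--([^/]+)", model_id)
    if m:
        return f"{m.group(1)}/{m.group(2)}"
    return model_id

print(short_model_name(
    "/root/.cache/huggingface/hub/models--OpenGVLab--InternVL-Chat-V1-5-AWQ"
    "/snapshots/5ce4e49fe4e5d960b62a619b268113e40943c57f"
))  # → OpenGVLab/InternVL-Chat-V1-5-AWQ
```

This only papers over the symptom in the UI; the server still reports the long path.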
Reproduction
1. Start lmdeploy using the Docker recipe above.
2. Open a browser to check the FastAPI web page (http://{serverIP}:23333).
3. Test the GET v1/models endpoint.
4. Notice that object.data[0].id is not the normal short name.
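To make steps 3 and 4 concrete, here is a minimal sketch of pulling object.data[0].id out of the response body; the JSON shape assumed here is the OpenAI-style models list ({"object": "list", "data": [{"id": ...}]}), and the sample body below just mirrors the ID observed in this report:

```python
# Sketch: extract object.data[0].id from a GET v1/models response body
# (assumed OpenAI-style models-list JSON).
import json

def first_model_id(models_json: str) -> str:
    """Return object.data[0].id from a v1/models JSON body."""
    return json.loads(models_json)["data"][0]["id"]

# Sample body shaped like the response observed when running under Docker:
sample = json.dumps({
    "object": "list",
    "data": [{
        "id": "/root/.cache/huggingface/hub/"
              "models--OpenGVLab--InternVL-Chat-V1-5-AWQ/"
              "snapshots/5ce4e49fe4e5d960b62a619b268113e40943c57f",
        "object": "model",
    }],
})
print(first_model_id(sample))  # prints the long snapshot path, not the short name
```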
Environment
Error traceback
No response