QwenLM / Qwen2.5

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

[Bug]: Model name error in vllm deployment #1052

Closed JulioZhao97 closed 1 week ago

JulioZhao97 commented 1 week ago

Model Series

Qwen2.5

What are the models used?

Qwen2.5-7B-Instruct

What is the scenario where the problem happened?

vllm deployment with Qwen2.5-7B-Instruct

Is this a known issue?

Information about environment

vllm==0.6.2

Log output

Traceback (most recent call last):
  File "/mnt/petrelfs/zhaozhiyuan/formula/Qwen2.5/vllm.py", line 11, in <module>
    chat_response = client.chat.completions.create(
  File "/mnt/petrelfs/zhaozhiyuan/anaconda3/envs/qwen2.5/lib/python3.10/site-packages/openai/_utils/_utils.py", line 274, in wrapper
    return func(*args, **kwargs)
  File "/mnt/petrelfs/zhaozhiyuan/anaconda3/envs/qwen2.5/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 815, in create
    return self._post(
  File "/mnt/petrelfs/zhaozhiyuan/anaconda3/envs/qwen2.5/lib/python3.10/site-packages/openai/_base_client.py", line 1277, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/mnt/petrelfs/zhaozhiyuan/anaconda3/envs/qwen2.5/lib/python3.10/site-packages/openai/_base_client.py", line 954, in request
    return self._request(
  File "/mnt/petrelfs/zhaozhiyuan/anaconda3/envs/qwen2.5/lib/python3.10/site-packages/openai/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: Error code: 404 - {'object': 'error', 'message': 'The model `Qwen2.5-7B-Instruct` does not exist.', 'type': 'NotFoundError', 'param': None, 'code': 404}

Description

Steps to reproduce

This happens with Qwen2.5-7B-Instruct. The problem can be reproduced with the following steps:

  1. Deploy the model with vLLM: vllm serve Qwen/Qwen2.5-7B-Instruct
  2. Run a chat completion using the example code in https://github.com/QwenLM/Qwen2.5/blob/0f0ecfba609c08a3cdbb3a589bf74e36255ebd75/README.md?plain=1#L199-L224 (see the reproduction sketch below)
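
For concreteness, a minimal reproduction sketch following the README example linked above; the base URL and api_key are vLLM's documented defaults and may need adjusting for your deployment:

```python
# Minimal reproduction sketch, following the README example linked above.
# Assumes the server was started with: vllm serve Qwen/Qwen2.5-7B-Instruct
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's default address (adjust if needed)
    api_key="EMPTY",                      # vLLM does not validate the key by default
)

# This raises openai.NotFoundError (404): vLLM registered the model under the
# full --model value "Qwen/Qwen2.5-7B-Instruct", not the short name.
chat_response = client.chat.completions.create(
    model="Qwen2.5-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me something about large language models."},
    ],
)
```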

Expected results

Returned a "model does not exist" error; see the log output above.

Attempts to fix

Changing model="Qwen2.5-7B-Instruct" to model="Qwen/Qwen2.5-7B-Instruct" fixes the bug.

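The corrected call, with the client set up as in the reproduction sketch above:

```python
# Same client as above; the model name now matches what vLLM serves.
chat_response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # full name, as passed to `vllm serve`
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me something about large language models."},
    ],
)
print(chat_response.choices[0].message.content)
```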

jklj077 commented 1 week ago

Hi, this actually worked as expected. By default, vLLM uses the --model value (the name or path of the Hugging Face model to use) as the model name; if you would like to change the model name, set --served-model-name. See https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#command-line-arguments-for-the-server for more info.
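
For reference, a quick way to see which model names the server actually exposes is to query the OpenAI-compatible /v1/models endpoint; a minimal sketch using the same client setup (base URL assumed to be vLLM's default):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# /v1/models lists the names the server will accept in `model=`.
for model in client.models.list():
    print(model.id)  # e.g. "Qwen/Qwen2.5-7B-Instruct"
```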

JulioZhao97 commented 1 week ago

> Hi, this actually worked as expected. By default, vLLM uses the --model value (the name or path of the Hugging Face model to use) as the model name; if you would like to change the model name, set --served-model-name. See https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#command-line-arguments-for-the-server for more info.

Thanks for the reply. So why did I encounter this bug? What version of vllm are you currently using?

jklj077 commented 1 week ago

  • If you use vllm serve Qwen/Qwen2.5-7B-Instruct, the model name is Qwen/Qwen2.5-7B-Instruct.
  • If you use vllm serve Qwen/Qwen2.5-7B-Instruct --served-model-name Qwen2.5-7B-Instruct, the model name is Qwen2.5-7B-Instruct.
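
A sketch combining the two, assuming the default server address: serve the model under an alias, then call it by that alias.

```python
# Start the server with an alias (run in a shell):
#   vllm serve Qwen/Qwen2.5-7B-Instruct --served-model-name Qwen2.5-7B-Instruct
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# The alias is now the registered model name, so the short name works:
chat_response = client.chat.completions.create(
    model="Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat_response.choices[0].message.content)
```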

JulioZhao97 commented 1 week ago

Got it! Thanks for the reply.