Open ZX1998-12 opened 2 weeks ago
Qwen2.5
qwen2.5-72B
抢占式实例部署qwen2.5-72B成功,调用失败
部署指令:vllm serve /home/Qwen2.5/Qwen2.5-72B-Instruct --port 6666 --host 0.0.0.0 --tensor-parallel-size 4 --served-model-name Qwen2.5-72B --enforce-eager
部署成功但是调用失败截图
应该是和MQLLMEngine交互数据超时了,但是不知道解决办法
This happens to Qwen2.5-xB-Instruct-xxx and xxx. The badcase can be reproduced with the following steps:
The following example input & output can be used:
system: ... user: ... ...
The results are expected to be ...
I have tried several ways to fix this, including:
I find that this problem also happens to ...
for vllm internal errors, I advised you to raise issues at https://github.com/vllm-project/vllm/issues
Model Series
Qwen2.5
What are the models used?
qwen2.5-72B
What is the scenario where the problem happened?
抢占式实例部署qwen2.5-72B成功,调用失败
Is this badcase known and can it be solved using avaiable techniques?
Information about environment
部署指令:vllm serve /home/Qwen2.5/Qwen2.5-72B-Instruct --port 6666 --host 0.0.0.0 --tensor-parallel-size 4 --served-model-name Qwen2.5-72B --enforce-eager
部署成功但是调用失败截图
应该是和MQLLMEngine交互数据超时了,但是不知道解决办法
Description
Steps to reproduce
This happens to Qwen2.5-xB-Instruct-xxx and xxx. The badcase can be reproduced with the following steps:
The following example input & output can be used:
Expected results
The results are expected to be ...
Attempts to fix
I have tried several ways to fix this, including:
Anything else helpful for investigation
I find that this problem also happens to ...