QwenLM / Qwen2

Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.

Inference responses are truncated after deploying Qwen1.5 with vLLM #752

Closed · zhangshuyx closed this 11 hours ago

zhangshuyx commented 2 months ago
[screenshot of the truncated response]

The request parameters were:

{
  "model": "Qwen1_5_72B_Chat",
  "messages": [{"role": "user", "content": "请给出一篇500字的中学作文,讲述海边游玩的经历"}],
  "max_tokens": 2000,
  "stop": []
}

I also tried various parameters, but the output gets truncated no matter how long or short it is, so the answer is always incomplete. The clearest case is this response: I asked the model to list every number from 1 to 20, and the output was still cut off:

[screenshot of the truncated list output]

The request parameters were:

{
  "model": "Qwen1_5_72B_Chat",
  "messages": [{"role": "user", "content": "请列出1到20的所有数字"}],
  "max_tokens": 1200,
  "stop": ["<|im_end|>", "<|endoftext|>", "<|im_start|>"],
  "stream": false,
  "temperature": 0.7
}
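
For anyone trying to reproduce this, here is a minimal sketch that sends the same request and inspects finish_reason, which distinguishes a max_tokens cutoff ("length") from a normal stop ("stop"). The endpoint host and port are assumptions about a local vLLM OpenAI-compatible deployment:

import requests

# Assumed address of the vLLM OpenAI-compatible server; adjust to your deployment.
URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "Qwen1_5_72B_Chat",
    "messages": [{"role": "user", "content": "请列出1到20的所有数字"}],
    "max_tokens": 1200,
    "stop": ["<|im_end|>", "<|endoftext|>", "<|im_start|>"],
    "stream": False,
    "temperature": 0.7,
}

resp = requests.post(URL, json=payload).json()
choice = resp["choices"][0]

# finish_reason == "length" means the reply hit max_tokens; "stop" means a stop
# token ended generation. Visibly incomplete text together with "stop" points
# at tokens being lost somewhere between generation and the returned message.
print("finish_reason:", choice["finish_reason"])
print("content:", choice["message"]["content"])
print("usage:", resp["usage"])

If finish_reason comes back as "stop" while the text is clearly incomplete, the truncation is not explained by max_tokens.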
jklj077 commented 2 months ago

The completion_tokens value in usage does not match the number of tokens in choices[0].message.content; it appears some tokens are lost. Could you try reporting the issue to vLLM?
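
A quick way to check that is to re-tokenize the returned content and compare the count against usage.completion_tokens. A sketch, assuming the served model matches the Qwen/Qwen1.5-72B-Chat tokenizer on the Hugging Face Hub and reusing resp from the request sketch above:

from transformers import AutoTokenizer

# Assumed checkpoint id; use the tokenizer that matches your served model.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-72B-Chat")

content = resp["choices"][0]["message"]["content"]  # resp from the sketch above
n_in_content = len(tok.encode(content))
n_reported = resp["usage"]["completion_tokens"]

# completion_tokens counts everything the engine generated (including the
# final stop token), so it may exceed the re-tokenized count by a token or
# two; a large gap suggests generated tokens never reached the returned message.
print("tokens in content:", n_in_content)
print("completion_tokens reported:", n_reported)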

yoke233 commented 1 month ago

The latest vllm==0.5.3.post1 hits the same problem; the only workaround was to roll back to the previously used version, vllm==0.4.0.post1.

github-actions[bot] commented 1 week ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.