Hi, can you clarify the difference between deploying "vllm from llama_factory" and "vllm from Qwen's official documentation"?
Based on the shared screenshot, it appears that you are using a custom frontend. Since vllm is not fully compatible with Qwen(1.0) models (it is unaware of the chat template and the stop token ids), the frontend has to at least pass stop_token_ids to the API created by vllm. Alternatively, you could use fastchat+vllm as introduced in the README. If you are using Qwen1.5, plain vllm should work fine.
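As a minimal sketch of what "pass stop_token_ids to the API created by vllm" could look like: the client builds the ChatML prompt itself (plain vllm does not apply Qwen's chat template for Qwen 1.0) and includes the stop token ids in the request to the vLLM OpenAI-compatible completions endpoint. The endpoint URL, served model name, sampling settings, and the exact token ids below are assumptions; adjust them to your deployment and tokenizer.

```python
# Sketch of a client for a Qwen(1.0) model served by plain vLLM.
# Assumes the vLLM OpenAI-compatible server is running at localhost:8000
# and was started with a Qwen(1.0) chat model.
import requests

# ChatML prompt built by hand, since plain vllm does not know Qwen's chat template.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n你好<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "model": "Qwen-7B-Chat",             # assumed served model name
    "prompt": prompt,
    "max_tokens": 512,
    "temperature": 0.7,
    # Without stop token ids the model keeps generating past the end of its turn,
    # which shows up as the garbled / rambling answers described in this issue.
    # Assumed ids for Qwen(1.0): <|endoftext|> and <|im_end|>; verify with your tokenizer.
    "stop_token_ids": [151643, 151645],
}

resp = requests.post("http://localhost:8000/v1/completions", json=payload, timeout=60)
print(resp.json()["choices"][0]["text"])
```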
Is there an existing issue / discussion for this?
Is there an existing answer for this in the FAQ?
Current Behavior
After fine-tuning successfully with llama_factory, I deployed the model in two ways: with llama_factory's vllm and with the vllm approach recommended in the Qwen official documentation. The responses differ. The llama_factory vllm deployment always returns normal responses and has never had a problem, but the deployment following the official Qwen vllm instructions always has issues: the reply quality is very poor, almost nonsense answers, as shown in the screenshot below.
What could be the cause?
Expected Behavior
Both deployments should return normal responses.
Steps To Reproduce
No response
Environment
Anything else?
No response