QwenLM / Qwen

The official repository of Qwen (通义千问), the chat and pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

💡 The API service started by openai_api.py handles requests serially #1103

Closed huyang19881115 closed 7 months ago

huyang19881115 commented 7 months ago

Start Date

No response

Implementation PR

No response

Reference Issues

No response

Summary

When calling the service started by openai_api.py, multiple requests are queued and executed one after another, i.e. serially. When the same Qwen model is loaded and served via fastchat, however, responses come back in parallel. How can the server started by openai_api.py be changed to handle user requests concurrently?
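A minimal sketch of the difference being described, using only the standard library; `fake_generate` is a hypothetical stand-in for a blocking model call, not the repo's actual code. A server that runs one request at a time queues everything behind the current generation, while overlapping requests (e.g. via a thread pool, the pattern concurrency-aware backends build on) lets waits run in parallel:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a blocking model inference call."""
    time.sleep(0.2)  # simulate generation latency
    return f"reply to {prompt}"

prompts = [f"q{i}" for i in range(4)]

# Serial: each request waits for the previous one to finish completely,
# which is the queuing behavior reported for the demo server.
t0 = time.perf_counter()
serial = [fake_generate(p) for p in prompts]
serial_time = time.perf_counter() - t0

# Concurrent: requests overlap, so total latency approaches that of one call.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    concurrent = list(pool.map(fake_generate, prompts))
concurrent_time = time.perf_counter() - t0

print(serial_time, concurrent_time)  # serial ≈ 0.8 s, concurrent ≈ 0.2 s
```

Note that threads alone are not enough for real LLM serving: actual throughput gains come from the inference engine batching requests (e.g. continuous batching in vLLM), which a plain single-stream demo script does not do.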

There is also a smaller issue: the server started by openai_api.py supports streaming responses but errors on non-streaming requests, while the fastchat server supports non-streaming requests but errors on streaming ones.

Basic Example

Ideally the service would support concurrent requests, similar to fastchat.

Drawbacks

The problem of handling concurrent requests.

Unresolved Questions

The error raised on non-streaming requests.

WangJianQ-cmd commented 7 months ago

I ran into this problem before as well: Qwen's official openai_api.py has to wait until the model finishes one output completely before it can start the next one. I later switched to vLLM.

jklj077 commented 7 months ago

We'd like to clarify that the API endpoint created by openai_api.py in this repository is intended primarily as a demonstration and should not be used for production purposes. It currently does not handle concurrent requests efficiently, which may impact its performance.


For deploying an API in a live setting, we recommend leveraging backend solutions like fastchat+vllm for Qwen(1.0), as these platforms are built with concurrency management features that ensure they can handle high traffic loads effectively.

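As a rough deployment sketch along the lines the maintainers suggest: vLLM ships an OpenAI-compatible API server that handles concurrent requests via continuous batching. The model name, flags, and port below are assumptions for illustration; consult the fastchat and vLLM documentation for the exact invocation supported by your installed versions.

```shell
# Hypothetical deployment sketch (verify flags against your vLLM version).
pip install vllm

# Serve Qwen-7B-Chat behind vLLM's OpenAI-compatible server.
# --trust-remote-code is needed because Qwen(1.0) ships custom model code.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen-7B-Chat \
    --trust-remote-code \
    --port 8000
```

Clients can then send concurrent `/v1/chat/completions` requests to port 8000, and the engine batches them instead of queuing them one by one.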

Should you need help with tool functionalities or specific function calls, feel free to contact qwen-agent. For any issues pertaining specifically to fastchat, kindly seek assistance through the relevant support channels provided by fastchat.


Additionally, please be aware that Qwen(1.0) models along with their associated codebase have been discontinued for active development and feature enhancements. This means they remain operational but won't receive further updates. We suggest considering current alternatives for your projects to guarantee continued support and compatibility.
