MrZilinXiao / openai-manager

Speed up your OpenAI requests by balancing prompts to multiple API keys.

Questions about using this project in a Flask app #3

Open F1mc opened 1 year ago

F1mc commented 1 year ago

I am thrilled to have discovered this project; it has been tremendously helpful for handling my high volume of API requests. However, it seems that the serving mode of openai-manager does not currently implement ChatCompletion? That is the endpoint I primarily intend to call.

In addition, I have built a simple Flask reverse-proxy app for access control and per-user usage limits on API calls. I would therefore like to modify my code to use openai-manager directly inside this Flask app, so that requests are load-balanced across multiple API keys.

Since I am not very familiar with asynchronous programming in Python, I have a couple of questions:

  1. In serving.py, does GLOBAL_MANAGER manage only the tasks in a single submission, or all requests across multiple submissions? In other words, can the current serving implementation correctly handle multiple concurrent requests from a single source?

  2. Is it feasible to call the submission function directly with asyncio.run() inside a multi-threaded Flask app?

Thank you in advance for your time and help.

MrZilinXiao commented 1 year ago

Thanks for your interest! And yes, ChatCompletion is currently only available through the Python package, not in serving mode.
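
For reference, package-mode usage looks roughly like the sketch below. This is illustrative only: it assumes openai-manager keeps a drop-in, openai-style ChatCompletion interface, so check the README for the exact call signature.

```python
# Illustrative sketch only: assumes openai-manager mirrors the openai SDK's
# ChatCompletion interface; the real signature may differ (see the README).
import openai_manager  # expects API keys in environment variables

responses = openai_manager.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
```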

For your questions:

  1. I would recommend a Redis-backed message queue for your use case, as this project only handles requests from ONE source; see the sketch after this list.
  2. I am not sure Flask's design allows calling external async functions directly. The simplest (though not the most elegant) workaround is to run openai-manager in a separate process; a pattern for bridging threaded Flask and an event loop is also sketched below.
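
Here is a minimal sketch of the Redis-backed queue idea, using redis-py. The queue name and job fields are illustrative; the worker body is where a single consumer process would call openai-manager (or any OpenAI client):

```python
# Minimal Redis-backed job queue: Flask request handlers enqueue prompts,
# and one worker process consumes them, so openai-manager only ever sees
# requests from ONE source.
import json

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def enqueue_prompt(user_id: str, prompt: str) -> None:
    """Called from each Flask request: push a job onto the shared queue."""
    r.rpush("prompt_queue", json.dumps({"user_id": user_id, "prompt": prompt}))

def worker_loop() -> None:
    """Single consumer: pop jobs and forward them to the API client."""
    while True:
        _, raw = r.blpop("prompt_queue")  # blocks until a job arrives
        job = json.loads(raw)
        # Submit job["prompt"] through openai-manager here, then store the
        # result (e.g. under a per-job Redis key) for the Flask app to read.
```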
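
On question 2, one pattern that works with threaded Flask is to run a single long-lived event loop in a background thread and hand coroutines to it with asyncio.run_coroutine_threadsafe(). The sketch below is a generic illustration; submit_prompt is a hypothetical stand-in for whatever async entry point wraps openai-manager:

```python
# Bridging threaded Flask and asyncio: one shared event loop runs in a
# background thread, and each request thread schedules a coroutine on it.
import asyncio
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)

# One long-lived loop for the whole app, driven by a daemon thread.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def submit_prompt(prompt: str) -> str:
    # Hypothetical stand-in: replace with the real async call into
    # openai-manager.
    await asyncio.sleep(0.1)
    return f"echo: {prompt}"

@app.route("/complete", methods=["POST"])
def complete():
    prompt = request.json["prompt"]
    # Schedule the coroutine on the shared loop and block this request
    # thread until the result is ready. Calling asyncio.run() per request
    # also works, but it creates and destroys a fresh event loop each time.
    future = asyncio.run_coroutine_threadsafe(submit_prompt(prompt), loop)
    return jsonify({"completion": future.result(timeout=60)})
```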