camel-ai / camel

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org
https://docs.camel-ai.org/
Apache License 2.0

[Feature Request] Add message queue between model clients and vllm or ollama servers #747

Open lightaime opened 4 months ago

lightaime commented 4 months ago

Motivation

For load balancing between multiple model clients and vLLM or Ollama servers, we could put a message queue between the clients and the servers.

The message queue abstraction can also be used for workforce or task assignment. We should take this into consideration.
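
As a minimal illustration of the idea (the server URLs and all names below are hypothetical, and the worker just echoes instead of calling a real server), a shared queue lets clients enqueue requests while one worker per vLLM/Ollama server pulls from it:

```python
import queue
import threading

# Hypothetical setup: clients push prompts into one shared queue, and one
# worker per vLLM/Ollama server pulls from it, so load spreads across
# servers automatically.
SERVER_URLS = [
    "http://localhost:11434",  # assumed Ollama-style endpoints
    "http://localhost:11435",
]

requests: queue.Queue = queue.Queue()


def worker(server_url: str) -> None:
    while True:
        prompt, reply = requests.get()
        # A real worker would POST `prompt` to `server_url` here.
        reply.put(f"[{server_url}] echo: {prompt}")
        requests.task_done()


for url in SERVER_URLS:
    threading.Thread(target=worker, args=(url,), daemon=True).start()

# A model client only enqueues and waits; it never picks a server itself.
reply: queue.Queue = queue.Queue()
requests.put(("Hello, CAMEL!", reply))
print(reply.get())
```

Because every server worker pulls from the same queue, a fast or idle server naturally picks up more work, which is exactly the load-balancing behavior we want.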

Solution

No response

Alternatives

No response

Additional context

No response

Asher-hss commented 4 months ago

Perhaps RabbitMQ is also a good choice?
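
For illustration only, a minimal RabbitMQ work-queue pair using the `pika` client might look like the following (the queue name and payload format are assumptions, and the producer and consumer would normally run in separate processes):

```python
import pika  # RabbitMQ client: pip install pika

# Producer side: a model client publishes a request to a durable work queue.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="model_requests", durable=True)
channel.basic_publish(
    exchange="",
    routing_key="model_requests",
    body=b'{"prompt": "Hello, CAMEL!"}',
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# Consumer side (normally one process per vLLM/Ollama worker):
# prefetch_count=1 means each worker holds one message at a time, so
# RabbitMQ dispatches fairly across however many workers are running.
def on_request(ch, method, properties, body):
    print("handling", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue="model_requests", on_message_callback=on_request)
channel.start_consuming()
```

With durable queues and explicit acknowledgements, RabbitMQ would also give us at-least-once delivery out of the box.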

Wendong-Fan commented 3 months ago

From Guohao: we can use a message queue for message exchange and load balancing in CAMEL; the design may differ slightly depending on the use case.

Wendong-Fan commented 3 months ago

- Agent - Agent
- Task - Agent
- Agent - Model

The queue is independent and will not be exposed to the user.
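
To make the "not exposed to the user" point concrete, here is a hypothetical sketch in which each agent owns a private inbox queue and only exposes `send`/`step` methods (none of these names come from the CAMEL codebase):

```python
import queue
from typing import Optional


class Agent:
    def __init__(self, name: str):
        self.name = name
        self._inbox: queue.Queue = queue.Queue()  # internal, never exposed

    def send(self, message: str) -> None:
        """Other agents (or a task dispatcher) call this; they never
        touch the underlying queue."""
        self._inbox.put(message)

    def step(self) -> Optional[str]:
        """Process the next queued message, if any."""
        try:
            return self._inbox.get_nowait()
        except queue.Empty:
            return None


alice = Agent("alice")
alice.send("task: summarize the paper")  # Agent-Agent / Task-Agent exchange
print(alice.step())
```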

lightaime commented 3 months ago

A good design may be

AgentModelLoaderBalancer(model_server_urls: List[str]) -> loader_balancer_url: str
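
A hypothetical sketch of that interface: the balancer starts a small round-robin reverse proxy over the given model server URLs and hands back a single URL for clients to use (the class body, port, and forwarding logic are all assumptions; only the name and signature come from the proposal above):

```python
import itertools
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List
from urllib.request import Request, urlopen


class AgentModelLoaderBalancer:
    """Hypothetical round-robin reverse proxy over a set of model servers."""

    def __init__(self, model_server_urls: List[str], port: int = 8000):
        backends = itertools.cycle(model_server_urls)
        lock = threading.Lock()

        class _Proxy(BaseHTTPRequestHandler):
            def do_POST(self) -> None:
                with lock:
                    backend = next(backends)  # pick the next server in turn
                length = int(self.headers.get("Content-Length", 0))
                body = self.rfile.read(length)
                req = Request(backend + self.path, data=body,
                              headers={"Content-Type": "application/json"})
                with urlopen(req) as resp:  # forward and relay the response
                    payload = resp.read()
                self.send_response(200)
                self.end_headers()
                self.wfile.write(payload)

        self._server = ThreadingHTTPServer(("localhost", port), _Proxy)
        threading.Thread(target=self._server.serve_forever, daemon=True).start()
        self.loader_balancer_url = f"http://localhost:{port}"


balancer = AgentModelLoaderBalancer(
    ["http://localhost:11434", "http://localhost:11435"]
)
print(balancer.loader_balancer_url)  # clients point at this single URL
```

A client then targets `balancer.loader_balancer_url` and never needs to know how many backends exist.
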
Wendong-Fan commented 3 months ago

can make it as an independent package

sfc-gh-yihuang2 commented 3 months ago

Could you please provide more context about the feature and its requirements? I have the following questions:

  1. What is the size of the messages (average, maximum, or a rough estimate)?
  2. What kind of message ordering is required? For example, total ordering across agents, per-agent ordering, or is ordering not a concern?
  3. What level of message delivery guarantee is needed (e.g., at least once, at most once, exactly once)?

Additionally, could someone please explain the abstraction of 'Task' to me? I would really appreciate it.

Thanks!

Appointat commented 3 months ago

@sfc-gh-yihuang2 Hello, thank you for your interest in this issue. Since this feature is still under research and design, some of the information below may not be accurate:

  1. The number of messages will far exceed what the limited pool of LLM servers can "evenly" handle (it will be on a much larger scale).
  2. Message ordering is based on priority (message priority or agent priority) plus other optimization algorithms; see the sketch after this list.
  3. We would prefer "exactly once," but at minimum we should guarantee "at least once."
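
A minimal sketch of the priority ordering in point 2, using Python's standard `queue.PriorityQueue` (the priority values and messages are made up; a sequence counter keeps FIFO order among messages with equal priority):

```python
import itertools
import queue

# Lower number = higher priority; a monotonically increasing sequence
# number keeps FIFO order among messages with equal priority.
_seq = itertools.count()
inbox: queue.PriorityQueue = queue.PriorityQueue()


def enqueue(message: str, priority: int) -> None:
    inbox.put((priority, next(_seq), message))


enqueue("low-priority task", priority=5)
enqueue("urgent agent message", priority=1)
enqueue("another low-priority task", priority=5)

while not inbox.empty():
    priority, _, message = inbox.get()
    print(priority, message)  # the urgent message comes out first
```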

A Task is simply a unit of work that needs to be solved by the agents.