microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/
Creative Commons Attribution 4.0 International

[Issue]: Simpler specification of nested chat #1117

Closed sonichi closed 4 months ago

sonichi commented 9 months ago

Describe the issue

Nested chat is a common way to create complex chats, meaning that an agent will reply after another conversation. It can be used in hierarchical chat, joining multiple chats, etc. Examples:

- https://github.com/microsoft/autogen/blob/main/notebook/agentchat_chess.ipynb
- https://github.com/microsoft/optiguide
- https://github.com/microsoft/autogen/blob/main/notebook/agentchat_planning.ipynb
- https://github.com/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb

It may be valuable to offer syntactic convenience to simplify the specification of such nested chats. As validation, the examples above should end up with simple and intuitive implementations.
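The core pattern under discussion can be sketched in plain Python, independent of any AutoGen API (all names below are hypothetical stand-ins, not proposed syntax):

```python
def inner_chat(task):
    # Stand-in for a full sub-conversation; a real implementation
    # would run multiple agent turns here.
    return [f"worker: analyzing {task}", f"worker: done with {task}"]

def nested_reply(task):
    # The outer agent defers to an inner conversation, then replies
    # with that conversation's result instead of answering directly.
    transcript = inner_chat(task)
    return transcript[-1]  # use the inner chat's last message as the reply

print(nested_reply("chess move"))  # → "worker: done with chess move"
```

Any syntax convenience would essentially be sugar for registering `nested_reply`-style behavior on an agent.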

Steps to reproduce

No response

Screenshots and logs

No response

Additional Information

No response

sidhujag commented 9 months ago

With assistants, the API will time out if another conversation takes too long, no? How do we deal with that?

The way I solved it was to trigger an initiate chat from within the group chat, so it leaves the context of the Assistants API and runs another conversation; upon finishing, I get the response message and feed it into the parent group to continue the control loop.
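A minimal sketch of that handoff, with hypothetical helper names and no Assistants API involved:

```python
def run_sub_conversation(message):
    # Runs outside the Assistants API context, so its duration
    # cannot trip the parent API call's timeout.
    return f"result for: {message}"

def group_chat_step(history):
    # When a turn needs a long sub-conversation, hand off to it,
    # then feed the result back into the parent group's context
    # so the parent control loop can continue.
    request = history[-1]
    result = run_sub_conversation(request)
    history.append(result)
    return history

history = group_chat_step(["design a schema"])
```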

afourney commented 9 months ago

SocietyOfMindAgent was my attempt at this -- at least for GroupChat: https://github.com/microsoft/autogen/pull/890

I found it to be useful to be able to run the inner monologue, then do a final LLM call over the transcript to extract a final answer (see: https://github.com/microsoft/autogen/blob/society_of_mind_gaia/samples/tools/testbed/scenarios/GAIA/Templates/SocietyOfMind/scenario.py)

Without this extra call, it's very hard to know which message contains the information for a final response, and which are just part of the termination conversation that can occur.
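That two-phase pattern (run the inner monologue, then make one extraction pass over the whole transcript) can be sketched as follows; `llm_extract` stands in for a real LLM call, and the `ANSWER:` convention is just for illustration:

```python
def llm_extract(transcript):
    # Placeholder for a single LLM call that reads the full
    # transcript and returns only the final answer.
    for line in reversed(transcript):
        if line.startswith("ANSWER:"):
            return line[len("ANSWER:"):].strip()
    return "no answer found"

def society_of_mind_reply(run_inner_chat, task):
    transcript = run_inner_chat(task)  # inner monologue
    return llm_extract(transcript)     # final extraction call

fake_inner = lambda task: ["let me think", "ANSWER: 42", "TERMINATE"]
print(society_of_mind_reply(fake_inner, "q"))  # → "42"
```

The extraction step is what separates the useful content from termination chatter like `TERMINATE`.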

sidhujag commented 9 months ago

> SocietyOfMindAgent was my attempt at this -- at least for GroupChat: #890
>
> I found it to be useful to be able to run the inner monologue, then do a final LLM call over the transcript to extract a final answer (see: https://github.com/microsoft/autogen/blob/society_of_mind_gaia/samples/tools/testbed/scenarios/GAIA/Templates/SocietyOfMind/scenario.py)
>
> Without this extra call, it's very hard to know which message contains the information for a final response, and which are just part of the termination conversation that can occur.

That's good. I checked out your code before; that was my problem, I couldn't find the code where it gave a response. In my code I used a function to terminate the group and added a field to carry the response; in the chat manager I simply catch the trigger, take the final response, and add it to the context of the parent group. I also didn't clear the groups on every invocation (e.g., when it's an accumulative discussion between groups).

afourney commented 9 months ago

Using a function for termination like that can work in some scenarios. I don't think it would for the benchmark scenarios I've been chasing. In particular, in a fork of that code, I actually catch the context window overflow errors in the inner group chat and gracefully terminate the chat. Then the SocietyOfMind agent still has a chance to read the transcript and can still sometimes make a good guess or extract what it needs to form an answer. That second life adds significantly to the leaderboard performance.
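That graceful-degradation idea can be sketched like this; `ContextWindowExceeded` and the helpers are hypothetical stand-ins for the real error and chat loop:

```python
class ContextWindowExceeded(Exception):
    """Stand-in for the model's context-overflow error."""

def run_inner_chat_step(transcript, step):
    if len(transcript) >= 3:  # stand-in for hitting the context limit
        raise ContextWindowExceeded
    transcript.append(f"step {step}")

def run_with_second_life(extract):
    transcript = []
    try:
        for step in range(10):
            run_inner_chat_step(transcript, step)
    except ContextWindowExceeded:
        pass  # terminate gracefully instead of crashing the outer agent
    # "Second life": the outer agent still reads the partial transcript
    # and may extract a usable answer from it.
    return extract(transcript)

answer = run_with_second_life(lambda t: t[-1] if t else "no answer")
print(answer)  # → "step 2"
```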

sidhujag commented 9 months ago

> Using a function for termination like that can work in some scenarios. I don't think it would for the benchmark scenarios I've been chasing. In particular, in a fork of that code, I actually catch the context window overflow errors in the inner group chat and gracefully terminate the chat. Then the SocietyOfMind agent still has a chance to read the transcript and can still sometimes make a good guess or extract what it needs to form an answer. That second life adds significantly to the leaderboard performance.

I noticed another problem that intersects with nested chat. I am developing a code-assistance integration and built two: one with MetaGPT and one with Aider. MetaGPT handles design, tests, and coding, while Aider handles augmented coding for updates and bug fixing. These are long-running tasks driven by function calls (to send a message to the code assistant), but with assistants they will time out, so I had to engineer a solution within groups. My idea is an asyncio event scheduler: the agent responds that an async long-running task was created and a response will come shortly, while in the group chat we check for the trigger and stop the chat. I think this could also work with nested chats: just kick off an asyncio task (it can run synchronously, just in another thread) while the group chat checks for the event and waits on it; it should work recursively through nested chats. I think it can work?
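That scheduler idea could look roughly like this (hypothetical names; the long-running task is simulated with a short sleep):

```python
import asyncio

async def long_running_task(done: asyncio.Event, result: dict):
    await asyncio.sleep(0.01)       # stand-in for MetaGPT/Aider work
    result["answer"] = "patch ready"
    done.set()                      # signal the waiting group chat

async def group_chat_loop():
    done, result = asyncio.Event(), {}
    asyncio.create_task(long_running_task(done, result))
    # The agent replies immediately that an async task was created...
    reply = "long-running task started; response will follow"
    # ...and the group chat waits on the event before resuming,
    # avoiding an Assistants API timeout on the original call.
    await done.wait()
    return reply, result["answer"]

reply, answer = asyncio.run(group_chat_loop())
print(answer)  # → "patch ready"
```

Because the wait happens in the group chat's own loop rather than inside an API call, the same pattern should compose recursively through nested chats.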

nancycsecon commented 2 months ago

Hi, is it possible to attach a nested chat to an agent in a group chat?