ag2ai / ag2

AG2 (formerly AutoGen): The Open-Source AgentOS. Join the community at: https://discord.gg/pAbnFJrkgZ
https://ag2ai.github.io/ag2/
Apache License 2.0
620 stars 59 forks source link

[Feature Request]: More controllable swarm agent #77

Open linmou opened 1 day ago

linmou commented 1 day ago

Feature-related issues

Issues: 
1. Agent system message is not updated with context variables.
2. The pipeline is not so controllable. Agents may not call their functions properly. Group-level customized Afterwork function can not solve all the problems. Especially when there are more than 2 functions, 
    for example, the response module's pipeline is like: 1. get context variables; 2. response to user's latest message; 3. hand off to emotion_reaction_module. 
    It is really hard to customize the pipeline.
3. Some groupchat information, e.g. chat history & other agents, are not available for the registered tool calls. While agents needs these information to make decisions
4.  Group-level customized Afterwork ( _after_work_ in **initiate_swarm_chat**)  can not get access to groupchat manager, which means it can not change the group chat message properly. See the func group_level_after_work I write below.

Solutions:

1. At SwarmAgent level, add an interface for developers to update the agent's attribute once it is selected as the next speaker.
2. Is that possible to force the tool call while still replying on the agent to give the argument? Can we have a template for the tool calling message where the developer can customize the suggested tool call?
3. May build a reference of some groupchat attributes to the tool_execution agent.
4. Is that possible to add groupchat manager as an argument of the group level customized Afterwork function?

Additional context

Example: Emotional Swarm In this example, I build an emotional chatbot based on Swarm, which changes its emotion every time it receives a new message.

Particularly, the cognitive architecture of this chatbot is composed of three modules

  1. emotion_reaction_module: simulate human-like emotion change based on the user's input message, update the context variable 'assistant_emotion'
  2. emotion_estimation_module: estimate user's emotion then update the context varible 'user_emotion'
  3. Response module: based on assistant emotion, Swarm would switch to one of the common_response_module, desperate_response_module and happy_response_module based on assistant_emotion

There is a customized Afterwork function as a fallback handoff.

from autogen import ON_CONDITION, AfterWorkOption, SwarmAgent, initiate_swarm_chat, SwarmResult, AFTER_WORK, UserProxyAgent

import autogen
from autogen.agentchat.groupchat import GroupChat

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4o"],
    },
)

llm_config = {
    "cache_seed": 42,  # change the cache_seed for different trials
    "temperature": 1,
    "config_list": config_list,
    "timeout": 120,
    "tools": [],
}

context_variables = {'assistant_emotion': 0, 'user_emotion': 0}

def assistant_emotion_reaction(context_variables:dict, emotion_reaction_direction:str):
    """
    simulate the human emotion reaction to the user's latest message. emotion_reaction_direction: positive, negative, neutral
    """
    if emotion_reaction_direction == 'positive':    
        context_variables['assistant_emotion'] += 1 
    elif emotion_reaction_direction == 'negative':
        context_variables['assistant_emotion'] -= 1 
    elif emotion_reaction_direction == 'neutral':
        pass

    # switch to different response module based on assistant emotion
    if context_variables['assistant_emotion'] > 1:
        next_agent = happy_response_module
    elif context_variables['assistant_emotion'] < -1:
        next_agent = desperate_response_module
    else:
        next_agent = emotion_estimation_module

    return SwarmResult(values=f'assistant emotion reaction {emotion_reaction_direction}, the current assistant emotion level is {context_variables["assistant_emotion"]}', agent=next_agent, context_variables=context_variables)

emotion_reaction_module = SwarmAgent(
    name="Emotion_Reaction",
    system_message="You are an emotion module of an AI chatbot that simulates the human's emotion reaction to the latest message. Reaction direction: positive, negative, neutral",
    llm_config=llm_config,
    functions=[assistant_emotion_reaction],
)

emotion_reaction_module.register_hand_off( ON_CONDITION(emotion_reaction_module, 'Have not updated the emotion'))

def estimate_user_emotion_change(context_variables:dict, emotion_change_direction:str):
    """
    Estimate the user's emotion when seeing some positive input. emotion_change_direction: positive, negative, neutral
    """
    if emotion_change_direction == 'positive':
        context_variables['user_emotion'] += 1 
    elif emotion_change_direction == 'negative':
        context_variables['user_emotion'] -= 1 
    elif emotion_change_direction == 'neutral':
        pass
    return SwarmResult(values=f'User emotion change {emotion_change_direction}, the current user emotion level is {context_variables["user_emotion"]}', agent=common_response_module, context_variables=context_variables)

emotion_estimation_module = SwarmAgent(
    name="Emotion_Estimation",
    system_message="You are an emotion estimation module of an AI chatbot that estimates the user's emotion change direction based on the latest message. Emotion change direction: positive, negative, neutral",
    llm_config=llm_config,
    functions=[estimate_user_emotion_change],
)
emotion_estimation_module.register_hand_off( ON_CONDITION(emotion_estimation_module, 'Have not updated the emotion'))

def get_assistant_emotion_level(context_variables:dict):
    return context_variables['assistant_emotion']

def get_specific_context_variable(context_variables:dict, keys:list):
    """
    Get some specific context variable from the context variables dictionary.
    """
    returned_dict = {}
    for key in keys:
        returned_dict[key] = context_variables[key]
    return returned_dict

common_response_module = SwarmAgent(
    name="Common_Response_Module",
    system_message="You are an AI chatbot that generate a response to user's latest message based on your emotion level and user emotion level. J",
    llm_config=llm_config,
    functions=[get_assistant_emotion_level, get_specific_context_variable],
)

desperate_response_module = SwarmAgent(
    name="Desperate_Response_Module",
    system_message="You are desperate human that generate a response to user's latest message based in a very very very dramaticallydesperate tone.",
    llm_config=llm_config,
    functions=[get_assistant_emotion_level, get_specific_context_variable],
)

happy_response_module = SwarmAgent(
    name="Happy_Response_Module",
    system_message="You are a happy human that generate a response to user's latest message based in a very very very dramatically happy tone.",
    llm_config=llm_config,
    functions=[get_assistant_emotion_level, get_specific_context_variable],
)

common_response_module.register_hand_off(ON_CONDITION(emotion_reaction_module, "Every time receive a new user message, analyse the emotion of the newest user message"))
happy_response_module.register_hand_off(ON_CONDITION(emotion_reaction_module, "Every time receive a new user message, analyse the emotion of the newest user message"))
desperate_response_module.register_hand_off(ON_CONDITION(emotion_reaction_module, "Every time receive a new user message,analyse the emotion of the newest user message"))

user = UserProxyAgent(
    name="User",
    system_message="Human user",
    code_execution_config=False,
)

def group_level_after_work(last_speaker, messages, groupchat:GroupChat, context_variables):
    """
    The group level after work can further customize the agent transition behavior, when function calls do not return next agent.

    Can sometimes force the emotion module to do the tool call.
    """
    user_agent = groupchat.agent_by_name('User')
    previous_user_message = [message for message in messages if message['role'] == 'user'][-1]['content']
    if last_speaker.name == 'Emotion_Reaction':
        message = f'use a tool call to do emotion reaction to {previous_user_message}'
        user_agent.send(message, emotion_reaction_module, request_reply=False, silent=True) # FIXME: sender should be groupchat manager
        return emotion_reaction_module

    elif last_speaker.name == 'Emotion_Estimation':
        message = f'use a tool call to update the user emotion estimation based on the user message: {previous_user_message}'
        user_agent.send(message, emotion_estimation_module, request_reply=False, silent=True) # FIXME: sender should be groupchat manager
        return emotion_estimation_module

    # elif last_speaker.name in ['Common_Response_Module', 'Happy_Response_Module', 'Desperate_Response_Module']:
    #     return emotion_reaction_module

    return user_agent

if __name__ == '__main__':
    general_test_messags = "I don't like this AI chatbots. It doesn't have human emotions."
    tool_call_control_test_messages = "Say hi to me without any tool calls at the first round."

    chat_history, context_variables, last_agent = initiate_swarm_chat(
        initial_agent=emotion_reaction_module,
        agents=[emotion_reaction_module, emotion_estimation_module, common_response_module, happy_response_module, desperate_response_module],
        user_agent=user,
        messages=tool_call_control_test_messages,
        after_work=group_level_after_work,
        context_variables=context_variables,
        max_rounds=10,
    )

    print(chat_history)
linmou commented 1 day ago

Is it a proper way to save the context variables in swarm agents rather than tool_excution agent? For example, if each agent can save and retrieve the context variables in "context" field of messages, so that the context variables can be modified locally and spread globally, and then the first issue can be solved.