microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

Bug Report: MultimodalConversableAgent within GroupChat #628

Closed BeibinLi closed 9 months ago

BeibinLi commented 10 months ago

A simple example is shown below:

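import autogen
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent

# config_list_gpt4v and gpt4_llm_config are assumed to be defined beforehand (not shown in this report).
agent1 = MultimodalConversableAgent(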
    name="image-explainer-1",
    max_consecutive_auto_reply=10,
    llm_config={"config_list": config_list_gpt4v, "temperature": 0.5, "max_tokens": 300},
    system_message="Your image description is poetic and engaging.",
)
agent2 = MultimodalConversableAgent(
    name="image-explainer-2",
    max_consecutive_auto_reply=10,
    llm_config={"config_list": config_list_gpt4v, "temperature": 0.5, "max_tokens": 300},
    system_message="Your image description is factual and to the point.",
)

user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="Ask both image explainer 1 and 2 for their description.",
    human_input_mode="NEVER",  # Try either "ALWAYS" or "NEVER"
    max_consecutive_auto_reply=10,
)

# We set max_round to 5
groupchat = autogen.GroupChat(agents=[agent1, agent2, user_proxy], messages=[], max_round=5)
group_chat_manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_llm_config)

user_proxy.initiate_chat(group_chat_manager,
                        message="""What do you see?
                        <img https://th.bing.com/th/id/R.422068ce8af4e15b0634fe2540adea7a?rik=y4OcXBE%2fqutDOw&pid=ImgRaw&r=0>.""")

The issue is caused by the type of system_message, which differs between MultimodalConversableAgent and the other conversable agents.
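To illustrate the kind of mismatch described here, below is a minimal sketch that does not use autogen at all. It assumes (this is not confirmed in the report) that a multimodal agent holds its system message as a list of OpenAI-style content parts, while other agents hold a plain string. The helper names `system_message_to_text` and `describe_roles` are hypothetical and are not part of the autogen API.

```python
from typing import Any, Dict, List, Union

# Assumption: a system message is either a plain string or a list of
# content parts such as {"type": "text", "text": "..."}.
Content = Union[str, List[Dict[str, Any]]]


def system_message_to_text(content: Content) -> str:
    """Flatten a system message to plain text regardless of its type."""
    if isinstance(content, str):
        return content
    parts = []
    for item in content:
        if item.get("type") == "text":
            parts.append(item.get("text", ""))
        else:
            # Non-text parts (e.g. images) are replaced by a placeholder.
            parts.append("<image>")
    return "".join(parts)


def describe_roles(agents: Dict[str, Content]) -> str:
    """Hypothetical stand-in for code that builds a role list from system messages."""
    return "\n".join(
        f"{name}: {system_message_to_text(msg)}" for name, msg in agents.items()
    )


if __name__ == "__main__":
    agents = {
        # List-style message, as a multimodal agent is assumed to store it.
        "image-explainer-1": [
            {"type": "text", "text": "Your image description is poetic and engaging."}
        ],
        # String-style message, as a regular agent stores it.
        "User_proxy": "Ask both image explainer 1 and 2 for their description.",
    }
    print(describe_roles(agents))
```

Code that concatenates a system message directly onto a string would raise a TypeError for the list-style message; normalizing first, as sketched above, is one possible way to avoid that.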

BeibinLi commented 10 months ago

This issue is related to https://github.com/microsoft/autogen/issues/658. We will come back to it after fixing #658.

joshkyh commented 9 months ago

Shall we close this once the pytest (https://github.com/microsoft/autogen/pull/713#discussion_r1399832407) is implemented?