microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

[Bug]: Groq responses are dicts when strs are expected #3368

Closed gsteinLTU closed 2 months ago

gsteinLTU commented 2 months ago

Describe the bug

Using Groq in a ConversableAgent does not quite work: responses from Groq models come back as dicts, while many parts of the framework expect plain strs. generate_reply itself still works (if you write code to handle the dict case), but initiate_chats runs into issues, and even a single initiate_chat needs its termination condition to handle the different structure. (After reinstalling, 0.2.34 does detect when the carryover is a dict.)
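For reference, a termination callback that tolerates both shapes could look like the sketch below. The unwrapping step reflects the dict shape shown in this report; it is a workaround sketch, not official AutoGen behavior, and the "TERMINATE" marker is just an illustrative convention.

```python
# Minimal sketch of an is_termination_msg callback that tolerates both
# reply shapes seen in this report. The unwrapping step is a workaround,
# not official AutoGen behavior.
def is_termination_msg(message: dict) -> bool:
    content = message.get("content")
    # With Groq (per this bug) the content can itself be a dict; unwrap it.
    if isinstance(content, dict):
        content = content.get("content")
    return isinstance(content, str) and content.rstrip().endswith("TERMINATE")

assert is_termination_msg({"content": "All done. TERMINATE"})
assert is_termination_msg({"content": {"content": "TERMINATE", "role": "assistant"}})
assert not is_termination_msg({"content": None})
```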

The expected behavior, presumably, is that switching which LLM you use should not require significant other changes.

According to msze on Discord:

When the OpenAI response is returned, if there are no tool calls in the response it will just return the message content string (choice.message.content)

For Groq, it returns the whole chat message object and then dumps it to a dictionary. So Groq does not have a different return whether or not tools are called.
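Until the clients are aligned, calling code can normalize the two shapes itself. This is a hedged workaround sketch: `as_text` is a hypothetical helper, not part of the AutoGen API.

```python
# Hedged workaround: coerce a reply that may be a plain str (OpenAI path)
# or a dict (Groq path) into text. `as_text` is a hypothetical helper,
# not part of the AutoGen API.
def as_text(reply):
    if isinstance(reply, dict):
        return reply.get("content") or ""
    return reply or ""

assert as_text("Hello!") == "Hello!"
assert as_text({"content": "Hi", "role": "assistant"}) == "Hi"
assert as_text(None) == ""
```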

Steps to reproduce

Example code to demonstrate issue:

from autogen import ConversableAgent
import os

groq_api_key = os.environ.get("GROQ_API_KEY") or "YOUR_GROQ_API_KEY"

os.environ["GROQ_API_KEY"] = groq_api_key

llm_config_groq = {
    "api_key": groq_api_key,
    "api_type": "groq",
    "model": "llama3-70b-8192",
}

openai_api_key = os.environ.get("OPENAI_API_KEY") or "YOUR_OPENAI_API_KEY"

os.environ["OPENAI_API_KEY"] = openai_api_key

llm_config_openai = {
    "api_key": openai_api_key,
    "api_type": "openai",
    "model": "gpt-3.5-turbo",
}

agent_groq = ConversableAgent(
    name="chatbot_groq", 
    llm_config=llm_config_groq, 
    human_input_mode="NEVER"
)

agent_openai = ConversableAgent(
    name="chatbot_openai", 
    llm_config=llm_config_openai, 
    human_input_mode="NEVER"
)

# Get the agents' responses
response_groq = agent_groq.generate_reply(messages=[{"role": "user", "content": "Hello!"}])
response_openai = agent_openai.generate_reply(messages=[{"role": "user", "content": "Hello!"}])

print("Groq:")
print(response_groq)
print(type(response_groq))
print("--------------------")
print("OpenAI:")
print(response_openai)
print(type(response_openai))

This will result in:

Groq:
{'content': "Hello! It's nice to meet you. I'm here to help with any questions, tasks, or topics you'd like to discuss. How can I assist you today? Do you have something specific on your mind, or would you like some suggestions on where to start?", 'role': 'assistant', 'function_call': None, 'tool_calls': None}
<class 'dict'>
--------------------
OpenAI:
Hello! How can I assist you today?
<class 'str'>

Model Used

Any Groq model

Expected Behavior

Groq models should return outputs compatible with the rest of AutoGen, either by making the Groq client return plain strings, or by modifying the parts that expect strs to handle a dict with a content field.
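The first option, mirroring the OpenAI client described above, could be sketched roughly as follows. The field names follow the dict shown in the reproduction output; this is an illustrative sketch, not the actual Groq client code.

```python
# Hedged sketch of the suggested fix: mirror the OpenAI client and return
# only the content string when no tools were called. Field names follow
# the dict in the reproduction output; not actual Groq client code.
def extract_reply(message: dict):
    if message.get("tool_calls") or message.get("function_call"):
        return message  # keep the structured form when tools are involved
    return message.get("content")

msg = {"content": "Hello!", "role": "assistant",
       "function_call": None, "tool_calls": None}
assert extract_reply(msg) == "Hello!"
```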

Screenshots and logs

No response

Additional Information

AutoGen Version: 0.2.34
Python Version: 3.12

marklysze commented 2 months ago

Thanks @gsteinLTU. For clarity on what is needed: the response should just be a string if there are no tool calls, is that right? I think we should also make sure the response shape is the same when tools are returned.

gsteinLTU commented 2 months ago

I think that would be right. It turns out initiate_chats handles it correctly in the newest version, so this might not be strictly necessary (it's just a minor annoyance).

marklysze commented 2 months ago

Thanks for noting that initiate_chat(s) perform as expected now. Do you use generate_reply? If so, we can continue looking into changing it there for consistency.

gsteinLTU commented 2 months ago

Unfortunately, we decided not to go with autogen for our project, but personally, I'm not 100% sure what makes sense.

The best solution might simply be to make the documentation more descriptive about the return types (it briefly explains when None is returned, but not when a str or dict is). IMO it should be consistent between models in either direction, most likely having both output a str when the dict contains only content, so it matches OpenAI's models.

marklysze commented 2 months ago

Thanks for the update @gsteinLTU, agreed that for consistency it should align with the string return in that case. I'll also make a note to update the return type in future changes to the client classes. Appreciated, and I'll close this issue for now (feel free to reopen if needed).