@thinkall
Ok, there are a couple of recommendations I'd like to make:
Give all your agents descriptions -- it will make GroupChat less likely to fail. See here for more details: https://microsoft.github.io/autogen/blog/2023/12/29/AgentDescriptions
You should try the following in your groupchat config: allow_repeat_speaker=False.
If your speakers are in a fixed order, also try: speaker_selection_method="round_robin".
Also, if you enable code execution, be sure to set:
code_execution_config={
    "work_dir": work_dir,
    "last_n_messages": "auto",
},
But I see that you have that commented out above.
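Putting those together, a rough sketch of what the setup could look like (agent names, descriptions, and the llm_config here are placeholders, not taken from your code):

from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent

# Placeholder model config; fill in your own config_list.
llm_config = {"config_list": [{"model": "gpt-4", "api_key": "sk-..."}]}

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    description="Relays the task and executes code suggested by the other agents.",
    code_execution_config={"work_dir": "sandbox", "last_n_messages": "auto"},
)
coder = AssistantAgent(
    name="coder",
    description="Writes Python code to solve the task.",
    llm_config=llm_config,
)
reviewer = AssistantAgent(
    name="reviewer",
    description="Reviews the coder's output and suggests fixes.",
    llm_config=llm_config,
)
groupchat = GroupChat(
    agents=[user_proxy, coder, reviewer],
    messages=[],
    max_round=20,
    allow_repeat_speaker=False,
    speaker_selection_method="round_robin",  # if the speaking order is fixed
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)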
Thank you for your prompt responses. I'll implement the changes you've recommended, review my overall logic, and get back to you ASAP. Peace.
I've made some progress adopting your suggested changes and revisiting the Advanced Usage of RAG Agents tutorial, which has been slightly updated since I first read it. The changes concern wrapping the RAG agent in a function call versus calling it explicitly or adding it to the group chat, which is something I'm still not understanding. Namely: should I attach an LLM to the RAG agent and let it participate in the group chat directly, or should the other group chat agents trigger the function call associated with the RAG agent? I've solved the issue of the main_userproxy responding with nothing; the cause was that no llm_config was passed to its initialization. I've also switched to using the experimental RetrieveAssistantAgent as one of my agents (without fully understanding the implications, just going off how it was used at the beginning of the RAG tutorial). Here is my updated code:
def create_research_team() -> List[ConversableAgent]:
    agent0 = UserProxyAgent(
        name="main_userproxy",
        human_input_mode="NEVER",
        code_execution_config=False,
        description="Your role is to coordinate the completion of tasks related to generating code based off of machine learning and AI research. You must be diligent and operate in a step by step manner, make use of all the agents at your disposal.",
        llm_config=base_cfg,
    )
    retriever = EmbeddingRetrieverAgent(
        name="info_hoarder",
        human_input_mode="NEVER",
        description="A retrieval augmented agent whose role is to retrieve additional information when asked, you can access an embeddings database with information related to code and research papers.",
        code_execution_config=False,
        # llm_config=base_cfg,
        retrieve_config={
            "task": "qa",
        },
        # max_consecutive_auto_reply=4,
    )
    agent2 = RetrieveAssistantAgent(
        name="code_reviewer",
        description="Agent used to review code, given the information retrieved by the retrieval agent and other information related to the main problem at hand. Review the code generated by the coding_agent to make sure it is executable and logically follows the ideas from the research and source code.",
        llm_config=retrieve_conf,
        is_termination_msg=termination_msg,
        code_execution_config=False,
    )
    agent3 = AssistantAgent(
        name="coding_agent",
        description="A coding agent that is tasked with iteratively generating code based off of the information provided by the retrieval agent and the code designer agent.",
        code_execution_config={"work_dir": "./sandbox", "use_docker": False},
        # function_map={
        #     "execute_and_save": execute_and_save,
        # },
        is_termination_msg=termination_msg,
        llm_config=retrieve_conf,
    )
    return [agent0, retriever, agent2, agent3]

# rc: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/
# rc: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/
def _reset_agents(agents: List[ConversableAgent]) -> None:
    for agent in agents:
        agent.reset()
def init_rag_gc(problem) -> None:
    agent0, retriever, agent2, agent3 = create_research_team()
    _reset_agents([agent0, retriever, agent2, agent3])
    # del agents
    groupchat = GroupChat(
        agents=[agent0, agent2, agent3],
        messages=[],
        max_round=44,
        allow_repeat_speaker=False,
        speaker_selection_method="auto",
    )

    def retrieve_content(
        message, n_results=7, retriever: EmbeddingRetrieverAgent = retriever
    ):
        retriever.n_results = n_results  # Set the number of results to be retrieved.
        # Check if we need to update the context.
        update_context_case1, update_context_case2 = retriever._check_update_context(
            message
        )
        if (update_context_case1 or update_context_case2) and retriever.update_context:
            retriever.problem = (
                message if not hasattr(retriever, "problem") else retriever.problem
            )
            _, ret_msg = retriever._generate_retrieve_user_reply(message)
        else:
            ret_msg = retriever.generate_init_message(message, n_results=n_results)
        return ret_msg if ret_msg else message

    # Register the retrieval function for all agents.
    for agent in [agent0, agent2, agent3]:
        agent.register_function(
            function_map={
                "retrieve_content": retrieve_content,
            }
        )

    manager = GroupChatManager(groupchat=groupchat, llm_config=retrieve_conf)
    agent0.initiate_chat(
        manager,
        message=problem,
    )
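The chat is then kicked off with a problem statement, e.g.:

init_rag_gc(
    "I want to understand the agent tuning paper and come out with a minimal "
    "implementation of some of the core ideas in the paper; the code must be executable."
)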
Here is the retrieve_conf used:
retrieve_conf = {
    **base_cfg,
    "functions": [
        {
            "name": "retrieve_content",
            "description": "retrieve content for code generation and question answering.",
            "parameters": {
                "type": "object",
                "properties": {
                    "message": {
                        "type": "string",
                        "description": "Refined message which keeps the original meaning and can be used to retrieve content for code generation and question answering.",
                    }
                },
                "required": ["message"],
            },
        },
    ],
    "timeout": 60,
    "seed": 42,
}
Thank you for the help, I appreciate it; it definitely allowed me to move forward. There are still some things I am not quite getting, like how passing around the instance of my subclassed RetrieveUserProxyAgent (the retriever variable in this case) should be done correctly, and how I can have more control over when and how often the RAG agent gets triggered. Also, the retriever argument is not included in the parameters of retrieve_conf, but it still behaves normally.
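My current understanding of why it still works: retrieve_content captures the retriever instance as a Python default argument, so the LLM only ever fills in message, which is the one parameter the schema in retrieve_conf describes. Following that logic, one lever for controlling when RAG gets triggered would be to hand the "functions" schema only to the agents that should be able to suggest the call, since register_function only wires up execution. A sketch of that idea, reusing names from my code above (the split itself is my assumption, not something from the tutorial):

# Only the coding agent can *suggest* retrieve_content; the reviewer
# gets a config without the function schema and so can never call it.
can_call_rag_conf = retrieve_conf
no_rag_conf = {k: v for k, v in retrieve_conf.items() if k != "functions"}

agent3 = AssistantAgent(name="coding_agent", llm_config=can_call_rag_conf)
agent2 = RetrieveAssistantAgent(name="code_reviewer", llm_config=no_rag_conf)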
As a side note, in the Advanced Usage of RAG Agents tutorial, the llm_config passed to the group chat manager has a function call in it, which is not allowed; I believe this is an error:
ValueError: GroupChatManager is not allowed to make function/tool calls. Please remove the 'functions' or 'tools' config in 'llm_config' you passed in.
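One way around it, suggested by the error message itself, is to strip the function entries before handing the config to the manager (a sketch using the retrieve_conf and groupchat defined above):

# GroupChatManager must not make function/tool calls itself, so give it
# a copy of the config with those entries removed.
manager_conf = {
    k: v for k, v in retrieve_conf.items() if k not in ("functions", "tools")
}
manager = GroupChatManager(groupchat=groupchat, llm_config=manager_conf)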
For your reference, here is the output of init_rag_gc():
main_userproxy (to chat_manager):
I want to understand the agent tuning paper and come out with a minimal implementation of some of the core ideas in the paper the code must be executable.
coding_agent (to chat_manager):
Suggested function call: retrieve_content
Arguments: {"message": "agent tuning paper"}
EXECUTING FUNCTION retrieve_content...
Adding doc_id 0 to context.
main_userproxy (to chat_manager):
Response from calling function "retrieve_content"
You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly UPDATE CONTEXT.
You must give as short an answer as possible.
User's question is: agent tuning paper
Context is: To fundamentally enable generalized agent abilities for LLMs, we introduce a simple and general approach AgentTuning as shown in Figure 2. AgentTuning consists of two components: a lightweight instruct-tuning dataset AgentInstruct and a hybrid instruction-tuning strategy that enhances the agent’s capabilities while preserving its generalization ability. As shown in Table 1, AgentInstruct covers 1,866 verified interaction trajectories with high-quality Chain-of-Thought (CoT) rationale (Wei et al., 2022b) for each decision step from six diverse agent tasks. For each agent task, one interaction trajectory is collected through three phases: instruction construction, trajectory interaction by employing GPT-4 as the agent, and trajectory filtering depending on its reward score. To enhance LLMs’ agent capabilities while preserving their general abilities, we experiment with a hybrid instruction-tuning strategy. The idea is to mix AgentInstruct with high-quality and general data at a
coding_agent (to chat_manager):
Based on the context about "Agent Tuning," we understand that it is a methodology designed to enhance the capabilities of Large Language Models (LLMs) like GPT-4 for agent tasks while maintaining their general abilities. The two main components of AgentTuning are:
1. AgentInstruct: A lightweight instruct-tuning dataset that includes 1,866 verified interaction trajectories with high-quality Chain-of-Thought rationales for each decision step across six diverse agent tasks.
2. Hybrid Instruction-Tuning Strategy: A method that combines AgentInstruct with high-quality, general data to improve LLMs' agent abilities without compromising their broad applicability.
To create a minimal implementation of some of the core ideas presented in this paper, we can focus on task selection, a mock agent interaction, and a simple reward function.
As a first step, I will provide you with Python code that sets up a task selection and interacts with a text-based LLM. Are you ready to proceed?
main_userproxy (to chat_manager):
Absolutely, let's proceed with a minimal example. However, please keep in mind that as an AI developed by OpenAI, I cannot execute code directly or access external APIs, including interacting directly with GPT-4 outside of OpenAI's user interface. Instead, I will provide you with an illustrative Python code snippet that you can execute in your local environment.
This Python code will simulate the core idea of task selection and generating an instruction given an interaction trajectory:
import random

# Define a set of mock agent tasks
agent_tasks = {
    "translation": ["Translate the following sentence into French: 'Hello, how are you?'"],
    "summarization": ["Summarize this article: 'The history of AI is one...'"],
    "math": ["What is the sum of 17 and 43?"],
    "conversational": ["How would you comfort someone who is feeling sad?"],
}

# Define a simple reward function (just for illustration purposes)
def reward_function(interaction):
    # A mock reward function that randomly assigns a score
    return random.uniform(0, 1)

# Task selection mechanism
selected_task = random.choice(list(agent_tasks.keys()))
instruction = agent_tasks[selected_task][0]

# Interaction trajectory (mock interaction with AI since we can't call GPT-4)
agent_response = f"Mock response to: {instruction}"

# Evaluate interaction
reward = reward_function(agent_response)

print(f"Selected Task: {selected_task}")
print(f"Instruction: {instruction}")
print(f"Agent Response: {agent_response}")
print(f"Reward Score: {reward}")
In the snippet above, we have a reward_function which, in a real scenario, would be more complex and based on specific criteria related to the agent's performance. Please note that to actually interface with GPT-4, you would need to use an API client like OpenAI's 'openai' Python package, and the agent_response would be obtained by querying the model with the instruction.
Lastly, for an actual implementation, task selection would likely consider more factors, including user input, past performance data, and relevance. Moreover, the reward function would be based on precise success metrics.
To execute this code, you will need a Python environment. Simply copy the code into a .py file and run it to see the random task selection and associated instructions in play.
code_reviewer (to chat_manager):
It seems there was confusion regarding my previous message. As an AI assistant, I cannot provide you with the actual output of the code, but I can provide the Python code that encapsulates the ideas you've described.
We can create a minimal example in Python that might look like this:
1. Define a lightweight instruction dataset similar to the AgentInstruct dataset, which will simply be a list of tasks with instructions and expected outcomes.
2. Create a function to randomly select an agent task from this dataset.
3. Create a mock interaction with an LLM, in this case, a function that pretends to query GPT-4 for a response based on the instruction.
4. Define a basic reward function that evaluates the agent response against the expected outcome.
Here's what the Python code could look like:
# filename: agent_tuning_example.py
import random

# Define a simple mock AgentInstruct dataset
agent_instruct = [
    {
        'task': 'translation',
        'instruction': 'Translate "Hello, how are you?" to French.',
        'expected_response': 'Bonjour, comment ça va?'
    },
    {
        'task': 'math',
        'instruction': 'Calculate the product of 9 and 7.',
        'expected_response': '63'
    },
    # ... More tasks with instructions and expected responses
]

def select_task(dataset):
    """Randomly select a task from the dataset."""
    return random.choice(dataset)

def mock_agent_interaction(instruction):
    """Mock function to simulate interaction with an LLM like GPT-4."""
    # This is where the actual API call to GPT-4 would happen.
    # The response is mocked for the purpose of this example.
    # In an actual implementation, you would use OpenAI's API here.
    return 'Mocked response based on instruction'

def reward_function(response, expected_response):
    """Evaluate the agent response and return a reward score."""
    # In an actual scenario, this would involve a sophisticated metric.
    # Here it is simplified to a binary reward for illustration.
    return 1 if response == expected_response else 0

# Agent Tuning process
selected_task = select_task(agent_instruct)
agent_response = mock_agent_interaction(selected_task['instruction'])
reward = reward_function(agent_response, selected_task['expected_response'])

print(f"Task: {selected_task['task']}")
print(f"Instruction: {selected_task['instruction']}")
print(f"Agent Response: {agent_response}")
print(f"Reward: {reward}")
Please copy this code into a file called agent_tuning_example.py and execute it in a Python environment.
Keep in mind that this example omits many details for brevity, such as detailed reward calculations, extensive datasets, and actual interaction with GPT-4 or other models. It's meant to illustrate the conceptual framework of agent tuning as described in the paper.
You can replace the mocked agent response with a call to the API of an appropriate model when you integrate it into a real-world application.
Please go ahead and execute the code, and let me know the result.
main_userproxy (to chat_manager):
As the AI developed by OpenAI, I cannot run or test the code you write, but I can definitely help you with a code review.
Your provided Python code outlines a simplified framework capturing the essence of the Agent Tuning paper. Here are several points of review and considerations:
Comments and Docstrings: Include more detailed docstrings and comments. Even though the code is simple, proper documentation is essential, especially if this were to be expanded or used as a reference.
Mock Functions: The mock_agent_interaction is an excellent placeholder for future API integration. Consider adding a note where the API call would go, along with any necessary placeholder input that would be required.
Reward Function: The reward_function is a binary evaluation metric, which is a good starting point. Depending on the complexity of tasks and the potential variation in the responses, you might want to implement a more nuanced metric eventually.
Error Handling: Consider adding error handling for various parts of your code that might fail in a real-world scenario, such as API call failures.
Testing: While the code cannot be executed in this environment, setting up some form of automated testing is always a good idea to ensure your code behaves as expected.
Expandability: Think about how you could scale this framework. In its current form, adding additional tasks and handling more complex instructions could become cumbersome. Look into making it more modular and scalable.
Real API integration: When you're ready to implement this with an actual LLM, you'll need to replace the mock_agent_interaction with real interactions. Make sure to handle API rate limits, timeouts, and other operational considerations.
Cleaning up the output: If this were to be run as part of a service or a larger application, you would likely want to capture the output in log files or a more user-friendly format rather than just printing to the console.
By addressing these points, your code will be much more robust and ready to be part of a real-world implementation of the Agent Tuning framework. Remember to test the code in your local environment to ensure its functionality.
code_reviewer (to chat_manager):
Thank you for your feedback. Since you are unable to execute the code, the next steps would be for you to execute the code in your own environment and provide me with any output or errors. However, I understand you are not able to do this.
At this point, if there are no further instructions or queries regarding the Agent Tuning paper's implementation or any other topic, we can conclude our interaction.
If you have any other questions or need further assistance with a different topic, feel free to let me know! Otherwise, if everything is clear, kindly let me know, and I will terminate the conversation.
main_userproxy (to chat_manager):
You're correct, I'm unable to execute code. Yet, I'm still here to help with any further questions or guidance you may need. If you're set with the current information and have no additional queries, you can proceed with the implementation in your environment. If anything arises during your development process, or if there's anything else I can help you with, just reach out.
Best of luck with your implementation of the Agent Tuning concepts! If we're done for now, then I'm ready to conclude our interaction. Have a great day!
code_reviewer (to chat_manager):
It seems we have reached the conclusion of our current interaction. If you ever need further assistance, whether it's about implementing AI concepts, understanding papers, or tackling coding challenges, feel free to reach out.
Good luck with your project, and thank you for using OpenAI. Have a wonderful day!
TERMINATE
main_userproxy (to chat_manager):
Thank you. Goodbye, and take care!
(Conversation terminated.)
code_reviewer (to chat_manager):
TERMINATE
Describe the issue
I subclassed RetrieveUserProxyAgent in order to create some custom RAG behavior using my own embeddings, embedding function, etc. When I run the group chat it doesn't behave as I expect: I expect the main UserProxyAgent to pass the problem to the RetrieveUserProxyAgent, which will then respond with relevant docs from the embedding db along with a response, and then rinse and repeat. However, that is not the case; sometimes the RetrieveUserProxyAgent isn't even called. I am sure this is due to some oversight or misconfiguration on my end.

Steps to reproduce
There isn't really a bug or error occurring per se; I will just detail the main parts of the code I am working with. Here is my RetrieveUserProxyAgent subclass:
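The full class is longer, but its shape is roughly the following; query_embeddings_db stands in for my actual embedding-store query and is stubbed out here:

from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

def query_embeddings_db(problem: str, n_results: int) -> dict:
    # Stub standing in for my custom ChromaDB/embedding-function query;
    # the real version returns a chromadb-style results dict.
    return {"ids": [[]], "documents": [[]], "metadatas": [[]]}

class EmbeddingRetrieverAgent(RetrieveUserProxyAgent):
    def retrieve_docs(self, problem: str, n_results: int = 20, search_string: str = ""):
        # Fetch docs with my own embeddings instead of the built-in store.
        results = query_embeddings_db(problem, n_results=n_results)
        # The base class reads the retrieved docs from self._results.
        self._results = results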
I am slightly confused between the use of the query_texts and search_string params and am assuming the search_string param is not required. Other than that, the retrieve_docs functionality works perfectly and gets the relevant docs as expected.

Screenshots and logs
Here is a basic example of my EmbeddingRetrieverAgent interacting with an AssistantAgent:
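In essence it is the standard RetrieveChat two-agent pattern; the names match the log below, while the exact kwargs are approximate:

from autogen import AssistantAgent

asst = AssistantAgent(name="asst", llm_config=base_cfg)
infoman = EmbeddingRetrieverAgent(
    name="infoman",
    human_input_mode="NEVER",
    retrieve_config={"task": "default"},  # approximate; the prompt below is the default-task prompt
)
infoman.initiate_chat(
    asst,
    problem="I want to understand the agent tuning paper and come out with a minimal implementation of some of the core ideas in the paper the code must be executable.",
)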
Output from running the above code:
Adding doc_id 0 to context.
infoman (to asst):
You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the context provided by the user. You should follow the following steps to answer a question: Step 1, you estimate the user's intent based on the question and context. The intent can be a code generation task or a question answering task. Step 2, you reply based on the intent. If you can't answer the question with or without the current context, you should reply exactly UPDATE CONTEXT. If user's intent is code generation, you must obey the following rules: Rule 1. You MUST NOT install any packages because all the packages needed are already installed. Rule 2. You must follow the formats below to write your code:
If user's intent is question answering, you must give as short an answer as possible.
User's question is: I want to understand the agent tuning paper and come out with a minimal implementation of some of the core ideas in the paper the code must be executable.
Context is: To fundamentally enable generalized agent abilities for LLMs, we introduce a simple and general approach AgentTuning as shown in Figure 2. AgentTuning consists of two components: a lightweight instruct-tuning dataset AgentInstruct and a hybrid instruction-tuning strategy that enhances the agent’s capabilities while preserving its generalization ability. As shown in Table 1, AgentInstruct covers 1,866 verified interaction trajectories with high-quality Chain-of-Thought (CoT) rationale (Wei et al., 2022b) for each decision step from six diverse agent tasks. For each agent task, one interaction trajectory is collected through three phases: instruction construction, trajectory interaction by employing GPT-4 as the agent, and trajectory filtering depending on its reward score. To enhance LLMs’ agent capabilities while preserving their general abilities, we experiment with a hybrid instruction-tuning strategy. The idea is to mix AgentInstruct with high-quality and general data at a
asst (to infoman):
Step 1: The user's intent is a code generation task. They want to implement core ideas from a paper on agent tuning, specifically a minimal executable code based on the described AgentTuning approach.
Step 2: To provide a minimal implementation of the core ideas described in the paper, we would need to create a dataset similar to AgentInstruct and implement a hybrid instruction-tuning strategy. However, creating a dataset and a full implementation would be beyond the scope of this platform. Instead, I can provide a simplified example of how one might structure code to simulate the described approach.
The following Python code is a conceptual example that simulates the process of instruction construction, trajectory interaction, and trajectory filtering. It does not implement the actual machine learning or natural language processing algorithms described in the paper, as those would require a much more complex setup and are not feasible to implement here.
This code is a highly simplified representation and does not perform any real instruction tuning or interaction with a language model. It is meant to illustrate the concept of selecting instructions, interacting with an agent, and filtering based on rewards, as described in the context.
infoman (to asst):
Proceed with refining it
asst (to infoman):
UPDATE CONTEXT
infoman (to asst):
TERMINATE
Here is another example run with the same code:
Adding doc_id 0 to context.
infoman (to asst):
You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the context provided by the user. You should follow the following steps to answer a question: Step 1, you estimate the user's intent based on the question and context. The intent can be a code generation task or a question answering task. Step 2, you reply based on the intent. If you can't answer the question with or without the current context, you should reply exactly UPDATE CONTEXT. If user's intent is code generation, you must obey the following rules: Rule 1. You MUST NOT install any packages because all the packages needed are already installed. Rule 2. You must follow the formats below to write your code:
If user's intent is question answering, you must give as short an answer as possible.
User's question is: I want to understand the agent tuning paper and come out with a minimal implementation of some of the core ideas in the paper the code must be executable.
Context is: To fundamentally enable generalized agent abilities for LLMs, we introduce a simple and general approach AgentTuning as shown in Figure 2. AgentTuning consists of two components: a lightweight instruct-tuning dataset AgentInstruct and a hybrid instruction-tuning strategy that enhances the agent’s capabilities while preserving its generalization ability. As shown in Table 1, AgentInstruct covers 1,866 verified interaction trajectories with high-quality Chain-of-Thought (CoT) rationale (Wei et al., 2022b) for each decision step from six diverse agent tasks. For each agent task, one interaction trajectory is collected through three phases: instruction construction, trajectory interaction by employing GPT-4 as the agent, and trajectory filtering depending on its reward score. To enhance LLMs’ agent capabilities while preserving their general abilities, we experiment with a hybrid instruction-tuning strategy. The idea is to mix AgentInstruct with high-quality and general data at a
asst (to infoman):
Step 1: The user's intent is to generate code for a minimal implementation of the core ideas from the "agent tuning" paper mentioned. This involves a code generation task that requires understanding the user's description of the paper's approach and translating it into executable code.
Step 2: To provide a minimal executable implementation based on the provided context, I would need to code a simple version of the AgentInstruct dataset creation and hybrid instruction-tuning strategy. Given that there's no specific information about the dataset structure or the tuning strategy's implementation details, I'll outline pseudo-code for the described process. However, for a functional code, more specific details from the paper would be required.
The first step is to create a dataset with interaction trajectories, which seems to involve some sort of instruction building, a GPT-4 agent interaction, and trajectory filtering based on rewards. Next, the paper mentions a hybrid tuning strategy, which likely involves mixing the specific AgentInstruct dataset with a more general dataset and then performing some form of training with this hybrid dataset.
Here is a high-level pseudo-code outline based on the context provided:
Keep in mind that this pseudo-code is heavily based on the provided context and lacks specific implementation details, which would be needed to make it executable. If you can provide more details or access to the paper, I can give a more accurate implementation.
infoman (to asst):
This is a good start go ahead and retrieve further information, you have some code embeddings from the agent tuning repository, use them to refine the generated python code.
asst (to infoman):
UPDATE CONTEXT
infoman (to asst):
please proceed
asst (to infoman):
To proceed with refining the generated Python code, I need access to the code embeddings from the Agent Tuning repository as mentioned. However, as an AI developed by OpenAI, I don't have the ability to access external resources such as repositories or databases directly. My functionality is limited to providing information and generating code based on the details you provide.
To move forward, you would need to provide the code embeddings or any specific details from the repository that you want to include in the code. Without these details, I can't refine the code further. Please share the relevant information or code snippets that you want to be incorporated into the Python code.
Moving on to the main issue I am facing, which is getting the RAG group chat working. I hope I was able to detail the current state of my logic clearly, and I apologize for the somewhat lengthy issue. Here is the main code for initializing the RAG group chat I am working with:
docs_path is not provided in retrieve_config. Will raise ValueError if the collection autogen-docs doesn't exist. Set docs_path to None to suppress this warning.
Adding doc_id 0 to context.
info_hoarder (to chat_manager):
You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the context provided by the user. If you can't answer the question with or without the current context, you should reply exactly UPDATE CONTEXT. You must give as short an answer as possible.
User's question is: Summarize the agent tuning paper for me and reproduce a minimal example from the huggingface mmlu evaluation file.
Context is: We present AgentTuning to achieve this goal, the first step of which is to build the AgentInstruct dataset that is used in the second step to instruction tune the LLMs. We carefully experiment and design these two steps such that the LLMs obtain good performance in (unseen) generalized agent task types while remaining good in general LLM tasks.
[Footnote 1: The evaluation process of Mind2Web follows the teacher forcing method, decomposing the complete interaction trajectory into multiple single steps. As a result, the real trajectory length is always 1.]
[stray PDF table fragments (ratio percentages) omitted]
2.1 CONSTRUCTING AGENTINSTRUCT
code_designer (to chat_manager):
AgentTuning uses AgentInstruct dataset for instruct tuning LLMs, improving task performance while maintaining LLM capability.
Minimal example from Hugging Face's MMLU evaluation not provided in context. UPDATE CONTEXT
main_userproxy (to chat_manager):

Additional Information
python -c "import autogen; print(autogen.__version__)": 0.2.2

My agent_conf.py file, in case there is an issue or something I am missing there:

import os

from autogen import config_list_from_models

config_list = [
    {
        "model": "gpt-4-1106-preview",
        "api_key": os.environ.get("OPENAI_APIKEY"),
    },
    {
        "model": "gpt-3.5-turbo",
        "api_key": os.environ.get("OPENAI_APIKEY"),
    },
    {
        "model": "gpt-3.5-turbo-16k",
        "api_key": os.environ.get("OPENAI_APIKEY"),
    },
]
base_cfg = {
    "use_cache": False,
}
exec_py_conf = {
    **base_cfg,
    "functions": [
        {
            "name": "exec_py",
            "description": "Execute generated python code",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "Python str to be executed.",
                    }
                },
                "required": ["code"],
            },
        }
    ],
}

write_file_config = {
    **base_cfg,
    "functions": [
        {
            "name": "write_file",
            "description": "Save accepted python code to file",
            "parameters": {
                "type": "object",
                "properties": {
                    "fname": {
                        "type": "string",
                        "description": "The name of the file to write",
                    },
                    "content": {
                        "type": "string",
                        "description": "The content of the file to write",
                    },
                },
                "required": ["fname", "content"],
            },
        }
    ],
}
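For reference, these configs get wired the same way as retrieve_conf above: the schema goes into an assistant's llm_config and the implementation is registered on the executing agent. exec_py below is a placeholder stub, not my real function:

from autogen import AssistantAgent, UserProxyAgent

def exec_py(code: str) -> str:
    # Placeholder: my real function executes the code in a sandbox.
    return f"would execute:\n{code}"

coder = AssistantAgent(name="coder", llm_config=exec_py_conf)
runner = UserProxyAgent(name="runner", human_input_mode="NEVER")
runner.register_function(function_map={"exec_py": exec_py})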