Handling runtime errors in code produced by an agent

maxim-saplin commented 11 months ago

What is the right way to make agents debug and fix the code they produce?

In my code I prompt an agent to write a Python script.

Actual: The user proxy fails to take care of installing Python dependencies and halts when running the code.

Expected: The user proxy agent gets the error description of the failing code and either takes care of running pip install or asks the agent to start with preparation instructions.

Code

```python
import autogen

# Load model endpoints from config_list.json, keeping only the listed models
config_list = autogen.config_list_from_json(
    "config_list.json",
    file_location=".",
    filter_dict={
        "model": {
            # "gpt35x16k",
            "gpt4x8k",
        }
    },
)

llm_config = {
    "request_timeout": 600,
    "seed": 42,
    "config_list": config_list,
    "temperature": 0,
}

assistant = autogen.AssistantAgent(
    name="CTO",
    llm_config=llm_config,
    system_message="Chief technical officer of a tech company",
)

user_proxy = autogen.UserProxyAgent(
    name="smax",
    human_input_mode="TERMINATE",
    max_consecutive_auto_reply=10,
    # endswith() is case-sensitive, so match the TERMINATE keyword from the system message
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={"work_dir": "_output"},
    llm_config=llm_config,
    system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet.""",
)

task = """
1. Write Python script that saves the contents of the article `https://dev.to/maximsaplin/efficient-dart-part-2-going-competitive-307c` to a .txt file
2. Then the script counts words and builds a histogram (words sorted by frequency), saves results to another .txt file
3. Check the produced histogram and try guessing what the article is about

Use Python. If there're any dependencies required please specifically tell me how to get them installed.
"""

user_proxy.initiate_chat(assistant, message=task)
```

Output

smax (to CTO):

1. Write Python script that saves the contents of the article `https://dev.to/maximsaplin/efficient-dart-part-2-going-competitive-307c` to a .txt file
2. Then the script counts words and builds a histogram (words sorted by frequency), saves results to another .txt file
3. Check the produced histogram and try guessing what the article is about

Use Python. If there're any dependencies required please specifically tell me how to get them installed.

--------------------------------------------------------------------------------
CTO (to smax):

Here is a Python script that does what you asked for. This script uses the `requests`, `BeautifulSoup`, `collections` and `matplotlib` libraries. If you don't have these libraries installed, you can install them using pip:

```
pip install requests beautifulsoup4 matplotlib
```

Here is the Python script:

```python
import requests
from bs4 import BeautifulSoup
from collections import Counter
import matplotlib.pyplot as plt
import re

# Step 1: Save the contents of the article to a .txt file
url = 'https://dev.to/maximsaplin/efficient-dart-part-2-going-competitive-307c'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
article_text = soup.find('div', class_='crayons-article__content').get_text()

with open('article.txt', 'w') as f:
    f.write(article_text)

# Step 2: Count words and build a histogram, save results to another .txt file
with open('article.txt', 'r') as f:
    words = re.findall(r'\b\w+\b', f.read().lower())

word_count = Counter(words)

with open('histogram.txt', 'w') as f:
    for word, count in word_count.most_common():
        f.write(f'{word}: {count}\n')

# Step 3: Check the produced histogram and try guessing what the article is about
# This step is subjective and depends on the person analyzing the histogram.
# However, you can look at the most common words and their context to get an idea of the article's content.
```

This script first saves the article content to a file named `article.txt`. Then it reads this file, counts the occurrences of each word and saves this information to a file named `histogram.txt`. The words in the histogram are sorted by their frequency.

Please note that this script does not remove stop words (common words like 'the', 'is', 'in', etc.) from the text before counting the words. If you want a more accurate histogram, you might want to remove these words before counting.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is sh)...

sonichi commented 11 months ago

Could you try the default AssistantAgent with no system message? It will use a default system message which produces code blocks that can be consumed by the UserProxyAgent, including the pip install block.
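To illustrate what "code blocks that can be consumed" means here, the following is a simplified sketch of pulling fenced blocks and their language tags out of an assistant reply. It is not AutoGen's actual implementation: `extract_blocks`, `CODE_BLOCK` and the sample `reply` are hypothetical names made up for this example.

```python
import re

# Build the fence marker programmatically so this example can itself sit in a fenced block.
FENCE = "`" * 3

# An opening fence with an optional language tag, the body, then a closing fence.
CODE_BLOCK = re.compile(FENCE + r"(\w*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_blocks(message):
    """Return (language, code) pairs; language is '' when the fence is untagged."""
    return [(lang, code.strip()) for lang, code in CODE_BLOCK.findall(message)]


# Hypothetical assistant reply: a shell block for dependencies, then the script.
reply = "\n".join([
    "Install the dependencies first:",
    FENCE + "sh",
    "pip install requests beautifulsoup4",
    FENCE,
    "Then run the script:",
    FENCE + "python",
    "print('hello')",
    FENCE,
])

for lang, code in extract_blocks(reply):
    print(lang or "(untagged)", "->", code)
```

A reply shaped like this lets the executing side run the `sh` block before the `python` block, which is why the wording of the assistant's system message matters.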

maxim-saplin commented 11 months ago

> Could you try the default AssistantAgent with no system message? It will use a default system message which produces code blocks that can be consumed by the UserProxyAgent, including the pip install block.

Same problem. This:

assistant = autogen.AssistantAgent(
    name="CTO",
    llm_config=llm_config,
    #system_message="Chief technical officer of a tech company"
)

Gives this:

maxim-saplin commented 11 months ago

Though it seems the problem might be with Docker sandboxing... I have Docker Desktop running and have also run pip3 install docker. Will investigate further.

sonichi commented 11 months ago

The default docker image doesn't allow pip install. To allow for pip install, set use_docker="python:3" or other image name which supports it.

thinkall commented 11 months ago

> The default docker image doesn't allow pip install. To allow for pip install, set use_docker="python:3" or other image name which supports it.

Why not set "python:3" as the default image?

sonichi commented 11 months ago

> > The default docker image doesn't allow pip install. To allow for pip install, set use_docker="python:3" or other image name which supports it.
>
> Why not set "python:3" as the default image?

safer :)

maxim-saplin commented 11 months ago

> The default docker image doesn't allow pip install. To allow for pip install, set use_docker="python:3" or other image name which supports it.

Thanks! It solved the issue for me! Though I spent about 20 minutes browsing the docs and searching for the exact place to put the value :)

I've created a PR with FAQ doc updated, seems like the problem is common: https://github.com/microsoft/autogen/pull/269
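For anyone looking for the same spot: a minimal sketch of where the value goes, assuming the AutoGen version discussed in this thread, where `use_docker` is a key of the `code_execution_config` dict passed to `UserProxyAgent`. The agent construction is left as a comment so the snippet runs without AutoGen installed.

```python
# Execution settings for the user proxy. "work_dir" matches the original script;
# "use_docker" names an image that allows pip install, as suggested above.
code_execution_config = {
    "work_dir": "_output",
    "use_docker": "python:3",
}

# Assumed usage, mirroring the script from the first post (requires autogen):
# user_proxy = autogen.UserProxyAgent(
#     name="smax",
#     human_input_mode="TERMINATE",
#     code_execution_config=code_execution_config,
#     llm_config=llm_config,
# )

print(code_execution_config)
```

Any other image name that supports pip install can be substituted for `python:3`.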

microsoft / autogen

Handling runtime errors in code produced by an agent #257