crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License
20.41k stars, 2.82k forks

Unable to properly run PythonREPL as code interpreter #478

Closed: sunxingshu closed this issue 2 months ago

sunxingshu commented 6 months ago

Hey all

I am doing a small exercise in which I ask CrewAI to write and test Python code based on a user's input.

However, CrewAI seems to have trouble executing the Python code generated by the LLM.

I am using Gemini Pro via ChatVertexAI from langchain_google_vertexai.

Can anyone help me?

Code:

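from textwrap import dedent
from crewai import Agent, Task, Crew
from langchain.tools import tool
from langchain_experimental.utilities import PythonREPL
from langchain_google_vertexai import ChatVertexAI
from dotenv import load_dotenv
from typing import Annotated

# Load credentials and settings from a local .env file
load_dotenv()

# LLM shared by both agents (Gemini Pro via ChatVertexAI, as described above;
# the exact model name is an assumption)
llm = ChatVertexAI(model_name="gemini-pro")

# Initialize the Python REPL tool
repl = PythonREPL()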

@tool
def pythontool(
    code: Annotated[str, "The python code to execute to generate your chart."]
):
    """Use this to execute python code. If you want to see the output of a value,
    you should print it out with `print(...)`. This is visible to the user."""
    try:
        result = repl.run(code)
    except BaseException as e:
        return f"Failed to execute. Error: {repr(e)}"
    return f"Successfully executed:\n```python\n{code}\n```\nStdout: {result}"

requirement_description = 'Do a scatter plot of randomly generated numbers'

developer_agent = Agent(
            role='Software Developer',
            goal='Develop Python scripts based on user requirements.',
            backstory=dedent("""
                A skilled Software Developer proficient in Python and experienced in developing dynamic applications."""),
            tools=[pythontool],
            allow_delegation=False,
            llm=llm,
            verbose=True
        )

tester_agent = Agent(
            role='Software Tester',
            goal='Test Python scripts to ensure it functions according to requirements.',
            backstory=dedent("""
                A detail-oriented Software Tester focused on identifying and resolving bugs, ensuring software quality."""),
            tools=[pythontool],
            #allow_delegation=True,
            llm=llm, 
            verbose=True
        )

development_task = Task(
            description=(f"""
                Develop Python codes based on the following requirements:
                {requirement_description}
                Use the 'pythontool' to execute and test the codes."""),
            expected_output='Your Final answer must be the full python code, only the python code and nothing else.',
            agent=developer_agent,  
            tools=[pythontool],
            verbose=True
        )

testing_task = Task(
            description=(f"""
                Test Python codes produced by the Software Developer agent to ensure it meets the given specifications.
                {requirement_description}
                Use the 'pythontool' to execute and test the codes."""),
            expected_output='Your Final answer must be the full python code, only the python code and nothing else.',
            agent=tester_agent,  
            tools=[pythontool],
            verbose=True
        )

print("## Welcome to the Software Development Crew")
print('-------------------------------------------')

# Initialize Crew
development_crew = Crew(
    agents=[
        developer_agent,
        tester_agent
    ],
    tasks=[
        development_task,
        testing_task
    ],
    verbose=2
)

# Kick off the development process
development_results = development_crew.kickoff()

# Output the results
print("\n\n########################")
print("## Development Results")
print("########################\n")
print("Detailed results of the development process:")
print(development_results)

Output:

## Welcome to the Software Development Crew
-------------------------------------------
 [DEBUG]: == Working Agent: Software Developer
 [INFO]: == Starting Task: 
                Develop Python codes based on the following requirements:
                Do a scatter plot of randomly generated numbers
                Use the 'pythontool' to execute and test the codes.

> Entering new CrewAgentExecutor chain...
Thought: I need to generate Python code to create a scatter plot of randomly generated numbers.

Action: pythontool

Action Input: 
python
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
x = np.random.rand(100)
y = np.random.rand(100)

# Create scatter plot
plt.scatter(x, y)

# Show plot
plt.show()

Thought: I need to fix the Action Input to be a valid key, value dictionary.

Action: pythontool

Action Input: 
python
{
"code": "import matplotlib.pyplot as plt\nimport numpy as np\n\n# Generate random data\nx = np.random.rand(100)\ny = np.random.rand(100)\n\n# Create scatter plot\nplt.scatter(x, y)\n\n# Show plot\nplt.show()"
}

Thought: I need to fix the Action Input to be a valid key, value dictionary.

Action: pythontool

Action Input: 
python
{
"code": "import matplotlib.pyplot as plt\nimport numpy as np\n\n# Generate random data\nx = np.random.rand(100)\ny = np.random.rand(100)\n\n# Create scatter plot\nplt.scatter(x, y)\n\n# Show plot\nplt.show()"
}

Thought: I need to fix the Action Input to be a valid key, value dictionary.

Action: pythontool

Action Input: 
python
{
"code": "import matplotlib.pyplot as plt\nimport numpy as np\n\n# Generate random data\nx = np.random.rand(100)\ny = np.random.rand(100)\n\n# Create scatter plot\nplt.scatter(x, y)\n\n# Show plot\nplt.show()"
}

Thought: I need to fix the Action Input to be a valid key, value dictionary.

From this point on, it just keeps repeating "Thought: I need to provide a valid key, value dictionary as input to the pythontool."
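
One pattern stands out in the log above: every Action Input starts with a bare `python` line, which looks like the remains of a Markdown code fence emitted by the model. If the framework expects the Action Input to parse as a plain key/value dictionary, that leading tag alone could be enough to make parsing fail. The sketch below is only an illustration of that idea, not crewAI's actual parser: `normalize_tool_input` is a hypothetical helper that strips a surrounding fence or a leftover language tag and unwraps a `{"code": ...}` payload.

```python
import json
import re

# Hypothetical helper; not part of crewAI or LangChain.
# Matches a whole string wrapped in a Markdown fence such as ```python ... ```.
_FENCE_RE = re.compile(r"^```[A-Za-z0-9_+-]*[ \t]*\n(.*?)\n?```\s*$", re.DOTALL)


def normalize_tool_input(raw: str) -> str:
    """Return plain Python source from a raw Action Input string."""
    text = raw.strip()

    # 1. Remove a surrounding Markdown fence, if present.
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1).strip()
    elif text.startswith("python\n"):
        # Fence markers already lost, bare language tag left behind
        # (the shape seen in the log).
        text = text[len("python\n"):].strip()

    # 2. If what remains is a JSON object with a string "code" key, unwrap it.
    if text.startswith("{"):
        try:
            payload = json.loads(text)
        except json.JSONDecodeError:
            payload = None
        if isinstance(payload, dict) and isinstance(payload.get("code"), str):
            return payload["code"]

    # 3. Otherwise treat the text as the code itself.
    return text


# The same shape as the second Action Input in the log, shortened.
logged = 'python\n{\n"code": "import numpy as np\\nprint(np.pi)"\n}'
print(normalize_tool_input(logged))
```

Where such a cleanup would have to run is an open question. Calling it inside `pythontool` only helps if the tool is invoked at all, and the log suggests the input is rejected before that point. Tightening the task description so the model emits a bare dictionary without a code fence may be the more direct thing to try.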

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 2 months ago

This issue was closed because it has been stalled for 5 days with no activity.