Auto-GPT makes invalid code edits, replacing code files, creating duplicate test files, etc.

davefellows commented 1 year ago

Duplicates

[X] I have searched the existing issues

Steps to reproduce 🕹

Run with the following prompts:

python scripts/main.py --gpt3only
GPT3.5 Only Mode: ENABLED
Welcome back! Would you like me to return to being A high skilled software engineer?
Continue with the last settings? Y
Name: A high skilled software engineer
Role: A software engineer that will develop a python module with various functions for working with circles, for example, using math.pi to calculate a circle's circumference and other useful functions. The python module should be created in circle.py file
Goals: ['Populate circle.py file with functions to make it easy to work with circles', 'Ensure circle.py is a complete and working python file', 'Create unit tests for the various functions ensuring good code coverage', 'build and make sure tests run successfully', 'Be sure to write the code to the circle.py file in the current directory']

Current behavior 😯

Correctly added functions to circle.py initially but then made an erroneous edit replacing all the contents:

Initially created circles_tests.py in the root folder but then got confused when trying to look for a tests file in /tests. Ended up creating two additional test files:

One of the test files is invalid (and the editing/layout is a bit off):

There was also an erroneous append to one of the test files (similar to the one in circle.py), but I don't have this edit in the current state of my files.

I killed the process after some time as it was clear it was struggling.

The only other observation is that the process took a long time. Some of the iterations seem redundant, and it takes one or more iterations just to add a single function.

Expected behavior 🤔

Code edits and appends should be valid. In this case, a single test file should have been created, not three.
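
To make "valid" concrete, here is a minimal sketch of a guard that refuses to overwrite a Python file when the proposed contents do not parse. This is not how Auto-GPT writes files; `is_valid_python` and `safe_write` are hypothetical names used only for illustration, and the check covers syntax only, not whether the edit preserves existing functions.

```python
import ast
from pathlib import Path


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python (a syntax check only)."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # ValueError can be raised for source containing null bytes
        # on some Python versions.
        return False
    return True


def safe_write(path: Path, new_source: str) -> bool:
    """Hypothetical helper: overwrite `path` only if `new_source` parses.

    Returns True if the file was written, False if the edit was rejected.
    """
    if not is_valid_python(new_source):
        return False
    path.write_text(new_source, encoding="utf-8")
    return True
```

A guard like this would reject a syntactically broken replacement of circle.py, but it would not catch an edit that parses cleanly and still drops previously written functions.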

Your prompt 📝

No response

Pwuts commented 1 year ago

This is a problem with the LLM (GPT-3.5), for which Auto-GPT is just a wrapper.

davefellows commented 1 year ago

The erroneous code edits (e.g. where it mistakenly replaces existing code in an invalid way when it's inserting a new function) are caused by GPT-3.5? That's kind of surprising...
