Closed irgolic closed 1 year ago
Just wanted to add on my fail state as I don't yet have access to 4.0. I switched to `gpt-3.5-turbo` and received an error about my monorepo:
I think in my case, given the complexity of the repository I'm trying to analyze, I'll just need to wait for 4.0... I'm asking it to specifically drill down into a sub-directory of the repo, target a specific version of a framework, and try to uncover missing/incomplete code.
My guess is the overarching request was too verbose (targeting an entire project and not, say, one specific controller or action) which in turn correlates to too many tokens.
I'm confident AutoPR can be optimized to work well with `gpt-3.5-turbo`.

## Background
As it stands, the `rail-v1` pull request agent works fine with `gpt-3.5-turbo`. However, the codegen agent does not.

`rail-v1` codegen may work, but suffers from the same problems that it does when used by `gpt-4`: it produces an unreliable diff with ambiguous errors.

My hope is to use `auto-v1` codegen, which creates unambiguous changes.

## Action selection
`gpt-3.5-turbo` is now reliable at selecting actions.

Write-up on action selection:
> ## How `auto-v1` codegen works
>
> Here is an example of a successful action selection call:
>
> Prompt
>
> ~~~
> You are about to make a decision on what to do next.
>
> This is the issue that was opened:
> ```#20 Create a dice rolling bot
>
> irgolic: In the `dice_rolling_bot` folder, develop a Discord bot that allows users to roll virtual dice by typing a command (e.g., "!roll 1d6"). The bot should support various dice types (e.g., d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. Include a README.md file with information on the bot's features, setup instructions, and usage guidelines.
>
> ```
>
> This is the pull request we're creating:
> ```Title: Feature/dice-rolling-bot: Add Dice Rolling Bot Functionality
>
> This PR resolves issue #20, adding a dice rolling bot to the `dice_rolling_bot` folder. This Discord bot allows users to roll virtual dice by typing a command, such as "!roll 1d6". The bot supports various dice types (d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. A README.md file has been included with information on the bot's features, setup instructions, and usage guidelines.
>
> Commits:
> 1. Create dice_rolling_bot.py
> 2. Add README.md for dice rolling bot
>
> Please review the changes and let me know if any modifications are required.
>
> 1. Commit: Create dice_rolling_bot.py
> Files: dice_rolling_bot/dice_rolling_bot.py:L1-L23
> Changes:
> - Created a new file `dice_rolling_bot.py`
> - Imported required libraries (random and discord)
> - Implemented the main functionality for the bot to roll dice based on user input
> 2. Commit: Add README.md for dice rolling bot
> Files: dice_rolling_bot/README.md:L1-L32
> Changes:
> - Created a new file `README.md`
> - Added setup instructions for the bot
> - Provided usage guidelines and examples for the dice rolling bot
> ```
>
> This is the commit we're writing:
> ```Create dice_rolling_bot.py
>
> - Created a new file `dice_rolling_bot.py`
> - Imported required libraries (random and discord)
> - Implemented the main functionality for the bot to roll dice based on user input```
>
> This is the codebase subset we decided to look at:
> ``````
>
> These are the actions we've taken so far:
> ``````
>
> In our pursuit of implementing this commit, please make a decision on what to do next.
> If a new file should be created, respond with `new_file` and the file path and description.
> If one of the files in the codebase subset should be edited, respond with `edit_file` and the file path, description, and line numbers of the hunk to edit. When picking the line numbers, keep the hunk as short as possible while fulfilling the description. If you don't know which line numbers to pick, leave them blank.
> If you're done, respond with `finished` and the commit message describing the past actions.
> ~~~
>
> Given below is XML that describes the information to extract from this document and the tags to extract it into.
>
> ```xml
>
> ```
>
> ONLY return a valid JSON object (no other text is necessary), where the key of the field in JSON is the `name` attribute of the corresponding XML, and the value is of the type specified by the corresponding XML's tag. The JSON MUST conform to the XML format, including any types and format requests e.g. requests for lists, objects and specific types. Be correct and concise.
>
> Here are examples of simple (XML, JSON) pairs that show the expected behavior:
> - `` => `{"bar": ['STRING ONE', 'STRING TWO', etc.]}`
> - `` => `{'baz': {'foo': 'Some String', 'index': 1}}`

## File creation and editing
### How `auto-v1` codegen works

Here's an example of a successful new file generation:
Notice that the LLM reflects on its change in the "outcome" JSON key. This is fed back into subsequent actions to inform what's been done so far.
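To make the feedback loop concrete, here is a minimal sketch of how an "outcome" value might be carried into later prompts. It is not AutoPR's actual implementation: `ActionLog`, its methods, and the sample outcome text are hypothetical names and data chosen for illustration. The only details taken from the write-up are the "outcome" JSON key and the "actions we've taken so far" prompt section.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ActionLog:
    """Hypothetical record of past actions (not AutoPR's real data model)."""

    entries: list[str] = field(default_factory=list)

    def record(self, action: str, filepath: str, metadata_json: str) -> None:
        # Pull the model's own summary of its change out of the metadata JSON.
        metadata = json.loads(metadata_json)
        outcome = metadata.get("outcome", "(no outcome reported)")
        self.entries.append(f"{action} {filepath}: {outcome}")

    def render(self) -> str:
        # Text that could be placed in the next prompt's
        # "These are the actions we've taken so far" section.
        return "\n".join(self.entries)


log = ActionLog()
log.record(
    "new_file",
    "dice_rolling_bot/dice_rolling_bot.py",
    '{"outcome": "Created the bot with a !roll command."}',  # made-up sample
)
print(log.render())
```

Keeping the log as plain text lines is one possible design; it keeps the next prompt short, at the cost of dropping any other metadata the model returned.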
Here's an example of a successful file edit:
Notice how the hunk to be rewritten is prefixed with `*`, with some surrounding lines of context.

## What's preventing gpt-3.5-turbo from working properly?
It's really bad at generating something in the format of:
It commonly drops the closing backtick of the code, leaves the `<code>` empty, or repeats the description of the content instead of writing actual code. Omitting the trailing metadata and adding some relevant few-shot examples will probably improve performance by a lot.

Also, it's really bad at following the instruction to rewrite only the code prefixed by `*`. It will just give you the full file or include the context lines.

An example exhibiting both these qualities:
## What can we do about it?
Here are some ideas:
- Prompt `gpt-3.5-turbo` to generate a code block enclosed by backticks, without the trailing metadata JSON. Alternatively, try using specific few-shot examples.
- Restrict `gpt-3.5-turbo`'s file edit action to only rewrite whole files.
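If the model is asked for nothing but a backtick-fenced code block, the parsing side could be made tolerant of the failure modes described earlier (a missing closing fence, or an empty block). The following is a rough sketch under that assumption, not AutoPR's parser; `extract_code` and its return convention are hypothetical.

```python
import re
from typing import Optional

# Opening fence with an optional language tag, e.g. ``` or ```python
_FENCE_OPEN = re.compile(r"^```[\w+-]*[ \t]*$", re.MULTILINE)
# Closing fence: a line holding only three backticks
_FENCE_CLOSE = re.compile(r"^```[ \t]*$", re.MULTILINE)


def extract_code(response: str) -> Optional[str]:
    """Return the body of the first fenced block, or None if unusable."""
    opening = _FENCE_OPEN.search(response)
    if opening is None:
        return None  # the model wrote no code block at all
    body = response[opening.end():]
    # Use the closing fence if present; if the model dropped it,
    # fall back to everything after the opening fence.
    closing = _FENCE_CLOSE.search(body)
    if closing is not None:
        body = body[:closing.start()]
    body = body.strip("\n")
    # Treat an empty block as a failure so the caller can retry.
    return body or None


# A response whose closing fence was dropped still yields the code.
truncated = "Sure:\n```python\nimport random\nprint(random.randint(1, 6))\n"
print(extract_code(truncated))
```

Returning `None` for both "no block" and "empty block" lets a caller handle every failure with a single retry path. Whether a retry, a few-shot prompt, or a whole-file rewrite works best for `gpt-3.5-turbo` would need to be tested.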