agentcoinorg / evo.ninja

A versatile generalist agent.
MIT License

Benchmarks: AnswerQuestionSmallCsv #257

Closed cbrzn closed 11 months ago

cbrzn commented 11 months ago

Overview

Challenge Info Evo Logs [1] [2]

Observations

Evo doesn't know where to get the information, even though it is receiving a CSV file. It explicitly states that it doesn't know where to get the data from. Maybe Evo should look into the workspace to see whether something there might be useful? :thinking:

krisbitney commented 11 months ago

The task is "How much was spent on utilities in total ? Write the answer in an output.txt file." and there is a file called file1.csv available. I asked in the AutoGPT Discord, and they said they intentionally left the filename out of the task.

This is difficult for Evo. Evo consistently looks for a script like "utilities.total", and then tries to create one. Evo typically hallucinates a filename like utilities.csv and then fails to read it.

I attempted to resolve this by adjusting the prompt so that Evo understands it needs to search for the file. I also tried adding a planning section to the prompt:

`Before your first response, you must take a deep breath, and make a plan.\n` +
`Tersely list the data you will need to achieve the goal and how you will obtain it.\n` +
`You can assume the data you need will be given to you, will be in a file or will be on the internet.\n` +
`You may need to search for the correct file or url.\n` +
`Next write a list of steps you will need to take to achieve the goal.\n` +
`Complete any additional planning steps you believe will help.\n` +
`Then try to accomplish the goal.\n` +
`You are a remarkable expert in planning and goal achievement. Do not ask the user for help.`;

And I added a script fs.searchFilesByContent:

{
  "name": "fs.searchFilesByContent",
  "description": "Searches through text-based files (e.g., .txt, .csv) to identify those that contain the keyword and returns a list of their filepaths.",
  "arguments": "{ keyword: string }",
  "code": "./fs.searchFilesByContent.js"
}
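
For illustration, here is a minimal sketch of what a script matching that description might do. It is not the actual contents of fs.searchFilesByContent.js, which are not shown in this thread. It assumes plain Node.js `fs` access rather than Evo's workspace abstraction, takes a hypothetical `dir` argument in addition to `keyword`, matches case-insensitively, and treats only .txt and .csv as text-based files.

```javascript
const fs = require("fs");
const path = require("path");

// Assumption: only the two extensions named in the script description count as text-based.
const TEXT_EXTENSIONS = new Set([".txt", ".csv"]);

// Recursively collect the paths of text-based files under `dir`
// whose contents include `keyword` (case-insensitive).
// `dir` is a hypothetical parameter; the script in the issue only takes `keyword`.
function searchFilesByContent(dir, keyword) {
  const needle = keyword.toLowerCase();
  const matches = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into subdirectories and merge their matches.
      matches.push(...searchFilesByContent(fullPath, keyword));
    } else if (TEXT_EXTENSIONS.has(path.extname(entry.name).toLowerCase())) {
      const content = fs.readFileSync(fullPath, "utf8");
      if (content.toLowerCase().includes(needle)) {
        matches.push(fullPath);
      }
    }
  }
  return matches;
}
```

Under these assumptions, calling `searchFilesByContent(".", "utilities")` from the workspace root would return the path of every .txt or .csv file there that mentions utilities, which is the lookup the task needs before any total can be computed.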

But I haven't been able to get Evo to look for the file.