Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
https://agpt.co

still looping. #3855

Closed GreyEnt closed 1 year ago

GreyEnt commented 1 year ago

⚠️ Search for existing issues first ⚠️

Which Operating System are you using?

MacOS

Which version of Auto-GPT are you using?

Latest Release

GPT-3 or GPT-4?

GPT-4

Steps to reproduce 🕹

Scrape Google for references to xxxx and create a text document called xxxx; update it as you find results.

Current behavior 😯

Loops endlessly checking the integrity of the reference document, doesn't store any info, and just keeps looping.

Expected behavior 🤔

A text doc containing info about the subject.

Your prompt 📝

No response

Your Logs 📒

No response

bruno90x commented 1 year ago

Me too. It loops a lot on different tasks. Am I asking for things that are too complicated?

ozzymuppet commented 1 year ago

Me three. I was optimistic when I saw them claim V3 worked reliably, but I found it just as hopeless as previous versions at doing a basic search for information and writing it to a text file. It overcomplicates everything, gets into a loop about unrelated specifics, and eventually errors out.

eyalk11 commented 1 year ago

Do you use Redis? Does it keep using the same index? BTW, my pull request https://github.com/Significant-Gravitas/Auto-GPT/pull/3914 deals with looping by stopping when it happens (the user can then direct it to do something else).
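For illustration only, here is a minimal Python sketch of the general idea of stopping when a loop is detected. It is not the code from the pull request above: `LoopDetector`, `window`, and `max_repeats` are hypothetical names, and the sketch assumes each agent step can be reduced to a command name plus a dict of JSON-serializable arguments.

```python
import json
from collections import deque


class LoopDetector:
    """Hypothetical helper: flags an action that keeps recurring in recent history."""

    def __init__(self, window: int = 10, max_repeats: int = 2):
        # Only the last `window` actions are remembered.
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def record(self, command: str, arguments: dict) -> bool:
        """Store the action; return True if it now looks like a loop."""
        # Serialize the arguments with sorted keys so equal dicts compare equal.
        key = (command, json.dumps(arguments, sort_keys=True))
        self.recent.append(key)
        return self.recent.count(key) > self.max_repeats


detector = LoopDetector()
urls = [
    "https://a.example",
    "https://b.example",
    "https://a.example",
    "https://a.example",
]
# One flag per step; a True value is where the agent would pause and ask the user.
flags = [detector.record("browse_website", {"url": u}) for u in urls]
```

With the default `max_repeats=2`, the third visit to the same URL is the first step that gets flagged. What happens next (pause, ask for feedback, abort) is a separate decision this sketch leaves open.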

Mitman93 commented 1 year ago

No, it's not just you. I will +1 that they royally screwed up this project somehow. I've been messing around with it for a little over a month now, and in my experience the best performing version by far has been 0.2.2 + unsandboxing the file_operations.py. With that version combined with a reddit user named stunspot's workflow, I was able to get it to research and code its own modules that actually functioned (it just couldn't execute them by itself; they had to be run manually). You can see my thread on it here: https://www.reddit.com/r/AutoGPT/comments/12ohmmo/uhhhits_a_little_terrifying_how_easy_it_was_to/

Now? It does nothing but loop endlessly on research. I just gave it 10 prompts (y -10), right? The given task was to make a disposable vape vending machine business plan. Not only did it just go to 5 webpages and find nothing of interest...but it then went to those same webpages AGAIN, completely wasting a bunch of tokens. It routinely does this now for some reason, yet the 0.2.2 version with a modified file_operations.py never did that. It had its bugs, sure, such as its inability to execute python as I stated earlier. But in terms of having an efficient workflow + actually getting stuff done, it's no contest. Here's hoping they course correct for the next release, because until then I'll be sticking with 0.2.2.

I'm using GPT-4, btw. --gpt4only

ozzymuppet commented 1 year ago

Agreed. All it seems to do is endlessly generate plans for doing various abstract stuff, 90% of which is unrealistic at present. It's like a government, but worse!

Someone leading this project needs to get 5 of their best developers who are dedicated to the project, and get them focused on some basic use-cases which are practical, useful, and not too abstract, such as...

Here's 3 off the top of my head...

  1. Creating a basic website based on a supplied topic that could be uploaded via FTP and just work. Simple HTML/CSS/JS with a form, a topic, some generated images, that's all.
  2. Taking a topic (from a human, some keywords, sample data), then having it go out and research it, scrape some data (e.g. eBay product titles, descriptions, meta), interpret some data (e.g. articles, Wikipedia), then generate an output with some columns in CSV format.
  3. Generating a novel with 10 chapters, first establishing beginning, middle, end, then fleshing it out, generating some images, etc.


Mitman93 commented 1 year ago

I believe I misspoke on the version #, btw. It was 0.2.0 I had tremendous success with, not 0.2.2. Whatever the version was before they introduced DevTools for searching instead of just using the "browse_website" command. I just confirmed it. Check this out.

This is the worthless output I got from 0.3.0:

Using Browser: chrome
THOUGHTS: I need to read the specs.txt and advice.txt files to start creating the disposable vape business.
REASONING: Reading the files will provide me with the necessary information and guidelines to follow.
PLAN:

DevTools listening on ws://127.0.0.1:61724/devtools/browser/443d8fc4-ecec-4d8e-ae96-1d2d8430a5c5
Text length: 28918 characters
Adding chunk 1 / 2 to memory
Summarizing chunk 1 / 2 of length 16269 characters, or 2967 tokens
Added chunk 1 summary to memory, of length 763 characters
Adding chunk 2 / 2 to memory
Summarizing chunk 2 / 2 of length 12652 characters, or 2487 tokens
Added chunk 2 summary to memory, of length 421 characters
Summarized 2 chunks.
SYSTEM: Command browse_website returned: ("Answer gathered from website: The text does not specifically mention disposable vape vending machines, their target market, competitors, or market trends. However, it provides an overview of the Vending Machine Operators industry in the US, which primarily sells various goods through automated machines. Key market segments include manufacturing sites, offices, retail sites, schools, colleges, military bases, correctional facilities, hospitals, nursing homes, hotels, and motels. The industry faces competition from other retail channels and is affected by changing consumer preferences, such as increasing health consciousness. Revenue for vending machine operators is expected to fall at a CAGR of 0.1% to $10.1 billion over the five years to 2023, with profit set to grow to 4.3%. \n \n Links: ['\n (https://www.ibisworld.com/)', 'account_circle (https://my.ibisworld.com)', 'CONTACT US (https://www.ibisworld.com/contact-us/)', 'REQUEST DEMO (https://www.ibisworld.com/free-demo/)', 'Log In (https://my.ibisworld.com)']", <selenium.webdriver.chrome.webdriver.WebDriver (session="c6866f43be816fd9ce48700e532ba62d")>)
THOUGHTS: I will search for more specific information on disposable vape vending machines and their target market, competitors, and market trends.
REASONING: The previous search did not provide specific information on disposable vape vending machines. A more focused search may yield better results.
PLAN:

DevTools listening on ws://127.0.0.1:62192/devtools/browser/d90fd7aa-1359-45d4-9f76-066dd4352a65
Text length: 4143 characters
Adding chunk 1 / 1 to memory
Summarizing chunk 1 / 1 of length 4144 characters, or 962 tokens
Added chunk 1 summary to memory, of length 364 characters
Summarized 1 chunks.
SYSTEM: Command browse_website returned: ("Answer gathered from website: The text does not provide information about the target market, competitors, and market trends for disposable vape vending machines. Instead, it discusses various aspects of vaping, such as the best vape products, reviews, and news related to vaping, e-cigarettes, and CBD. It also mentions a few studies and regulations related to vaping in different countries. \n \n Links: ['\n\n (https://www.facebook.com/thevapingpost/)', '\n\n (https://www.linkedin.com/company/vaping-post)', '\n\n (https://fr.pinterest.com/thevapingpost/)', '\n\n (https://twitter.com/vaping_post)', ' (https://www.vapingpost.com/2019/08/29/vape-vending-machines-a-growing-trend/)']", <selenium.webdriver.chrome.webdriver.WebDriver (session="749d33e815aa930ddf30c828bc774511")>)
THOUGHTS: I will try another website to find information on disposable vape vending machines and their target market, competitors, and market trends.
REASONING: The previous searches did not provide specific information on disposable vape vending machines. Trying another website may yield better results.
PLAN:

DevTools listening on ws://127.0.0.1:62519/devtools/browser/19ffd3a4-8d29-445d-b7bc-32b465656364
Text length: 4213 characters
Adding chunk 1 / 1 to memory
Summarizing chunk 1 / 1 of length 4217 characters, or 920 tokens
Added chunk 1 summary to memory, of length 312 characters
Summarized 1 chunks.
SYSTEM: Command browse_website returned: ("Answer gathered from website: The given text does not provide information about the target market, competitors, and market trends for disposable vape vending machines. Instead, it focuses on the products and accessories offered by VaporFi, a vape company, along with some warnings and legal information related to vaping and e-liquids. \n \n Links: ['\nSkip to content (https://www.vaporfi.com/blog/vape-vending-machines/#content)', '\n\n \nFREE SHIPPING ON ORDERS $65+\n (https://www.vaporfi.com/blog/vape-vending-machines/)', '\nSUPPORT\n (https://www.vaporfi.com/customer-support/)', '\n\n \nSTORE LOCATOR\n (https://www.vaporfi.com/vape-shops/)', '\nMY ACCOUNT\n (https://www.vaporfi.com/customer/account/)']", <selenium.webdriver.chrome.webdriver.WebDriver (session="e9ebe4d3938936f9239cc4263391c3b7")>)
THOUGHTS: I will try a different approach to find information on disposable vape vending machines and their target market, competitors, and market trends.
REASONING: The previous searches did not provide specific information on disposable vape vending machines. Trying a different approach may yield better results.
PLAN:


Not going to paste the whole thing because it's pretty long, but yeah. It basically just visits 5 websites, says "The previous searches did not provide specific information on disposable vape vending machines. Trying a different approach may yield better results."...AND THEN IT DOES THE SAME EXACT THING AGAIN. IT EVEN REVISITS THE SAME 5 WEBSITES.

Now, you ready for this? Here's that older version I was talking about. Same exact prompt. Same exact given goals. Same everything except the version:


NEXT ACTION: COMMAND = browse_website ARGUMENTS = {'url': 'https://www.harvin.eu/blog-news/disposabe-vape-vending-machines/', 'question': 'What is the business overview of disposable vape vending machine businesses?'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for DevGPT...
Input:y -10
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
Text length: 11018 characters
Adding chunk 1 / 2 to memory
Summarizing chunk 1 / 2
Added chunk 1 summary to memory
Adding chunk 2 / 2 to memory
Summarizing chunk 2 / 2
Added chunk 2 summary to memory
Summarized 2 chunks.
SYSTEM: Command browse_website returned: Website Content Summary: "Result" : The text provides an overview of disposable vape vending machines, which are gaining popularity as a profitable solution for sales outlets selling smoking articles. Disposable vapes are electronic cigarettes with a limited number of puffs, and they are distinguishable from other types of electronic cigarettes. The Italian company Harvin produces disposable vape vending machines and aims to help customers boost sales and expand their business horizons. The potential for CBD disposable vapes vending machines in Italy and other European countries is currently limited due to legal restrictions. Links: ['Skip to content (#content)', 'info@harvin.eu (mailto:info@harvin.eu)', '+39 030 219 0663 (tel:#)', 'Facebook (https://www.facebook.com/harvindistributoriautomatici)', 'Youtube (https://www.youtube.com/channel/UCnhHyoZnyigjqgKsVayrSTQ)']
DEVGPT THOUGHTS: Now, I will create a draft summary for the business overview section using the information collected.
REASONING: Creating a draft summary will provide a foundation for the disposable vape vending machine business overview.
PLAN:

Again, not going to post the entire thing because it's super long, but let me just say...WOW. It's already successfully researched several areas of the business, drafted preliminary proposals, and is currently looking for reputable suppliers. In other words: IT'S ACTUALLY DOING WHAT I TOLD IT TO DO! IT'S ACTUALLY WORKING!! Woohoo!

Whenever they stopped using the "browse_website" command in favor of that Devtools thing...that appears to be what is causing all the search loop headaches.
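One possible way to avoid the repeated visits described above is to remember which pages have already been summarized. The following is a rough Python sketch under stated assumptions, not anything that exists in Auto-GPT: `BrowseCache` and `fetch` are hypothetical names, and `fetch` stands in for whatever function actually browses and summarizes a page.

```python
from typing import Callable, Dict
from urllib.parse import urldefrag


class BrowseCache:
    """Hypothetical wrapper that returns a stored summary for a URL already visited."""

    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch              # assumed: browses the URL and returns a summary
        self._seen: Dict[str, str] = {}  # normalized URL -> summary
        self.hits = 0                    # number of repeat visits avoided

    def browse(self, url: str) -> str:
        # Drop the "#fragment" and any trailing slash so trivially different
        # links to the same page share one cache entry.
        key = urldefrag(url).url.rstrip("/")
        if key in self._seen:
            self.hits += 1
            return self._seen[key]
        summary = self._fetch(url)
        self._seen[key] = summary
        return summary
```

Under this scheme, the two VaporFi links from the log above (one ending in `#content`, one without) would count as the same page, so the second request would cost no extra tokens. A real version would probably also key the cache on the question being asked, which this sketch ignores.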

Boostrix commented 1 year ago

Whenever they stopped using the "browse_website" command in favor of that Devtools thing...that appears to be what is causing all the search loop headaches.

If you can, you should consider sharing some of the settings that work/don't work for you for debugging purposes - if you are concerned about sharing "secrets", you can obviously change the topic/keywords of your research.

I am primarily interested in the underlying workflow, to see what can be improved to get it to stop looping unnecessarily, and to work out a scheme for tasking/workspace recovery (continuation) that works.
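To make the recovery/continuation idea concrete, here is a minimal Python sketch that checkpoints task state to a JSON file so a later run can pick up where the last one stopped. It only illustrates the concept: the function names, the file layout, and the state keys (`goals`, `completed_steps`, `visited_urls`) are all assumptions, not an existing Auto-GPT format.

```python
import json
from pathlib import Path


def save_checkpoint(path: Path, state: dict) -> None:
    """Write the state to a temporary file first, then swap it into place."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    # Replacing the file in one step avoids leaving a half-written checkpoint.
    tmp.replace(path)


def load_checkpoint(path: Path) -> dict:
    """Return the saved state, or an empty starting state if none exists."""
    if path.exists():
        return json.loads(path.read_text())
    # Hypothetical state layout, chosen only for this example.
    return {"goals": [], "completed_steps": [], "visited_urls": []}
```

A run would call `load_checkpoint` at startup, skip anything already listed in `completed_steps`, and call `save_checkpoint` after each command finishes.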

github-actions[bot] commented 1 year ago

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] commented 1 year ago

This issue was closed automatically because it has been stale for 10 days with no activity.