Significant-Gravitas / AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
https://agpt.co
MIT License
163.64k stars 43.45k forks

Reader mode access for AutoGPT #2181

Closed Ashwin-droid closed 9 months ago

Ashwin-droid commented 1 year ago

Duplicates

Summary 💡

AutoGPT really struggles with data scraping. I can confirm this because I gave it a simple task: StorytellerGPT was meant to give me a single real-life inspirational story that I could record and upload to my podcast, and one of its goals was to fact-check the story. While it was scraping, it got some nonsensical code back and decided to write a Python function to validate the story. Knowing that this was ridiculous, I stopped the execution, as it had been running for quite a long time. So I wanted to ask whether there is a plan to add support for Chrome's reader mode to help with scraping data. I was using MilvusMemory in a Docker container as the vector database. I have attached the output in the examples section.

Examples 🌈

activity.txt

Motivation 🔦

My motivation for requesting this feature is to improve the data scraping process of AutoGPT, which is a tool that I use for various tasks. I have encountered problems when AutoGPT tries to scrape data from web pages that are not reader-friendly, such as pages with ads and distractions or incorrectly coded websites. This affects the quality and accuracy of the results that AutoGPT produces, and sometimes it even causes the tool to malfunction and write irrelevant code.

For example, I recently asked AutoGPT, configured as StorytellerGPT, to generate a story and fact-check it before giving it to me. However, while it was scraping data from a web page, it encountered an improperly coded site. Instead of ignoring it, AutoGPT decided to write a Python function to validate the story, which was obviously ridiculous and unnecessary. I had to stop the execution as it was running for too long and wasting resources. I have attached the output in the examples section.

This is why I would like to ask if there is a plan to add support for reader mode in Chrome to help with scraping data. Reader mode is a feature that removes clutter, ads and distractions from web pages and presents them in a simple and clean format. This would make it easier for AutoGPT to extract relevant information from web pages and generate better results. It would also save time and resources by avoiding unnecessary data and code.
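To illustrate the kind of cleanup I mean, here is a minimal sketch using only the Python standard library. It is not how Chrome's reader mode works and not part of AutoGPT; the names `ReaderModeParser` and `extract_readable_text`, and the lists of tags to skip or keep, are my own assumptions for illustration.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as clutter in this sketch (an assumption,
# not a list taken from Chrome's reader mode or from AutoGPT).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form", "noscript"}
# Tags that start or end a block of readable text.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote", "pre"}


class ReaderModeParser(HTMLParser):
    """Hypothetical helper: collects text from block tags, skipping clutter."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside any SKIP_TAGS element
        self._chunks = []      # finished text blocks
        self._current = []     # text pieces of the block being read

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self._skip_depth > 0:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS and self._skip_depth == 0:
            self._flush()

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._current.append(data)

    def _flush(self):
        # Collapse runs of whitespace and store the block if it is non-empty.
        text = " ".join("".join(self._current).split())
        if text:
            self._chunks.append(text)
        self._current = []

    def text(self):
        self._flush()
        return "\n\n".join(self._chunks)


def extract_readable_text(html):
    """Return the readable text of an HTML string, one block per paragraph."""
    parser = ReaderModeParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Feeding the scraped HTML through something like `extract_readable_text(page_html)` before handing it to the model would drop scripts, navigation and footers, so only headings and paragraphs are left to process. A real reader mode uses much smarter heuristics than a fixed tag list.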

I think this feature would be useful for many other users who use AutoGPT or similar tools for content creation or research purposes. It would enhance the performance and reliability of these tools and provide a better user experience. I hope you will consider adding this feature in the future.

Boostrix commented 1 year ago

What's probably needed is redoing browse_website as a plugin to address a number of related RFEs. This could then include supporting alternate search engines/APIs, but also actual browsing using pagination, and downloading to an offline cache for further processing, using standard tools like wget/curl (or rather the corresponding Python bindings). At that point, you'd be close enough to a crawler/spider to still chain things together: https://github.com/Significant-Gravitas/Auto-GPT/issues/503#issuecomment-1534094916
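A rough sketch of the offline-cache part, assuming plain `urllib` from the standard library in place of wget/curl bindings. `cached_fetch` and `default_fetch` are hypothetical names, not existing AutoGPT functions, and the cache layout (one file per URL, named by its SHA-256 hash) is just one possible choice.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def default_fetch(url):
    """Download a URL and return the raw response body as bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def cached_fetch(url, cache_dir, fetch=default_fetch):
    """Return the body for `url`, downloading it only on the first request.

    `fetch` can be swapped out, e.g. for a different downloader or for tests.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any URL maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.html"
    if path.exists():
        return path.read_bytes()
    body = fetch(url)
    path.write_bytes(body)
    return body
```

With something like this, later steps (cleanup, summarizing, chaining into a crawler) can work from the cached copy instead of hitting the site again. There is no expiry or error handling here; a plugin would need both.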

github-actions[bot] commented 10 months ago

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] commented 9 months ago

This issue was closed automatically because it has been stale for 10 days with no activity.

Boostrix commented 9 months ago

FWIW: this is one of those issues where I strongly disagree with closing it. The request is perfectly plausible, and it could help us make better use of the context window by having the input provided in a better/different format.