Closed kfsone closed 1 month ago
@kfsone, as you say, there are risks. What's your recommendation for mitigating them?
I agree that relying on AutoGen to write and run code to fetch URLs, parse pages, and do web searches is suboptimal -- not necessarily for security reasons, but mainly because it increases the risk of failure. Simply stated: more things need to go right for the task to succeed.
I have been thinking about creating a stateful WebSurferAgent that can conduct searches and read pages, similar to https://openai.com/research/webgpt, but I've been too busy on evaluation to get to it yet. I would be very pleased if someone else took this on.
Not specifically on the topic of safety, but related: I had to build precautions into the assistant to handle basic cases like avoiding a site that can't be scraped, or one that should be avoided because of low content quality. I'm also mitigating this with additional logic in the search function the assistant uses. On the security side, perhaps some integration with something like Open Threat Exchange or another source of threat intelligence would help.
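To make the idea concrete, here is a minimal sketch of the kind of filtering logic described above, applied to search results before the assistant sees them. The blocklist, the result-dict shape, and the snippet-length heuristic are all illustrative assumptions, not code from the actual assistant.

```python
# Hypothetical sketch: filter search results before handing them to the agent.
# BLOCKED_DOMAINS and the snippet-length check are illustrative placeholders;
# a real deployment might consult a threat-intelligence feed such as OTX here.

from urllib.parse import urlparse

BLOCKED_DOMAINS = {"example-spam-farm.com", "cannot-be-scraped.example"}

def filter_results(results: list[dict]) -> list[dict]:
    """Drop results whose domain is blocked or whose snippet looks too thin."""
    kept = []
    for r in results:
        domain = urlparse(r["url"]).netloc.lower()
        if domain in BLOCKED_DOMAINS:
            continue  # known-bad or unscrapable site
        if len(r.get("snippet", "")) < 20:
            continue  # crude low-content-quality heuristic
        kept.append(r)
    return kept
```

The same check could be run again at fetch time, since search results and followed links don't always go through the same code path.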
@rickyloynd-microsoft Obviously it's a long hose that never runs dry, but I think the most likely source of risk at the moment is unskilled/inexperienced people -- say, YouTubers, or people in (or looking to be in) government -- coming along, firing up AutoGen on a machine, and asking it to do something that requires web access. They'll likely use an LM or something and grab a model or three from Hugging Face.
They'll land on https://github.com/microsoft/autogen/blob/main/notebook/agentchat_web_info.ipynb and perhaps not understand that what it's doing is building its own web-access mechanism from code the models generate. That makes which model you use really significant, and I don't imagine many people -- even your actual target audience for AutoGen -- will intuit that.
What I'm suggesting here is really just Stage 0, adding an agent/feature/capability to autogen that provides the web-retrieval capability similar to the way you have an executor capability, and the notebook linked would be the primer for how to use that.
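As a rough illustration of that "Stage 0" idea: instead of the model generating its own HTTP code, the framework would expose a fixed, audited fetch function as a tool the agent calls. The function names and the registration calls below are a sketch only -- check the AutoGen docs for the exact tool-registration API in your version.

```python
# Sketch of a fixed web-retrieval tool, analogous to the executor capability:
# the agent invokes fetch_url by name rather than writing its own HTTP code.

import urllib.request

MAX_CHARS = 8_000  # keep responses small enough for the model's context window

def truncate(text: str, limit: int = MAX_CHARS) -> str:
    """Cap tool output so a huge page can't blow out the conversation."""
    return text if len(text) <= limit else text[:limit] + "\n[truncated]"

def fetch_url(url: str, timeout: float = 10.0) -> str:
    """Fetch a page (HTTPS only) and return its truncated body as text."""
    if not url.startswith("https://"):
        return "ERROR: only https URLs are allowed"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read(2 * MAX_CHARS).decode("utf-8", errors="replace")
    return truncate(body)

# Hypothetical registration with agents (exact API names may differ by version):
# assistant.register_for_llm(name="fetch_url", description="Fetch a web page")(fetch_url)
# user_proxy.register_for_execution(name="fetch_url")(fetch_url)
```

The point is that the fetch path is fixed and reviewable, so the choice of model no longer determines whether web access is implemented safely.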
I've had to learn some tough lessons about accidental and intentional user abuse over the decades(*), so I've developed a reasonable spidey-sense for "this stupid tiny side branch is going to explode and break the project", and letting-the-LLM-write-its-own-web-access-capability feels like a setup for one of those.
(* a small sampling: https://www.deafblind.com/dbtechies4.html 'web by mail', oh how naive I was; https://web.archive.org/web/19990208003211/http://about.warbirds.org/ allow people to create mail aliases and mailing lists for free, how could that go wrong, although it never actually got exploited while I was running it; https://web.archive.org/web/19980703072207/www.kfs.org/tools.html web accessible dns/ping/traceroute tools via cgi?)
@kfsone I'll make sure to add you as a reviewer if I see a PR for it. At the very least, we should remove "Web Search" from the example's title -- the example doesn't really perform a web search.
Is this something we still want to move forward with? It will absolutely increase the risk of failure, since it could break in so many ways. That said, agents need a way to search the internet. If we agree on what an MVP would be, I could look into building a first version.
Reliable web search is implemented in the current WebSurfer agent, and greatly expanded in the update I am preparing in #1929: https://github.com/microsoft/autogen/blob/headless_web_surfer/autogen/browser_utils/markdown_search.py
This uses a BING_API_KEY when available, or else falls back to scraping. https://github.com/microsoft/autogen/blob/852ee3375bca61fc1d0c004060439d0b4a906aad/autogen/browser_utils/mdconvert.py#L363-L426
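The key-or-scrape fallback described above might look roughly like the sketch below. The Bing endpoint and header follow the public Web Search v7 API; the result-dict shape and the scraping placeholder are illustrative assumptions, not the code from the linked files.

```python
# Sketch of the fallback pattern: use the Bing Web Search API when
# BING_API_KEY is set in the environment, otherwise fall back to scraping.

import os

def web_search(query: str) -> list[dict]:
    api_key = os.environ.get("BING_API_KEY")
    if api_key:
        return _bing_search(query, api_key)
    return _scrape_search(query)  # best-effort fallback; more fragile

def _bing_search(query: str, api_key: str) -> list[dict]:
    import requests  # third-party dependency, only needed on this path
    resp = requests.get(
        "https://api.bing.microsoft.com/v7.0/search",
        headers={"Ocp-Apim-Subscription-Key": api_key},
        params={"q": query},
        timeout=10,
    )
    resp.raise_for_status()
    pages = resp.json().get("webPages", {}).get("value", [])
    return [
        {"title": p["name"], "url": p["url"], "snippet": p.get("snippet", "")}
        for p in pages
    ]

def _scrape_search(query: str) -> list[dict]:
    # Placeholder: a real fallback parses the HTML of a search results page.
    raise NotImplementedError("no BING_API_KEY set and no scraper configured")
```

The API path is much more robust than scraping, which is presumably why it takes priority when a key is available.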
@afourney nice work. This looks fun to use! Why did you choose to use markdown?
The first WebSurfer uses Markdown, so that's why the search integration uses Markdown.
But then the question is: why use Markdown for WebSurfer in the first place? Well: it preserves the document's structure -- headings, links, lists -- in a compact textual form that LLMs handle well.
@afourney yes, that makes sense. I recently worked on a system that converted docs to Markdown and sent them to an LLM. I switched it to plain text instead of Markdown and noticed the performance went down because the LLM lost meaning, so I converted it back to Markdown!
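A toy example, using only the standard library, of the difference being described: the Markdown path keeps headings and link targets that a plain-text dump throws away. This handles only `<h1>` and `<a>` and is purely illustrative; real converters like the mdconvert module linked above cover far more.

```python
# Minimal HTML-to-Markdown sketch showing what structure survives conversion.

from html.parser import HTMLParser

class TinyMarkdown(HTMLParser):
    """Convert <h1> and <a href=...> to Markdown; pass other text through."""

    def __init__(self):
        super().__init__()
        self.out = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.out.append("# ")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.out.append("\n\n")
        elif tag == "a":
            self.out.append(f"]({self._href})")
            self._href = None

    def handle_data(self, data):
        self.out.append(data)

def to_markdown(html: str) -> str:
    parser = TinyMarkdown()
    parser.feed(html)
    return "".join(parser.out)

html = '<h1>Pricing</h1>See <a href="https://example.com/plans">plans</a>.'
md = to_markdown(html)
# md == "# Pricing\n\nSee [plans](https://example.com/plans)."
```

A plain-text conversion of the same snippet would yield just "Pricing See plans." -- the heading level and the link target are gone, which is exactly the meaning the LLM was losing.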
While most LMs will allow AutoGen to self-implement something to fetch URLs, that's absolutely not guaranteed, and certainly not guaranteed to be safe. Looking at the patterns and rate of growth of models on HF, there are probably already nefariously manipulated models in use.
Given that code execution is involved, exploits are inevitable, but having a web-capable baseline agent/facility might help delay them and reduce risk exposure.