OWASP / www-project-top-10-for-large-language-model-applications

OWASP Foundation Web Repository

Missing app layers #117

Open vishwasmanral opened 11 months ago

vishwasmanral commented 11 months ago

One of the key ways LLMs and generative AI are used is through chaining and agents. Agents introduce one of the biggest risks for applications built on LLMs and generative AI. Users grant agents access to their machines to perform tasks, and agents that then act autonomously can create security holes.

rot169 commented 11 months ago

Agreed @vishwasmanral! :-) LLM08 (Excessive Agency) talks to this, although please do suggest some further specific enhancements if you feel like any key points are missing.

Bobsimonoff commented 10 months ago

I agree agents should be covered, and I think Excessive Agency makes sense.

I also think Overreliance may make sense, since agents may use and depend on outputs from the LLM to decide on next steps. Overreliance fits because those outputs cannot be counted on to be accurate.
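
To illustrate the Overreliance point, here is a minimal sketch of an agent loop validating an LLM-proposed next step instead of acting on it blindly. The JSON shape, the action names in `KNOWN_ACTIONS`, and the function `parse_next_step` are all hypothetical and not taken from any agent framework.

```python
import json

# Hypothetical set of actions this agent is allowed to take.
KNOWN_ACTIONS = {"search", "summarize", "stop"}


def parse_next_step(llm_output: str) -> dict:
    """Return a validated step, or raise ValueError rather than trust the output."""
    try:
        step = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError("LLM output is not valid JSON") from exc
    if not isinstance(step, dict):
        raise ValueError("LLM output must be a JSON object")

    action = step.get("action")
    # Reject anything that is not a known action name.
    if not isinstance(action, str) or action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    argument = step.get("argument", "")
    if not isinstance(argument, str):
        raise ValueError("argument must be a string")

    # Return only the fields that were checked; extra keys are dropped.
    return {"action": action, "argument": argument}
```

Structural checks like this do not make the output accurate, but they stop malformed or unexpected steps from being executed.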

vishwasmanral commented 10 months ago

One issue with agents is that users are deploying them with admin permissions. Agents can now create and delete code on the machine itself, which makes things very interesting. Agents can also send data to APIs that may be outside existing controls.

For now there is a human in the loop in most cases, so things are not as bad. :)
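
As a rough sketch of the opposite of admin permissions, the snippet below shows a deny-by-default policy that an agent's tool calls and outbound requests could be checked against. `AgentPolicy`, the tool names, and the host `api.internal.example` are illustrative assumptions, not part of any real agent framework.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical deny-by-default policy for an agent's actions."""

    allowed_tools: frozenset = field(default_factory=frozenset)
    allowed_hosts: frozenset = field(default_factory=frozenset)

    def permits_tool(self, tool_name: str) -> bool:
        # Any tool that is not explicitly listed is refused.
        return tool_name in self.allowed_tools

    def permits_url(self, url: str) -> bool:
        # Outbound calls are allowed only to explicitly approved hosts.
        host = urlparse(url).hostname
        return host is not None and host in self.allowed_hosts


# Example policy: read-only tools and a single approved API host (both assumed).
policy = AgentPolicy(
    allowed_tools=frozenset({"read_file", "search_docs"}),
    allowed_hosts=frozenset({"api.internal.example"}),
)
```

With this policy, `policy.permits_tool("delete_file")` and `policy.permits_url("https://evil.example/upload")` are both refused because neither was listed.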

-Vishwas

rot169 commented 10 months ago

@vishwasmanral do you have any documented examples of agents with those kinds of excessive permissions? I'd love to be able to bring LLM08 to life with some real-world examples.

vishwasmanral commented 10 months ago

Hi Andy,

There are many. With the code interpreter agent by ChatGPT released a few days back, we have run commands; it does ask for permission, but it doesn't do what it says it does (so it runs system commands, etc.).

Running the agent on a local machine currently, it does ask for permission before it runs code or runs pip install on the machine, but from what we have seen (whether for malicious or other reasons), the issues can be bigger.

https://www.lesswrong.com/posts/KSroBnxCHodGmPPJ8/jailbreaking-gpt-4-s-code-interpreter describes the same issues, analyzed through a plugin.

Maybe try your hand at it and see how it works. It's fairly simple: run interpreter on a machine (./interpreter) and ask it to do some tasks. It will ask if it can install packages, or Python itself if you do not have it installed, show you the commands and scripts it will be running, and then take things from there.
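
The ask-before-running behaviour described above can be sketched roughly as follows. This is not how open-interpreter implements it; `run_with_approval` and `ask_on_terminal` are hypothetical helpers that only show the general pattern of displaying the exact command and executing it only after approval.

```python
import shlex
import subprocess
from typing import Callable, Optional


def run_with_approval(
    command: str,
    approve: Callable[[str], bool],
    timeout: float = 30.0,
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if `approve` returns True for the exact argv shown."""
    argv = shlex.split(command)
    if not argv:
        return None
    shown = shlex.join(argv)
    if not approve(shown):
        return None
    # No shell is used, so the approved argv is what runs, with no expansion.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )


def ask_on_terminal(shown: str) -> bool:
    """Interactive approver: anything other than 'y' counts as a refusal."""
    answer = input(f"Agent wants to run: {shown}\nAllow? [y/N] ")
    return answer.strip().lower() == "y"
```

A caller would pass `approve=ask_on_terminal` for interactive use. The gate only helps if the command shown is the command executed, which is the gap the comment above points at.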

-Vishwas

vishwasmanral commented 10 months ago

Fork https://github.com/KillianLucas/open-interpreter and run it. This is the agent code.

-Vishwas

vishwasmanral commented 9 months ago

Traditional issues (RCE) against agents and other middleware frameworks:

arxiv.org/abs/2309.02926 https://t.co/n38joBiu6p

"We discovered 13 vulnerabilities in 6 frameworks, including 12 RCE vulnerabilities and 1 arbitrary file read/write vulnerability. 11 of them are confirmed by the framework developers, resulting in the assignment of 7 CVE IDs."

"We amplify the attack impact beyond achieving RCE by allowing attackers to exploit other app users (e.g. app responses hijacking, user API key leakage) without direct interaction between the attacker and the victim"

-Vishwas
