Closed ImanSharaf closed 1 month ago
Well, the topic is on a rise, I agree.
But how do you actually verify that there is no prompt injection? Is the requirement actionable or is it more on the "make it all secure" side?
If an attacker is able to manipulate an AI system into unauthorized data access or other malicious activities by directly interacting with the raw AI prompt, this is known as prompt injection. Mitigating this risk involves multiple layers of security controls:
Strong Input Validation: Implement rigorous validation mechanisms, potentially leveraging another AI model trained specifically for this purpose, to scrutinize incoming prompts.
Model Hardening: Enhance the resilience of the AI model to withstand malicious or malformed inputs, possibly through techniques like adversarial training.
Attribute-Based Access Control (ABAC): Employ fine-grained access controls based on attributes like user role, data sensitivity, and context to restrict who can interact with the AI system and how.
Parameterized AI Calls: Similar to using prepared statements in SQL to prevent SQL injection, use parameterized calls to the AI model to segregate user input from the command logic, thereby reducing the risk of injection attacks.
My initial thought is do we want a specific LLM section in the V5 Validation, Sanitization and Encoding chapter?
Or do LLMs need their own *SVS?
I bet @danielcuthbert has opinions about this :)
Given our objectives, if we're primarily focusing on applications that integrate LLMs (especially when the LLM engine interacts with sensitive data like backend databases), it would be fitting to embed LLM-related guidelines within the ASVS framework. This approach provides a holistic view of application security, encompassing all components including LLMs.
However, if our emphasis shifts to the security and intricacies of the LLM itself, independent of its application context, it's advisable to create a dedicated *SVS. This would allow us to delve deeper into the unique challenges and security nuances of LLMs as standalone entities.
Thus, our decision should align with our primary objective and the depth of scrutiny we wish to apply to LLMs.
I definitely see there being benefit in keeping in mind applications that integrate LLMs as opposed to the LLM itself. We've been getting a fair number of question lately on the security of applications using LLMs, and I see that becoming even more common.
Or do LLMs need their own *SVS?
I concur and was hoping this could be achieved (Slack context here), but I'd explicitly label this as "LLM Applications *SVS
" which i'd love the ability to collaborate on this.
My reasoning is that security standards on building, monitoring and securing LLM's may overlap with LLM Applications but is ultimately at a high-level completely different. Whilst their vulnerabilities related to both scenario's, their mitigations and remediations are mainly unique. An LLM Application can encompass all of the factors with a general web application that sits as an overlay wrapper. I.E, also assuming the LLM is not well versed, your essentially adding an entity layer which can make decisions, deploy functions or code and take other actions within your environment.
One example why I think these should be split into two different frameworks: (which I mentioned in BugCrowd LLM AppSec VRT thread, I requested from them for NLP or LLM-based programs)
There are already vulnerability DB's, frameworks and|or security best practices for MLSecOps, not related to LLMAppSec, I.E:
https://avidml.org/ https://huntr.mlsecops.com/
I'd say that MITRE ATLAS is the closest to what is required from the community, it does not "show an ML developer how to build a secure LLM application" as compared to a web application developer would use the classic ASVS. Within the OWASP Top 10 for Large Language Model Applications, one thing I am working on for v2 is mapping CWE's and CVE's along frameworks within an LLM Application environment (WIP GitHub issue here).
I think a new AI section is fundamental for ASVS 5.
Suggestions:
Verify Secure Data Handling in AI Models
Verify Robustness of AI Models Against Adversarial Attacks
Verify Transparency and Explainability of AI Decisions
Verify Compliance with Ethical Guidelines and Regulations
Verify the Integrity of AI Model Training and Updates
I'll be honest, I don't know how relevant the suggestions proposed by @jmanico are to the ASVS - they seem like they would be best appropriate for an AI-specific SVS. I think our goal should be addressing LLM integration, not a model itself. I absolutely agree that there should be some AI *SVS, but adding it as a section to the ASVS does not feel like the correct answer.
That being said, some feedback on the suggested requirements: primarily, we should be sure that controls actually exist for the requirements put forth. It would be disingenuous to have a security requirement that can't actually be met. I say this because I know it can be incredibly difficult to protect a model from adversarial attacks (from my understanding, this requires a defense in depth approach), but I'm also not an expert here. I think it would also be appropriate to add a requirement to protect against extraction attacks to copy the AI model in use, but that may be covered by the "detect and mitigate adversarial attacks" requirement.
Yeah I think that especially based on what @GangGreenTemperTatum said, I am not convinced that a full section on LLMs is warranted for ASVS. On the other hand, maybe we could include some guidance in an appendix which I know we sort of did in the past for IoT?
@jmanico what are your thoughts on an appendix? Do you have more items you would add? I would suggest that we change the items in your original comment above to be more like considerations rather than verification requirements as I agree with @EnigmaRosa that we need to be careful about whether practical solutions exist to some of these considerations :)
An appendix works for me 🤙
An appendix works for me too.
Okay so we now have a place to put appendix content with an intro that I based on what @ImanSharaf wrote so who wants to add content :)
https://github.com/OWASP/ASVS/blob/master/5.0/en/0x98-Appendix-W_LLM_Security.md
Thank you, I will do that.
I would also add a requirement around natural language validation is fairly essential. I see folks using a smaller AI engine (with no user data) to do natural language validation.
I would also add a requirement around natural language validation is fairly essential. I see folks using a smaller AI engine (with no user data) to do natural language validation.
@jmanico can you open a PR in the new appendix for this?
I added a few other AI Security requirements in the PR attached to this issue.
I have made this non-blocking. Jim has PR'd in some content and @ImanSharaf it would be great to get some extra content from you as well.
@tghosth Should we talk about this package hallucination attack in the appendix too?
Also, what do you think about this check Ensure that any personal data processed by the LLM is anonymized or pseudonymized to protect user privacy.
@tghosth Should we talk about this package hallucination attack in the appendix too?
Maybe in output filtering?
Also, what do you think about this check
Ensure that any personal data processed by the LLM is anonymized or pseudonymized to protect user privacy.
I think we can suggest that too
@ImanSharaf, if you help me craft requirements for these two issues, I'm happy to add them to the work done here. https://github.com/OWASP/ASVS/blob/master/5.0/en/0x98-Appendix-W_LLM_Security.md
@ImanSharaf, if you help me craft requirements for these two issues, I'm happy to add them to the work done here. https://github.com/OWASP/ASVS/blob/master/5.0/en/0x98-Appendix-W_LLM_Security.md
For the hallucination attack, the target is typically the developer. It's crucial for developers to approach results from Large Language Models (LLMs) with skepticism, as these models can inadvertently generate vulnerable code or recommend non-existent packages. Attackers might exploit these suggestions by creating malicious packages mirroring the hallucinated results. Developers should always verify and validate LLM outputs before implementation to mitigate these risks. I don't know where we should put this check. @jmanico @tghosth
If the check is AI specific then I suggest we just add it to a new section on secure code generation in the AI appendix
Does it work? The organization must ensure that all outputs from Large Language Models (LLMs) used during the software development process, including but not limited to code suggestions, package recommendations, and configuration snippets, are subject to verification and validation by developers before implementation.
How about:
Verify that all outputs from Large Language Models (LLMs) used during the software development process, including code suggestions, package recommendations, and configuration snippets, are subject to verification and validation by developers before implementation.
Hey @tghosth do you like this? If so I'll go PR.
@jmanico sure go for it, which section in the appendix?
W.2 Output Filtering for now. I may move this to a AI Code Generation section later! PR Submitted!
How about:
Verify that all outputs from Large Language Models (LLMs) used during the software development process, including code suggestions, package recommendations, and configuration snippets, are subject to verification and validation by developers before implementation.
Hey @tghosth do you like this? If so I'll go PR.
Looks good!
W.2 Output Filtering for now. I may move this to a AI Code Generation section later! PR Submitted!
Merged!
Assuming the llm-verification-standard isn't dead... Should we not at least add a reference to it in W?
Great point @mgargiullo! I opened #2149 and I think we will close this issue after that
(Ed note, original issue title was: Prevention of Prompt Injection in Applications Using Large Language Models (LLM))
The popularity of Large Language Models (LLM) like GPT variants from OpenAI has been on a rise, giving applications the power to generate human-like text based on prompts. However, this has opened up potential vulnerabilities. Malicious users can manipulate these LLMs through craftily structured inputs, causing unintended actions by the LLM. These manipulations can be direct, where the system prompts are overwritten, or indirect, where inputs are manipulated from external sources.
I believe that applications that are using LLMs must validate and sanitize all input prompts to prevent any malicious or unintended manipulations.