OWASP / www-project-top-10-for-large-language-model-applications


Enhancement Suggestion: Add RAG to the main diagram #240

Open · jsotiro opened this issue 8 months ago

jsotiro commented 8 months ago

Retrieval-augmented generation (RAG) is a technique to enrich LLMs with your own data. It has become very popular because it lowers the barrier to entry for enriching input in LLM apps, allows for better access controls than fine-tuning, and is known to reduce hallucination (see https://www.securityweek.com/vector-embeddings-antidote-to-psychotic-llms-and-a-cure-for-alert-fatigue/); see also the excellent Samsung paper on enterprise use of GenAI and the role of RAG.
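For anyone less familiar with the pattern, here is a minimal conceptual sketch of the RAG flow (the `embed`, `vector_store`, and `llm_complete` names are illustrative placeholders, not any particular SDK):

```python
# Minimal RAG flow: retrieve relevant chunks from your own data,
# then prepend them to the prompt before calling the LLM.
# `embed`, `vector_store`, and `llm_complete` are placeholders, not a real SDK.

def answer_with_rag(question, vector_store, embed, llm_complete, k=3):
    # 1. Embed the user question into the same vector space as the documents.
    query_vector = embed(question)

    # 2. Retrieval: fetch the k most similar document chunks.
    chunks = vector_store.search(query_vector, top_k=k)

    # 3. Augmentation: add the retrieved context to the prompt.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 4. Generation: call the model with the augmented prompt.
    return llm_complete(prompt)
```

The retrieval and augmentation steps are where the new attack surface sits: the vector store, the embedding pipeline, and the retrieved content all feed directly into the prompt.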

RAG creates its own security risks and adds to the attack surface, yet the diagram only includes fine-tuning. We should explicitly add RAG to our diagram and annotate it with the related LLM Top 10 items. Some useful links on architectural approaches:

  Azure: https://github.com/Azure/GPT-RAG
  AWS SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html
  AWS Bedrock RAG workshop: https://github.com/aws-samples/amazon-bedrock-rag-workshop

Security concerns:

  Security of AI Embeddings explained
  Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
  Embedding Layer: AI (Brace For These Hidden GPT Dangers)

GangGreenTemperTatum commented 8 months ago

Thanks @jsotiro ! Confirming receipt and self-assigned. I also had a similar vision and will review / work with you to take this on. Appreciate the input and feedback too!!

GangGreenTemperTatum commented 8 months ago

Confirming I have access to the source of the v1.1 diagram artifact and will progressively work on this as well as other elements for improvement.

ianand commented 8 months ago

https://openreview.net/pdf?id=wK7wUdiM5g0 is another paper that might be worth citing in this area

pkethana commented 8 months ago

+1 on adding RAG; connecting sensitive data stores to LLM models is becoming a very popular pattern for building AI applications.

pkethana commented 8 months ago

Will be happy to provide feedback or participate in the diagram creation.

GangGreenTemperTatum commented 8 months ago

Hey @Bobsimonoff ,

Super gross formatting, but this gets down my idea of what we discussed, and I hope it can help with some conceptualization.

[two attached images: draft diagram with purple annotation boxes]

Bobsimonoff commented 8 months ago

@GangGreenTemperTatum In the above pictures, I see a number of the Top 10 risk annotations have been removed: all risks attached to the arrows between LLM Production Services and plugins/extensions (Sensitive Information Disclosure, Insecure Output Handling, Overreliance, Insecure Plugin Design, Prompt Injection (indirect), Model Denial of Service) and the risks associated with the LLM Model (Model Theft and Training Data Poisoning).

Were these removals deliberate, or was it more just to get your thoughts down so you could include the purple text boxes?

GangGreenTemperTatum commented 8 months ago

Sorry @Bobsimonoff, pretty much ignore everything but the purple boxes; I am not requesting those edits to the LLM annotations, etc.

The reason for the cuts was to make a simplified abstract of the main image for simple, high-level threat modelling and to label some obvious trust boundaries.

pkethana commented 7 months ago

@GangGreenTemperTatum How should I read these purple areas and "Downstream Services"? Are the downstream services applicable to the left purple area as well? RAG is heavily used with application services, so I would like to see these downstream services connected to Application Services as well. @Bobsimonoff

Bobsimonoff commented 7 months ago

Here is a thought for the updated diagram. I have added a few more trust boundaries. I tried to remain true to not naming products/components/libraries by name so as to avoid favoritism. We could probably debate the trust boundaries, but here is something at least for discussion.

[attachment: OWASP Top 10 for LLM Applications - Presentation]

pkethana commented 7 months ago

@GangGreenTemperTatum @Bobsimonoff Any thoughts on the applicability of RAG to Application Services? I believe RAG is typically used in Application Services. In fact, Sam Altman claimed that plugins did not see product-market fit beyond browsing. That might change with the introduction of agents.

Bobsimonoff commented 7 months ago

I think many companies, @GangGreenTemperTatum, are also driving their RAG off of LangChain and similar capabilities, which would be more in the Automation box. Also, that quote was from June; there are now literally hundreds of paid plugins/add-ons, so I am not sure it has aged well.

pkethana commented 7 months ago

Here are some examples of applications that do RAG as part of pre-processing:

  1. Microsoft Copilot
  2. Salesforce Einstein GPT

I have also attached screenshots from these links for your reference (Einstein, Microsoft Copilot).

Regarding plugins, there are multiple threads that indicate Custom GPTs/agent APIs are replacing plugins. Here are some examples:

  1. https://community.openai.com/t/have-plugins-been-replaced-completely/475694
  2. In this podcast summarizing OpenAI Dev Day (https://api.substack.com/feed/podcast/1084089/private/ce0c8b05-cc74-469e-9828-17ba420f323f.rss), see [01:13:59] Surya Dantuluri (Stealth) - RIP Plugins

I expect RAG to be used everywhere: agents, plugins, and most commonly applications. Orchestrators like LangChain are going to make it simpler, and they are embedded as SDKs into applications.

The current diagram indicates that the application only interfaces with fine-tuning and training data, which is misleading.

There are three types of models that an AI application can use:

  1. Base or Foundation Models (GPT, Llama2)
  2. Fine-tuned models (trained with specific data sets on top of Foundation models)
  3. Custom models built from scratch

The current picture does not show this variety. I suggest expanding the LLM box to cover these three varieties by adding three types of boxes. Since fine-tuning data is mainly the input to fine-tuned models and training data is the input to custom models, my suggestion is to draw those lines to the respective model boxes and remove them from the application.

To account for applications using data stores during pre-processing at inference time, my suggestion is to generalize "Downstream Services" to "Retrieval Services" (or some other meaningful name) and connect them to applications as well as plugins.
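To make the "pre-processing at inference time" point concrete, here is a rough sketch of an application calling such a retrieval service itself, before the LLM call (endpoints, payload shapes, and response keys are hypothetical):

```python
# Hypothetical flow: the application (not a plugin) queries a retrieval
# service during pre-processing, then sends the augmented prompt to the LLM.
# Both endpoints and their JSON shapes are illustrative only.
import requests

RETRIEVAL_SERVICE_URL = "https://retrieval.internal.example/search"  # hypothetical
LLM_SERVICE_URL = "https://llm.internal.example/complete"            # hypothetical

def handle_user_request(question: str) -> str:
    # Pre-processing: the application queries the retrieval service directly.
    docs = requests.post(
        RETRIEVAL_SERVICE_URL,
        json={"query": question, "top_k": 3},
        timeout=10,
    ).json()["documents"]

    # Retrieved content crosses a trust boundary into the prompt, which is
    # where risks like indirect prompt injection and sensitive information
    # disclosure attach to this arrow in the diagram.
    prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}"

    # Generation: call the LLM service with the augmented prompt.
    completion = requests.post(
        LLM_SERVICE_URL,
        json={"prompt": prompt},
        timeout=30,
    ).json()["completion"]
    return completion
```

The same flow applies when an orchestrator SDK does the retrieval inside the application process, which is why the application box, and not just the plugins box, needs a line to the retrieval services.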

Would be happy to jump on a call to discuss this further. @Bobsimonoff @GangGreenTemperTatum

GangGreenTemperTatum commented 5 months ago

@Bobsimonoff @NerdAboutTown you happy taking this one as part of v2?