Create Threat Model and Discuss RAG with its security risks for LLM

Retrieval augmented generation (RAG) is a technique to enrich LLMs with the apps/org own data. It has become very popular as it lowers the complexity entry to enriching input in LLM apps, allows for better access controls as opposed to fine-tuning, and is known to reduce hallucination (see https://www.securityweek.com/vector-embeddings-antidote-to-psychotic-llms-and-a-cure-for-alert-fatigue/) see also the excellent Samsung paper on enterprise use of GenAI and the role of RAG.

Currently, we have occasional references to RAG but we should create a threat model and discuss this on its own and the impact on the LLM top 10. As RAG becomes a distinct enterprise PATTERN it also creates its own security risks and expands the attack surface with: This includes

a second process of generating embeddings that can vary in complexity and can be using the main or secondary LLM
vector database
the use of embeddings in prompts and responses
use of other services (eg Azure Cognitive Services) or specialised plugins, for instance, the OpenAI upsert retrieval plugin for semantic search plugin(https://github.com/openai/chatgpt-retrieval-plugin)

*Some useful links***

architectural approaches Azure: https://github.com/Azure/GPT-RAG AWS SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html AWS Bedrock RAG workshop: https://github.com/aws-samples/amazon-bedrock-rag-workshop

security concerns Security of AI Embeddings explained Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models Embedding Layer: AI (Brace For These Hidden GPT Dangers)

OWASP / www-project-top-10-for-large-language-model-applications

Create Threat Model and Discuss RAG with its security risks for LLM #241

*Some useful links***