Open jsotiro opened 1 year ago
heya @jsotiro, doing some housekeeping on the repo. Did you get a chance to bring this up, and is anything outstanding still needed? I also have this issue, which I believe basically supersedes this one, as I also mentioned different API architectures and mediums.
Retrieval augmented generation (RAG) is a technique for enriching LLMs with an app's or organisation's own data. It has become very popular because it lowers the barrier to entry for enriching input in LLM apps, allows for better access controls than fine-tuning, and is known to reduce hallucination (see https://www.securityweek.com/vector-embeddings-antidote-to-psychotic-llms-and-a-cure-for-alert-fatigue/). See also the excellent Samsung paper on enterprise use of GenAI and the role of RAG.
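To make the access-control point concrete, here is a minimal sketch of a RAG retrieval step that filters documents by the caller's groups before ranking them. Everything in it is an assumption for illustration: the `Doc` type, the `allowed_groups` field, the `retrieve` and `build_prompt` helpers, and the toy bag-of-words `embed` function, which stands in for a real embedding model and vector store.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str
    allowed_groups: frozenset  # hypothetical per-document ACL


def embed(text):
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, user_groups, k=2):
    # Apply the access check BEFORE similarity ranking, so documents the
    # caller may not see never reach the prompt.
    visible = [d for d in docs if d.allowed_groups & user_groups]
    q = embed(query)
    ranked = sorted(visible, key=lambda d: cosine(q, embed(d.text)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    # Assemble the enriched prompt that would be sent to the LLM.
    context = "\n".join(f"- {d.text}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


DOCS = [
    Doc("Q3 salary bands for engineering staff", frozenset({"hr"})),
    Doc("Engineering onboarding guide for new staff", frozenset({"hr", "engineering"})),
    Doc("Office opening hours and holiday schedule", frozenset({"hr", "engineering"})),
]

if __name__ == "__main__":
    query = "salary bands for engineering staff"
    hits = retrieve(query, DOCS, frozenset({"engineering"}))
    print(build_prompt(query, hits))
```

In this sketch the filter runs at retrieval time, which is the property that fine-tuning cannot offer: once data is baked into model weights there is no per-user check left to apply. A threat model would still need to cover what happens when the filter is missing, misconfigured, or bypassed.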
Currently, we have occasional references to RAG, but we should create a threat model and discuss RAG on its own, along with its impact on the LLM Top 10. As RAG becomes a distinct enterprise pattern, it also creates its own security risks and expands the attack surface. This includes
**Some useful links**
Architectural approaches:
- Azure: https://github.com/Azure/GPT-RAG
- AWS SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html
- AWS Bedrock RAG workshop: https://github.com/aws-samples/amazon-bedrock-rag-workshop
Security concerns:
- Security of AI Embeddings explained
- Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
- Embedding Layer: AI (Brace For These Hidden GPT Dangers)