Repository for training and deploying Generative AI models, including text-text, text-to-image generation and prompt engineering playground using SageMaker Studio.
MIT No Attribution
130
stars
88
forks
source link
Code for Retrieval Augmented Generation (RAG) question answering with Llama 2, LangChain and Pinecone using SageMaker Studio Notebooks #31
Description of changes:
This notebook shows users how to use SageMaker Studio to implement RAG for fast experimentation and later deploy their models to SageMaker endpoints
Implement RAG in the notebook
Load Llama-2 7B chat model from Hugging FAce and test question answering with LangChain
Confirm that adding context leads to performance improvements
Ingest external pdf files to Pinecone after converting them to embeddings with the bge-small model from Hugging Face
Ask a question and augment the prompt by adding the most similar document extracts from Pinecone as context
From experimentation to large scale deployment Deploy your models to SageMaker endpoints
Deploy llama-2 7b chat to a SageMaker Real-time endpoint
Deploy the embeddings model to a SageMaker real time endpoint
Ask the question and augment using LangChain again and augment the prompt. This time the request is to the SageMaker real time endpoint
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Description of changes: This notebook shows users how to use SageMaker Studio to implement RAG for fast experimentation and later deploy their models to SageMaker endpoints
Implement RAG in the notebook
From experimentation to large scale deployment Deploy your models to SageMaker endpoints
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.