Open Agent-E11 opened 2 months ago
I just found this randomly:
For developers, the process of adding RAG features to your app is basically:
EMBEDDINGS
- choose your vector database: pgvector, chromadb, or one of several others
- chunk your document into paragraphs or sentences
- send each chunk through an embedding model locally or use a service like OpenAI
- store the embeddings and chunks in your db
CHAT
- a user creates a prompt
- you generate an embedding for this prompt
- you do a cosine similarity search to find the stored chunks most relevant to your prompt embedding
- you receive the top results in a response and send the original prompt + the relevant chunks to the model
- the model returns a response informed by the additional context you provided
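
To make the two phases concrete, here is a rough sketch in plain Python. Everything in it is a stand-in I made up for illustration: `embed` is a toy word-count vector rather than a real embedding model, the "vector database" is just an in-memory list, and the sample text and function names are hypothetical. It stops at building the augmented prompt, since the actual model call depends on whichever LLM we end up using.

```python
import math
import re
from collections import Counter

# Hypothetical sample document; a real one would come from the resources repo.
DOC = """The club meets every Tuesday at 6 pm in room 204.

Dues are ten dollars per semester and fund pizza nights.

The public resources repo holds slides and workshop notes."""


def chunk(document: str) -> list[str]:
    """Split a document into paragraph chunks on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(document: str) -> list[tuple[str, Counter]]:
    """EMBEDDINGS phase: store each chunk next to its embedding."""
    return [(c, embed(c)) for c in chunk(document)]


def retrieve(index: list[tuple[str, Counter]], prompt: str, k: int = 2) -> list[str]:
    """CHAT phase: embed the prompt and return the k most similar chunks."""
    query = embed(prompt)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def augment(prompt: str, context: list[str]) -> str:
    """Combine the retrieved chunks with the original prompt for the model."""
    joined = "\n\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {prompt}"


index = build_index(DOC)
question = "When does the club meet?"
top = retrieve(index, question, k=1)
final_prompt = augment(question, top)
print(final_prompt)
```

In a real setup `embed` would call an embedding model (local or hosted) and `build_index` / `retrieve` would be inserts and similarity queries against the vector database, but the overall shape should stay about the same.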
RAG stands for Retrieval-Augmented Generation
There are ways to train an LLM using documents. We might be able to do this with documents in the club's public resources repo