curiousily / ragbase

Completely local RAG. Chat with your PDF documents (with open LLM) and UI to that uses LangChain, Streamlit, Ollama (Llama 3.1), Qdrant and advanced methods like reranking and semantic chunking.
https://www.mlexpert.io/bootcamp/ragbase-local-rag
MIT License
55 stars 19 forks source link
langchain llama3 llm pdf rag retrieval-augmented-generation streamlit

RagBase - Private Chat with Your Documents

Completely local RAG with chat UI

Demo

Check out the RagBase on Streamlit Cloud. Runs with Groq API.

Installation

Clone the repo:

git clone git@github.com:curiousily/ragbase.git
cd ragbase

Install the dependencies (requires Poetry):

poetry install

Fetch your LLM (gemma2:9b by default):

ollama pull gemma2:9b

Run the Ollama server

ollama serve

Start RagBase:

poetry run streamlit run app.py

Architecture

Ingestor

Extracts text from PDF documents and creates chunks (using semantic and character splitter) that are stored in a vector databse

Retriever

Given a query, searches for similar documents, reranks the result and applies LLM chain filter before returning the response.

QA Chain

Combines the LLM with the retriever to answer a given user question

Tech Stack

Add Groq API Key (Optional)

You can also use the Groq API to replace the local LLM, for that you'll need a .env file with Groq API key:

GROQ_API_KEY=YOUR API KEY