ivanpetrushev / easy-rag

Playground for LLMs, RAG, agents and tool use

RAG demo

Introduction

The goal of this demo app is to showcase the quickest and easiest way to build a usable knowledge base using Retrieval-Augmented Generation (RAG). RAG improves question answering by combining document retrieval with text generation: relevant passages are fetched first, then passed to the model as context.

Use case: seed the knowledge base with maintenance manuals, then ask questions to troubleshoot issues. An example PDF is provided (t60.pdf, the maintenance manual for IBM T60 laptops).
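At query time, RAG retrieves the stored chunks most similar to the question and hands them to the model as context. A minimal sketch of that retrieval step, using cosine similarity over toy vectors (the real app presumably uses Bedrock embeddings instead):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings" for illustration only.
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.0, 0.1]
print(top_k(query, chunks))  # chunks 0 and 2 point the same way as the query
```

The retrieved chunks are then pasted into the prompt sent to the generation model.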

Features

Prerequisites

Installation

  1. Clone the repository
  2. Copy the .env.example file to .env and fill in the required values
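The authoritative list of variables is in .env.example. Since the app calls AWS Bedrock, the file most likely holds AWS credentials and a region, along these lines (variable names below are the standard AWS SDK ones, shown for illustration; check .env.example for the repository's actual keys):

```shell
# Illustrative sketch only - copy .env.example for the real variable names.
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_DEFAULT_REGION=us-east-1
```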

Usage

Build the Docker image:

docker build -t local-rag:latest image/

Run the Docker container:

docker run -it --env-file .env local-rag:latest python3 pdfloader.py
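For orientation, the load-chunk-embed step can be sketched roughly as below. This is a guess at the pipeline shape, not the repository's actual pdfloader.py; the request body follows the documented amazon.titan-embed-text-v2:0 format:

```python
import json

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping character chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

def embed_chunk(chunk: str) -> list[float]:
    """Embed one chunk with Bedrock's Titan text embedding model."""
    import boto3  # imported lazily so the chunker runs without AWS deps
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": chunk}),
    )
    return json.loads(response["body"].read())["embedding"]
```

The resulting vectors would then be stored alongside their chunks so questions can be matched against them later.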

This will load the example PDF, chunk and embed its text, and generate answers to the example questions.

Results

Some example questions and generated answers can be found in the results/ directory.

Improvement points

Tons of them, but here are a few:

Costs

Using AWS Bedrock will incur some costs.

With the current setup (the example t60.pdf of ~200 pages of mixed text and images, and the current text-chunking settings), the whole corpus is about 200,000 tokens. Embedding it with the amazon.titan-embed-text-v2:0 model costs about $0.004.
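The embedding bill is simple per-token arithmetic. Using only the figures above (200,000 tokens producing a $0.004 total, which implies a rate of $0.00002 per 1,000 tokens):

```python
def embedding_cost(total_tokens: int, price_per_1k_tokens: float) -> float:
    """Return the embedding cost in USD for a corpus of total_tokens."""
    return (total_tokens / 1000) * price_per_1k_tokens

# Rate implied by the numbers in this README ($0.004 for 200k tokens).
PRICE_PER_1K = 0.00002

cost = embedding_cost(200_000, PRICE_PER_1K)
print(f"${cost:.3f}")  # $0.004
```

Plug in your own corpus size and the current Bedrock price list to estimate costs before embedding a larger document set.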

Asking questions and sending prompts to anthropic.claude-3-haiku-20240307-v1:0 incurs minimal input/output token costs. As a rough estimate, everything in the results/ directory was generated for about $0.005.

Keep in mind these are among the cheapest models available, so switching to a different model will likely increase costs.