danny-avila / rag_api

ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector
https://librechat.ai/

# ID-based RAG FastAPI

Overview

This project integrates Langchain with FastAPI in an asynchronous, scalable manner, providing a framework for document indexing and retrieval using PostgreSQL/pgvector.

Files are organized into embeddings by file_id. The primary use case is for integration with LibreChat, but this simple API can be used for any ID-based use case.

The main reason to use the ID approach is to work with embeddings at the file level. This allows targeted queries when combined with file metadata stored in a database, as LibreChat does.
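
To illustrate the file-level idea, here is a minimal sketch that restricts a similarity search to the chunks of a single file. It is not this project's implementation: the in-memory `ROWS` list, the `query_by_file_id` helper, and the two-dimensional embeddings are hypothetical stand-ins for a real vector store.

```python
import math

# Hypothetical in-memory stand-in for a vector table: each row is one chunk
# of a file, tagged with the file_id it came from.
ROWS = [
    {"file_id": "file-a", "text": "pgvector setup", "embedding": [1.0, 0.0]},
    {"file_id": "file-a", "text": "index tuning", "embedding": [0.8, 0.6]},
    {"file_id": "file-b", "text": "unrelated notes", "embedding": [1.0, 0.0]},
]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def query_by_file_id(rows, file_id, query_embedding, k=4):
    """Return the k chunks of one file that are most similar to the query."""
    # Restrict the candidate set to a single file before ranking.
    candidates = [row for row in rows if row["file_id"] == file_id]
    ranked = sorted(
        candidates,
        key=lambda row: cosine_similarity(row["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:k]


results = query_by_file_id(ROWS, "file-a", [1.0, 0.0], k=1)
print([row["text"] for row in results])  # prints ['pgvector setup']
```

The chunk from `file-b` has the same embedding as the top hit but is never considered, which is the point of filtering by ID before ranking.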

The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.

Features

Setup

Getting Started

Environment Variables

The following environment variables are required to run the application:

Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.

Use Atlas MongoDB as Vector Database

Instead of the default pgvector, you can use Atlas MongoDB as the vector database. To do so, set the following environment variables:

VECTOR_DB_TYPE=atlas-mongo
ATLAS_MONGO_DB_URI=<mongodb+srv://...>
MONGO_VECTOR_COLLECTION=<collection name>
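
As a rough sketch of how these variables might be checked at startup, the hypothetical helper below reads them from an environment mapping and fails early if the Atlas settings are incomplete. The function name `load_vector_db_settings`, the fallback label `"pgvector"`, and the example URI and collection name are assumptions for illustration, not taken from the project.

```python
import os


def load_vector_db_settings(env=os.environ):
    """Read the vector-store settings named above from an environment mapping."""
    # Assumption: "pgvector" is used here only as a label for the default store.
    db_type = env.get("VECTOR_DB_TYPE", "pgvector")
    settings = {"type": db_type}
    if db_type == "atlas-mongo":
        required = ("ATLAS_MONGO_DB_URI", "MONGO_VECTOR_COLLECTION")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise ValueError("Missing environment variables: " + ", ".join(missing))
        settings["uri"] = env["ATLAS_MONGO_DB_URI"]
        settings["collection"] = env["MONGO_VECTOR_COLLECTION"]
    return settings


# Hypothetical values, passed as a plain dict instead of the real environment.
example_env = {
    "VECTOR_DB_TYPE": "atlas-mongo",
    "ATLAS_MONGO_DB_URI": "mongodb+srv://user:pass@cluster.example.net",
    "MONGO_VECTOR_COLLECTION": "rag_vectors",
}
print(load_vector_db_settings(example_env))
```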

The ATLAS_MONGO_DB_URI can be the same as or different from the one used by LibreChat. Even if it is the same, the $MONGO_VECTOR_COLLECTION collection must be a completely new one, separate from all collections used by LibreChat. In addition, create a vector search index for $MONGO_VECTOR_COLLECTION with the following JSON:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    },
    {
      "path": "file_id",
      "type": "filter"
    }
  ]
}

Follow one of the four documented methods to create the vector index.
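
The `numDimensions` value has to match the length of the vectors produced by the embedding model in use, so it can help to generate the definition rather than hard-code it. The sketch below builds the same JSON shown above from a hypothetical helper, `build_vector_index_definition`; the helper name and its parameters are illustrative assumptions.

```python
import json


def build_vector_index_definition(num_dimensions=1536, similarity="cosine"):
    """Build the vector search index definition shown above as a dict."""
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be positive")
    return {
        "fields": [
            {
                "numDimensions": num_dimensions,
                "path": "embedding",
                "similarity": similarity,
                "type": "vector",
            },
            # Filter field so searches can be restricted to given file IDs.
            {"path": "file_id", "type": "filter"},
        ]
    }


# Print the definition as JSON, ready to paste into whichever
# index-creation method is being followed.
print(json.dumps(build_vector_index_definition(), indent=2))
```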

Cloud Installation Settings:

AWS:

Make sure your RDS Postgres instance adheres to this requirement:

The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.
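
The version thresholds above can be checked mechanically. The following sketch compares an engine version string against the minimums quoted in the requirement. The helper name `supports_pgvector_050` is hypothetical, and treating a missing `-R` suffix as R1 is an assumption of this sketch. Major versions not named in the quoted text return `None` rather than a guess.

```python
import re

# Minimum (minor, release) per PostgreSQL major version,
# taken from the requirement quoted above.
MIN_RDS_VERSIONS = {15: (4, 2), 14: (9, 2), 13: (12, 2), 12: (16, 2)}


def supports_pgvector_050(version):
    """Return True/False for listed majors, None if the major is not listed.

    `version` is a string such as "15.4-R2" or "14.10". A missing release
    suffix is treated as R1 (an assumption of this sketch).
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:-R(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    release = int(match.group(3)) if match.group(3) else 1
    if major not in MIN_RDS_VERSIONS:
        return None  # not covered by the quoted requirement
    # Tuple comparison: a higher minor wins, then the release number.
    return (minor, release) >= MIN_RDS_VERSIONS[major]


for candidate in ("15.4-R2", "15.4-R1", "14.10", "13.11-R3"):
    print(candidate, supports_pgvector_050(candidate))
```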

To set up RDS Postgres with RAG API, you can follow these steps:

Notes:

Dev notes:

Installing pre-commit formatter

Run the following commands to install the pre-commit formatter, which uses the black code formatter:

pip install pre-commit
pre-commit install