RAG0003: Embed text chunks and populate vector database (2)

Description

The text chunks will be vectorized with the "Alibaba-NLP/gte-large-en-v1.5" embedding model and inserted into the ChromaDB vector database. Each embedding will be indexed and stored together with the metadata of its source chunk.

Expected Output

A vector database containing all embedded text chunks together with their metadata.

Implementation Plan

Implementation tasks

[x] set up standalone ChromaDB instance
[x] integrate with LlamaIndex
[x] embed text chunks
[x] populate and index vectors into DB
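
The tasks above could be sketched roughly as follows. This is a minimal illustration, not the code in this repository: it calls `sentence-transformers` and the `chromadb` client directly and skips the LlamaIndex layer. The `Chunk` fields, the ID scheme, the collection name `text_chunks`, and the host and port are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk record; the field names are assumptions."""
    text: str
    source: str      # e.g. the document the chunk was cut from
    position: int    # index of the chunk within that document
    extra: dict = field(default_factory=dict)  # any further metadata


def build_records(chunks):
    """Turn chunks into the parallel lists that collection.add() expects."""
    ids = [f"{c.source}-{c.position}" for c in chunks]
    documents = [c.text for c in chunks]
    metadatas = [
        {"source": c.source, "position": c.position, **c.extra} for c in chunks
    ]
    return ids, documents, metadatas


def embed_and_store(chunks, host="localhost", port=8000,
                    collection_name="text_chunks"):
    """Embed the chunks and write them to a standalone ChromaDB server."""
    # Heavy dependencies are imported here so that build_records()
    # stays usable without them.
    import chromadb
    from sentence_transformers import SentenceTransformer

    # gte-large-en-v1.5 ships custom model code, hence trust_remote_code=True.
    model = SentenceTransformer(
        "Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True
    )

    ids, documents, metadatas = build_records(chunks)
    embeddings = model.encode(documents, normalize_embeddings=True).tolist()

    # Assumes a ChromaDB server is already running at host:port.
    client = chromadb.HttpClient(host=host, port=port)
    collection = client.get_or_create_collection(name=collection_name)
    collection.add(
        ids=ids,
        embeddings=embeddings,
        documents=documents,
        metadatas=metadatas,
    )
    return collection
```

Calling `embed_and_store([Chunk("some text", "doc_a", 0)])` would then add one vector with its metadata to the collection. Deriving IDs from source and position keeps them stable across runs, which is one possible way to avoid inserting the same chunk twice under different IDs.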

OpenPecha / rag_prep_tool

RAG0003: Embed text chunks and populate vector database (2) #4
