LucknowAI / Rag-LLM

Retrieval-Augmented Generation (RAG) using Large Language Models (LLMs)

Load External Data to Vector Database #2

Open ASahu16 opened 7 months ago

ASahu16 commented 7 months ago

Description: Implement functionality to load external data into the vector database. This involves developing scripts or tools to import data from various sources such as DOCX or PDF files and store them in the vector database.

Tasks:

aarushiksk commented 7 months ago

This can be solved in the following steps:

Step 1) Parse the PDF/DOCX files using PyMuPDF (for text), OCR (for images), or a similar Python library.
Step 2) Choose an embedding model to convert the extracted text into embeddings.
Step 3) Connect to ChromaDB or FAISS through their APIs, following their documentation.
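A minimal sketch of how these steps might fit together for PDFs, assuming PyMuPDF and ChromaDB are installed. The function names, the chunk size and overlap, the collection name, and the `./chroma_db` path are all hypothetical choices for illustration, not something decided in this issue. It also assumes ChromaDB's default embedding function is acceptable for Step 2; a different embedding model could be swapped in. DOCX parsing and OCR are not covered.

```python
from typing import List


def extract_pdf_text(path: str) -> str:
    """Step 1: return the text of every page in a PDF, joined by newlines."""
    import fitz  # PyMuPDF; imported lazily so chunk_text works without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows.

    The size/overlap defaults are illustrative assumptions, not tuned values.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    text = " ".join(text.split())  # collapse runs of whitespace
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def store_chunks(chunks: List[str], source: str,
                 collection_name: str = "documents",
                 db_path: str = "./chroma_db"):
    """Steps 2 and 3: embed the chunks and store them in ChromaDB.

    Assumption: no embedding function is passed, so ChromaDB applies its
    default one when documents are added.
    """
    import chromadb  # imported lazily, same reason as above

    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_or_create_collection(name=collection_name)
    if chunks:  # adding an empty batch is an error, so skip it
        collection.add(
            ids=[f"{source}-{i}" for i in range(len(chunks))],
            documents=chunks,
            metadatas=[{"source": source, "chunk": i}
                       for i in range(len(chunks))],
        )
    return collection
```

Usage would look roughly like `store_chunks(chunk_text(extract_pdf_text("report.pdf")), source="report.pdf")`, where `report.pdf` is a placeholder file name. Chunking before embedding keeps each stored document small enough for retrieval to return focused passages rather than whole files.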

Please assign this to me.