janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
https://jan.ai/
GNU Affero General Public License v3.0

idea: Allow RAG with .txt files #3779

Open 4722794 opened 1 month ago

4722794 commented 1 month ago

Problem Statement

Hi, am I missing something?

The retrieval tool currently supports only PDFs. Why not .txt files?

Feature Idea

Support .txt files, in addition to PDFs, for RAG in tools.

4722794 commented 1 month ago

Also, I did go through the documentation here: https://jan.ai/docs/tools/retrieval#enable-the-knowledge-retrieval

But I just don't get it. On my system, there is no provision to select a model. I was hoping to select some embedding model like text-embedding-3.

[Screenshot: 2024-10-10, 8:20:34 PM]

And where is it storing the chunks? Is it creating a vector database somewhere?

0xSage commented 1 month ago

Apologies, RAG with .txt files is not supported yet 🙏

4722794 commented 1 month ago

@0xSage no problem; Jan is terrific, thanks for putting it together.

But I've found the RAG implementation in Jan a little difficult to understand.

I've gone through the docs and it's still not clear.

When I upload a file, are you uploading the file to my OpenAI account and using a vector database from there?

Or is it created locally using chromadb?
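
For context on the question above: a locally built retrieval index over a .txt file generally amounts to splitting the text into chunks, turning each chunk into a vector, and ranking chunks by similarity to the query. The sketch below only illustrates that general idea and is not Jan's implementation. All function names are hypothetical, the in-memory list stands in for a vector database, and the bag-of-words `embed` is a toy substitute for a real embedding model.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of up to chunk_size words."""
    words = text.split()
    step = chunk_size - overlap  # assumes overlap < chunk_size
    chunks = []
    i = 0
    while i < len(words):
        chunks.append(" ".join(words[i:i + chunk_size]))
        if i + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
        i += step
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text, chunk_size=40, overlap=10):
    """Hypothetical in-memory 'vector store': a list of (chunk, vector) pairs."""
    return [(c, embed(c)) for c in chunk_text(text, chunk_size, overlap)]


def retrieve(index, query, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

A real setup would swap `embed` for an embedding model and persist the vectors on disk or in a vector database. Whether Jan does this locally or through a remote service is exactly what this comment is asking.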

imtuyethan commented 3 weeks ago

We should close this ticket & make it a part of https://github.com/janhq/cortex.cpp/issues/1595