More control over actions (new methods: index, is_indexed).
Uses the qdrant_client vector database instead of chromadb (because it helps with metadata filtering).
More control over search functionality.
Includes metadata related to context in search results.
Handles file deletions.
A single file can be indexed.
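As a rough illustration of the behaviour listed above (not the code in this PR), the sketch below uses a plain in-memory list in place of a qdrant_client collection. The class name `TinyIndex`, the `remove` method, the `path` metadata key, and the toy `embed` function are all assumptions made for the example; only the names `index`, `is_indexed`, and search come from the description.

```python
import hashlib
import math


def embed(text, dim=8):
    """Toy stand-in for a real embedder: hash each word into a fixed-size vector."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class TinyIndex:
    """Hypothetical in-memory index; a real one would store points in a vector DB."""

    def __init__(self):
        self.points = []  # each point: {"vector": [...], "text": str, "path": str}

    def is_indexed(self, path):
        return any(p["path"] == path for p in self.points)

    def remove(self, path):
        # Called when a file is deleted: drop every chunk that came from it.
        self.points = [p for p in self.points if p["path"] != path]

    def index(self, path, chunks):
        # Re-indexing a single file replaces its old chunks.
        self.remove(path)
        for chunk in chunks:
            self.points.append({"vector": embed(chunk), "text": chunk, "path": path})

    def search(self, query, path=None, limit=3):
        # Optional metadata filter, then rank by cosine similarity
        # (vectors are unit length, so the dot product is the cosine).
        q = embed(query)
        candidates = [p for p in self.points if path is None or p["path"] == path]
        scored = [(sum(a * b for a, b in zip(q, p["vector"])), p) for p in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{"score": s, "text": p["text"], "path": p["path"]} for s, p in scored[:limit]]


idx = TinyIndex()
idx.index("notes/a.md", ["vector databases store embeddings"])
idx.index("notes/b.md", ["metadata filtering narrows a search"])
hits = idx.search("embeddings", path="notes/a.md")  # results carry their source path
idx.remove("notes/b.md")                            # simulate a deleted file
```

Each search result carries the `path` it came from, which is the kind of context metadata the list above refers to; the filter is applied before ranking, so results from other files never appear.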
Note: I have not included Python indexing functionality yet, as I want to go through this commit first.
Regarding this goal:
We should always have SOTA embedding. If a better local embedding model is found, we should automatically download and use it.
I think Chroma will always do this (is this true?) so we depend on Chroma.
I personally think it should be up to the user which embedder they want to use. Using a SOTA embedding by default means its size will be large and RAM usage will be higher (just a guess, correct me if I'm wrong). The user can decide whether they want their embeddings to be English or multilingual, of small or large dimensions, and of course how large a model to use.
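One way to leave this choice to the user is to have the indexer accept any object that exposes an embedding method, with a small default. The sketch below is only an illustration of that idea: the `Embedder` protocol, `HashEmbedder`, and `Indexer` are hypothetical names, and the hashing embedder is a toy, not a real model.

```python
import hashlib
from typing import Optional, Protocol, Sequence


class Embedder(Protocol):
    """Hypothetical interface: anything with a dimension and an embed() method."""

    dimension: int

    def embed(self, texts: Sequence[str]) -> list[list[float]]: ...


class HashEmbedder:
    """Toy embedder used as a small default; a user could pass a real model instead."""

    def __init__(self, dimension: int = 16):
        self.dimension = dimension

    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.dimension
            for word in text.lower().split():
                digest = hashlib.sha256(word.encode()).digest()
                vec[digest[0] % self.dimension] += 1.0
            vectors.append(vec)
        return vectors


class Indexer:
    """Hypothetical indexer that takes the embedder as a parameter."""

    def __init__(self, embedder: Optional[Embedder] = None):
        self.embedder = embedder if embedder is not None else HashEmbedder()
        self.vectors: list[list[float]] = []

    def add(self, texts: Sequence[str]) -> None:
        self.vectors.extend(self.embedder.embed(texts))


small = Indexer()                             # default: 16 dimensions
large = Indexer(HashEmbedder(dimension=256))  # user opts into a larger embedding
small.add(["hello world"])
large.add(["hello world"])
```

With this shape, the default stays lightweight, and a user who wants a multilingual or higher-dimensional model only has to pass a different embedder object.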