Deploying RAG Vector Search in Offline/Private Environments: A Solution Inquiry

kingke0620 commented 5 months ago

Self Checks

[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] Please do not modify this template :) and fill in all the required fields.

Provide a description of the help you need

I want to deploy in a private domain environment, but it seems that the vector search of RAG requires an internet connection. How can I address this issue?

dosubot[bot] commented 5 months ago

Hey there, @kingke0620! Great to see you back. Hope you're doing well.

To deploy the vector search of RAG in a private domain environment without an internet connection, you would need to follow these steps:

1. Prepare Your Knowledge Base: Compile and organize the knowledge base that the RAG system will query. It should cover the domain and the questions you expect the system to handle. Because you are operating in a private domain without internet access, it must be stored locally or within your private network.
2. Vectorize the Knowledge Base: Convert each piece of knowledge (e.g., articles, documents) into a vector representation that captures its semantic meaning. This is typically done with BERT or another transformer-based model. Vectorization can be performed offline and the resulting vectors stored. You can refer to the JinaTextEmbeddingModel in the codebase for more details on how texts are vectorized with such models.
3. Set Up a Vector Database: Store the vectors in a system that supports vector search. Examples include Milvus, Elasticsearch with vector search support, or the FAISS library for an in-process index. This store performs the semantic similarity matching between the query vector and the knowledge base vectors.
4. Integrate RAG with the Vector Database: Configure the RAG system to query your local vector database instead of an external source. This means pointing the retrieval component at your private vector database so that it runs vector searches there and fetches the knowledge entries most relevant to the input query.
5. Deploy the RAG System: With the knowledge base vectorized and stored in a local vector database, and the RAG system configured to use this local setup, you can deploy within your private domain environment. Make sure the system has enough computational resources for the vector search and generation tasks.
6. Testing and Optimization: After deployment, test the system with a variety of queries to confirm that it retrieves relevant information and generates accurate responses. Based on the results, you may need to tune the vectorization process, the vector search parameters, or the RAG model itself.
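
The steps above can be sketched end to end in a few lines. This is only a minimal illustration, not how Dify implements retrieval: `embed` is a toy hashed bag-of-words function standing in for a locally deployed embedding model, `LocalVectorIndex` is a hypothetical in-memory stand-in for a vector database inside the private network, and the sample documents are made up. Nothing in it touches the network.

```python
import hashlib
import math
import re
from typing import List, Tuple

DIM = 4096  # toy dimensionality; a real embedding model defines its own


def embed(text: str) -> List[float]:
    """Toy stand-in for a local embedding model: hashed bag of words, unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class LocalVectorIndex:
    """Hypothetical in-memory stand-in for a privately hosted vector database."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, List[float]]] = []

    def add(self, document: str) -> None:
        # Step 2 and 3: vectorize the document and store the vector locally.
        self._rows.append((document, embed(document)))

    def search(self, query: str, top_k: int = 1) -> List[Tuple[float, str]]:
        # Step 4: embed the query and rank documents by cosine similarity.
        # Vectors are unit length, so the dot product is the cosine.
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc) for doc, vec in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


# Step 1: a made-up knowledge base kept entirely on the local machine.
index = LocalVectorIndex()
for doc in [
    "Reset your password from the account settings page.",
    "Invoices are emailed on the first day of each month.",
    "The VPN client must be installed before remote login.",
]:
    index.add(doc)

results = index.search("how do I reset my password", top_k=1)
print(results[0][1])
```

In a real deployment the two stand-ins would be replaced by a locally served embedding model and one of the supported vector stores; the control flow stays the same.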

For more details on the RAG system and its features, you can refer to the documentation.

Regarding the vector store used in the RAG system, it is determined by the vector_type variable, which is set from the configuration or from the dataset's index_struct_dict. The system supports three types of vector stores: Weaviate, Qdrant, and Milvus. The store used at runtime depends on the VECTOR_STORE setting or the dataset's index_struct_dict['type']. You can refer to vector_factory.py in the codebase for more details.
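
As a rough sketch of that selection logic (not the actual contents of vector_factory.py): the function name `resolve_vector_type`, the `settings` dict, and the choice to let the dataset-level type take precedence over the global setting are all assumptions made for illustration.

```python
from typing import Optional

# The three stores named above; the tuple itself is illustrative.
SUPPORTED_VECTOR_STORES = ("weaviate", "qdrant", "milvus")


def resolve_vector_type(settings: dict, index_struct_dict: Optional[dict] = None) -> str:
    """Pick the vector store type for a dataset (hypothetical helper).

    Assumption: a type recorded on the dataset wins over the global
    VECTOR_STORE setting; verify this against vector_factory.py.
    """
    if index_struct_dict and index_struct_dict.get("type"):
        vector_type = index_struct_dict["type"]
    else:
        vector_type = settings.get("VECTOR_STORE")

    if vector_type not in SUPPORTED_VECTOR_STORES:
        raise ValueError(f"Unsupported vector store: {vector_type!r}")
    return vector_type


# Global setting only.
print(resolve_vector_type({"VECTOR_STORE": "qdrant"}))
# Dataset-level type present.
print(resolve_vector_type({"VECTOR_STORE": "qdrant"}, {"type": "milvus"}))
```

For an offline deployment, whichever store is selected needs to be reachable inside the private network.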

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

takatost commented 5 months ago

This requires your embedding model to be deployed locally.
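
One way to sanity-check this before going offline is to confirm that the configured embedding endpoint points at a loopback or private address. The sketch below is an illustration only: `is_private_endpoint` is a hypothetical helper, the URLs are made up, and hostnames other than `localhost` are reported as not private because the check does not resolve DNS.

```python
import ipaddress
from urllib.parse import urlparse


def is_private_endpoint(url: str) -> bool:
    """Return True if the URL's host is localhost or a private/loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: it cannot be judged without resolving it, so be conservative.
        return False
    return addr.is_private or addr.is_loopback


# Made-up endpoints for illustration.
for endpoint in [
    "http://127.0.0.1:9997/v1/embeddings",
    "http://10.0.0.5:8080/embeddings",
    "https://api.example.com/v1/embeddings",
]:
    print(endpoint, is_private_endpoint(endpoint))
```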

langgenius / dify