infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Question]: How to improve RAG Accuracy with RAGFlow? #1337

Open BennisonDevadoss opened 3 months ago

BennisonDevadoss commented 3 months ago

Describe your problem

I've been using RAGFlow as my RAG system for the past few months, and I have a couple of questions based on my usage so far.

Question 1: When querying a database that stores document embeddings (e.g., Elasticsearch), retrieving specific information can be challenging if the query terms do not explicitly match the document keywords. For instance, searching a resume for a candidate's name might fail if the resume does not explicitly contain terms like 'candidate' or 'name'. The challenge here is how to extract relevant information from the vector database in such cases.

Example Scenario:

In such scenarios, how can we improve RAGFlow's accuracy?

Question 2: Does RAGFlow store documents in both Elasticsearch and Minio? If so, why is it necessary to store user-uploaded files in both systems?


KevinHuSh commented 3 months ago

A resume is actually a piece of structured data, even though it looks like a bunch of unstructured text. So, try the demo. It applies a resume parser to turn the resume into structured data, which is then retrieved with SQL. The SQL is generated from the user's question by an LLM.
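
To make the idea concrete, here is a rough sketch of that flow, not RAGFlow's actual code. It assumes a resume parser has already produced records with hypothetical fields (`name`, `email`, `years_experience`), uses an in-memory SQLite table in place of the real store, and replaces the LLM call with a hypothetical `question_to_sql` lookup so the example runs on its own.

```python
import sqlite3

# Assumed output of a resume parser: one structured record per resume.
# The field names are hypothetical, chosen only for this illustration.
parsed_resumes = [
    {"name": "Jane Doe", "email": "jane@example.com", "years_experience": 7},
    {"name": "Sam Lee", "email": "sam@example.com", "years_experience": 3},
]


def build_table(rows):
    """Load the parsed records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resume (name TEXT, email TEXT, years_experience INTEGER)"
    )
    conn.executemany(
        "INSERT INTO resume VALUES (:name, :email, :years_experience)", rows
    )
    return conn


def question_to_sql(question):
    """Stand-in for the LLM step that turns a question into SQL.

    A real system would prompt a model with the table schema and the
    question; here the mapping is hard-coded so the sketch is runnable.
    """
    canned = {
        "What is the candidate's name?":
            "SELECT name FROM resume ORDER BY name",
        "Who has more than five years of experience?":
            "SELECT name FROM resume WHERE years_experience > 5 ORDER BY name",
    }
    return canned[question]


def answer(conn, question):
    """Translate the question to SQL, run it, and return the first column."""
    sql = question_to_sql(question)
    return [row[0] for row in conn.execute(sql)]


conn = build_table(parsed_resumes)
names = answer(conn, "What is the candidate's name?")
print(names)
```

The point is that the name is found through the `name` column, so it does not matter whether the resume text ever contains the words "candidate" or "name".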