infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
23.51k stars 2.3k forks

[Feature Request]: File Management and Knowledge Base #2836

Open train147369 opened 1 month ago

train147369 commented 1 month ago

Is there an existing issue for the same feature request?

Is your feature request related to a problem?

Yes, it concerns the Knowledge Base functionality: the same document from File Management is parsed again in every knowledge base that uses it.

Describe the feature you'd like

The way I see it, File Management should upload and parse the files, and each Knowledge Base should link to the files inside File Management via tags. This would avoid parsing the same file more than once, and the already parsed content would be used whenever the Knowledge Base is called.

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

KevinHuSh commented 1 month ago

I don't get it. This feature has already been implemented.

train147369 commented 1 month ago

> I don't get it. This feature has already been implemented.

When files are uploaded through File Management and then linked to several knowledge bases, each knowledge base has to parse them separately. If the files were parsed once in File Management and then linked, the number of times each file is parsed would go down.
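To make the request concrete, here is a minimal sketch of the "parse once, link many" idea. It is not RAGFlow's code or API: `FileStore`, `KnowledgeBase`, and `parse_file` are hypothetical names invented for illustration, the parser is a stand-in that only splits text into lines, and the sketch assumes every knowledge base wants the same chunking of a file.

```python
import hashlib
from dataclasses import dataclass, field


def parse_file(content: bytes) -> list[str]:
    """Stand-in for an expensive parser (hypothetical): one chunk per non-empty line."""
    return [line for line in content.decode("utf-8").splitlines() if line.strip()]


@dataclass
class FileStore:
    """Hypothetical File Management layer that parses each distinct file once."""
    chunks_by_id: dict[str, list[str]] = field(default_factory=dict)
    parse_count: int = 0

    def upload(self, content: bytes) -> str:
        # The content hash serves as the file ID, so identical uploads share one entry.
        file_id = hashlib.sha256(content).hexdigest()
        if file_id not in self.chunks_by_id:
            self.chunks_by_id[file_id] = parse_file(content)
            self.parse_count += 1
        return file_id


@dataclass
class KnowledgeBase:
    """Hypothetical knowledge base that stores links to files, not copies of chunks."""
    name: str
    store: FileStore
    file_ids: set[str] = field(default_factory=set)

    def link(self, file_id: str) -> None:
        if file_id not in self.store.chunks_by_id:
            raise KeyError(f"unknown file id: {file_id}")
        self.file_ids.add(file_id)

    def chunks(self) -> list[str]:
        # Chunks are read through the shared store at query time.
        return [c for fid in sorted(self.file_ids) for c in self.store.chunks_by_id[fid]]
```

Under these assumptions, uploading a file once and calling `link` from two knowledge bases leaves `parse_count` at 1, and both knowledge bases read the same chunk list.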

netandreus commented 3 weeks ago

@KevinHuSh how can we add a single document to multiple knowledge bases while avoiding duplication of the file in storage and of the chunks in Elasticsearch? That is the question.

KevinHuSh commented 3 weeks ago

Clear. Duplication in storage will not happen. But duplicated chunks in ES (Elasticsearch), I'm afraid, do exist for now.