infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
19.88k stars 1.98k forks

[Question]: Define meta data for each chunks #1624

Open erikguo opened 3 months ago

erikguo commented 3 months ago

Describe your problem

Thank you for your excellent work!

We want to use RAGFlow as our main knowledge base. Our documents follow several structural patterns, and each part of a structure has coherent content, so we decided to split documents into chunks according to that structure. How can we attach metadata to each chunk? We only found the keywords attribute on chunks, which isn't suitable for storing metadata.

KevinHuSh commented 3 months ago

Metadata on chunks is not supported yet. You can contact us at yingfeng.zhang@infiniflow.org.

erikguo commented 3 months ago

Thank you for your quick reply.

Another question: where can we get the sequence and page number of each chunk in the source document? We didn't find them in the code or in the chunk's information.

KevinHuSh commented 2 months ago

It's stored in ES (Elasticsearch), in the position fields.
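
For readers who want to try this, here is a minimal sketch of how the position fields might be read back and used to order chunks. The thread does not name the fields, so `page_num_int`, `position_int`, `top_int`, and `doc_id` are assumptions, as are the helper names; check the index mapping of your own deployment before relying on them. The sketch only builds a search body and post-processes hits, so it runs without a live Elasticsearch cluster.

```python
from typing import Any, Dict, List

# ASSUMPTION: these field names are not confirmed in the thread, which only
# says the data is stored "in the position fields". Verify them against the mapping.
POSITION_FIELDS = ("page_num_int", "position_int", "top_int")


def build_position_query(doc_id: str, size: int = 100) -> Dict[str, Any]:
    """Build a search body that returns only the position fields
    for the chunks of one document (the `doc_id` field is assumed)."""
    return {
        "query": {"term": {"doc_id": doc_id}},
        "_source": list(POSITION_FIELDS),
        "size": size,
    }


def _first(value: Any) -> int:
    """Return the first element of a list-valued field, the value itself
    if it is a scalar, or 0 if the field is missing or empty."""
    if isinstance(value, list):
        return value[0] if value else 0
    return value if value is not None else 0


def extract_positions(hit: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the chunk id and the position fields out of one search hit."""
    source = hit.get("_source", {})
    info = {"chunk_id": hit.get("_id")}
    for field in POSITION_FIELDS:
        info[field] = source.get(field)
    return info


def order_chunks(hits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort chunks by first page number, then by vertical offset on the page."""
    chunks = [extract_positions(h) for h in hits]
    return sorted(
        chunks,
        key=lambda c: (_first(c["page_num_int"]), _first(c["top_int"])),
    )
```

The body returned by `build_position_query` can be passed to whatever Elasticsearch client you already use, and the `hits.hits` list from the response can be handed to `order_chunks` to recover an approximate reading order.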