microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
19.24k stars 1.9k forks

[Feature Request]: Adding an optional layer in order to de-duplicate entities whenever possible. #1223

Closed ruplisso closed 1 month ago

ruplisso commented 1 month ago

Is your feature request related to a problem? Please describe.

The current entity extraction method produces many duplicate entities on some documentation.

Describe the solution you'd like

An additional, optional consolidation step could be useful to merge duplicate entities when the confidence that they refer to the same thing is high enough.
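
To illustrate the kind of consolidation step meant here, below is a minimal sketch in plain Python. It is not GraphRAG code and uses no GraphRAG API: the function names (`normalize`, `similarity`, `consolidate`), the `threshold` parameter, and the use of `difflib.SequenceMatcher` as the confidence score are all assumptions made for the example. A real implementation would more likely compare embeddings or ask an LLM, and would also need to merge descriptions and relationships.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for any confidence score."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def consolidate(entities: list[str], threshold: float = 0.9) -> dict[str, list[str]]:
    """Greedily group entity names (hypothetical helper, not a GraphRAG function).

    The first name seen in a group becomes its canonical form. A name joins
    the first existing group whose canonical form scores >= threshold,
    otherwise it starts a new group.
    """
    groups: dict[str, list[str]] = {}
    for name in entities:
        for canonical in groups:
            if similarity(name, canonical) >= threshold:
                groups[canonical].append(name)
                break
        else:
            # No existing group was similar enough, so keep the entity as-is.
            groups[name] = [name]
    return groups


if __name__ == "__main__":
    names = ["Microsoft", "microsoft ", "Microsoft Corp", "GraphRAG"]
    for canonical, members in consolidate(names, threshold=0.9).items():
        print(canonical, "<-", members)
```

With a high threshold only near-identical names are merged, which is the "when confidence is high enough" behaviour described above; lowering the threshold merges more aggressively at the risk of collapsing distinct entities.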

Additional context

No response

natoverse commented 1 month ago

Duplicate of #113