microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
19.15k stars, 1.89k forks

[Bug]: Summary of the empty description list #1097

Open 0XUPT0thief opened 2 months ago

0XUPT0thief commented 2 months ago

Do you need to file an issue?

Describe the bug

In graphrag/index/graph/extractors/summarize/description_summary_extractor.py, lines 68~73, the code appears to summarize the description list even when it is empty.

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

natoverse commented 2 months ago

I think what you've identified is that line 70 should be an elif. Otherwise, for a length-0 description list, the empty-string result would be reassigned to the output of the summarization step.

The logic should be:

  1. If there is no description, the result is an empty string. This means that downstream summarization, such as the community reports, will not include information about the entity.
  2. If the list of descriptions only contains one item, we just use that as the final description.
  3. If the list contains more than one description, we ask the LLM to summarize them so we have a single final description.
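
For illustration, here is a minimal sketch of the three cases above. It is not the actual code from description_summary_extractor.py: the names resolve_description, summarize_with_llm, and fake_llm are hypothetical, and the LLM call is replaced by a plain callable so the snippet runs on its own.

```python
from typing import Callable, Sequence


def resolve_description(
    descriptions: Sequence[str],
    summarize_with_llm: Callable[[Sequence[str]], str],
) -> str:
    """Hypothetical helper showing the intended if / elif / else chain."""
    if len(descriptions) == 0:
        # Case 1: no description, so the result is an empty string
        # and the summarizer is never called.
        return ""
    elif len(descriptions) == 1:
        # Case 2: a single description is used as the final description.
        return descriptions[0]
    else:
        # Case 3: several descriptions are merged into one by the summarizer.
        return summarize_with_llm(descriptions)


def fake_llm(descriptions: Sequence[str]) -> str:
    """Stand-in for the real LLM call (assumption: it just joins the inputs)."""
    return " / ".join(descriptions)


if __name__ == "__main__":
    print(repr(resolve_description([], fake_llm)))
    print(repr(resolve_description(["a person"], fake_llm)))
    print(repr(resolve_description(["a person", "an engineer"], fake_llm)))
```

The point of the elif is that the empty-list case stops at the first branch. With two independent if statements, an empty list would fail the length-1 check and fall through to the summarization branch.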