microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

[Issue]: <Entity Extraction Question> #729

Closed Bai1026 closed 3 months ago

Bai1026 commented 4 months ago

Describe the issue

Nothing is actually wrong, but while testing GraphRAG on my own chat log dataset, I found that it could successfully answer the questions I asked (e.g., "give some of Vincent's most used words in the chat log").

When I then checked the graphml and parquet files, I could not find these entities. They are specific words (something like "damn" or "oops", but in Chinese).

How is it possible for local search to answer these questions successfully if the entities were never extracted?

natoverse commented 4 months ago

In the response, the LLM should print citations for the entities and relationships used (e.g., "Entities(id1,id2,...)"). You can look up those entities in the create_final_entities.parquet and match on the human readable id field.

So: can you check your outputs and see if this aligns? If you have an entity id cited that is not in the parquet, this may be a hallucination example.
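A minimal sketch of that check, in case it helps: it pulls the cited entity ids out of a response and reports any that are missing from a list of known ids. The helper names (`cited_entity_ids`, `unknown_entity_ids`) are hypothetical and not part of GraphRAG, and the regex assumes citations look like "Entities (5, 7)" or "Entities(5,7)".

```python
import re

# Assumption: citations appear as "Entities (5, 7, +more)" or "Entities(5,7)".
_ENTITY_CITATION = re.compile(r"Entities\s*\(([^)]*)\)")


def cited_entity_ids(response):
    """Return the set of entity ids (as strings) cited in a response."""
    ids = set()
    for group in _ENTITY_CITATION.findall(response):
        for token in group.split(","):
            token = token.strip()
            if token.isdigit():  # skips non-id markers such as "+more"
                ids.add(token)
    return ids


def unknown_entity_ids(response, known_ids):
    """Return cited ids that do not appear in known_ids (possible hallucinations)."""
    known = {str(i) for i in known_ids}
    return cited_entity_ids(response) - known


# Hypothetical usage with pandas; the path and the column name
# "human_readable_id" are assumptions, so adjust them to your output:
#   import pandas as pd
#   entities = pd.read_parquet("create_final_entities.parquet")
#   print(unknown_entity_ids(answer_text, entities["human_readable_id"]))
```

An empty result means every cited entity exists in the parquet; anything returned is worth inspecting by hand.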

Otherwise: our prompts try to force the LLM to rely only on the supplied entity list, but it may still be drawing on its training data to answer the question.

You also mention that the language is Chinese. We are tracking non-English support in a consolidated issue, where folks have shared various comments on how to get better results: https://github.com/microsoft/graphrag/issues/696

github-actions[bot] commented 4 months ago

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions[bot] commented 3 months ago

This issue has been closed after being marked as stale for five days. Please reopen if needed.