OSU-NLP-Group / HippoRAG

HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personalized PageRank.
https://arxiv.org/abs/2405.14831
MIT License
1.23k stars 100 forks

relationships #38

Closed Reza-Ardestani closed 2 months ago

Reza-Ardestani commented 2 months ago

When retrieving the most relevant documents, you use the knowledge graph (with noun_phrases as nodes and E + Ep edges), the P matrix (noun_phrase occurrences in passages), and the embeddings of the noun_phrases.

The retrieval process extracts entities from the query, maps them to entities in the KG, and runs PPR (Personalized PageRank) to adjust the probabilities of the KG nodes that are most relevant for answering the query. You then multiply this adjusted probability vector by the P matrix to rank the passages, and the top passages are passed to the LLM, which drafts the final answer to the query from them.
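To make sure I am describing the pipeline correctly, here is a minimal sketch of the two steps above (PPR over the KG, then multiplication by the P matrix) on a toy graph. This is my own illustration, not code from the HippoRAG repository: the function name `personalized_pagerank`, the damping value, the handling of dangling nodes, and the orientation of `P` (nodes as rows, passages as columns) are all assumptions.

```python
import numpy as np

def personalized_pagerank(adj, seed, damping=0.5, iters=200, tol=1e-12):
    """Power-iteration PPR on a dense adjacency matrix (hypothetical helper).

    adj  : (n, n) array of edge weights between KG nodes
    seed : (n,) non-negative weights on the nodes matched to query entities
    """
    adj = np.asarray(adj, dtype=float)
    seed = np.asarray(seed, dtype=float)
    seed = seed / seed.sum()                      # restart distribution

    out_deg = adj.sum(axis=1, keepdims=True)
    safe_deg = np.where(out_deg > 0, out_deg, 1.0)
    trans = adj / safe_deg                        # row-stochastic where degree > 0
    dangling = out_deg.ravel() == 0               # nodes with no outgoing edges

    rank = seed.copy()
    for _ in range(iters):
        # Mass sitting on dangling nodes is sent back to the seed nodes (assumption).
        walk = trans.T @ rank + rank[dangling].sum() * seed
        new_rank = (1.0 - damping) * seed + damping * walk
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank

# Toy KG: a path 0 - 1 - 2 - 3 over four noun-phrase nodes.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Toy P matrix: rows are nodes, columns are passages, entries are occurrence counts.
P = np.array([
    [1, 0, 0],   # node 0 occurs in passage 0
    [1, 0, 0],   # node 1 occurs in passage 0
    [0, 1, 0],   # node 2 occurs in passage 1
    [0, 0, 1],   # node 3 occurs in passage 2
])

seed = np.array([1.0, 0.0, 0.0, 0.0])  # the query entity was matched to node 0

rank = personalized_pagerank(adj, seed)   # adjusted node probabilities
scores = rank @ P                         # one score per passage
order = np.argsort(-scores)               # passages ranked best-first
print(order, scores)
```

In this sketch only node identities matter: the edge weights in `adj` carry no information about which relation connects two noun phrases, which is what my question below is about.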

The question is: (1) why didn't you use the relationships in the triples when constructing the KG, so that you could later extract both entities and relationships from the query and search the graph while considering both?

(2) Is there any recent research paper that does (1)?

bernaljg commented 2 months ago

Hi,

Thank you for the insightful questions!

1) Using both relations and entities for searching the graph is definitely desirable. However, we did not obtain empirically strong results with the straightforward approach you described. How to adequately leverage this information is still very much an open question!

2) I haven't seen any research papers tackling this problem yet but I'm sure they'll be coming out soon.