run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
35.41k stars 4.99k forks

[BUG]: the KG BUG, why kg_schema, not the kg_rel_texts? #15686

Open BunnyRivennn opened 2 weeks ago

BunnyRivennn commented 2 weeks ago

Question

I am querying the knowledge graph with the following code:

kg_query_engine = kg_index.as_query_engine(
    similarity_top_k=15,
    retriever_mode='embedding',
    verbose=True,
    max_keywords_per_query=3,
    include_text=False,
    graph_store_query_depth=5,
    response_mode="simple_summarize",
)
response = kg_query_engine.query("my questions................")

Why is the response always built from kg_schema? The accuracy is poor as a result. I want only kg_rel_texts to be used, but I can't find a way to do that. Is this a bug?

If I change this code in response_synthesizers/base.py:

response_str = self.get_response(
    query_str=query.query_str,
    text_chunks=[
        n.node.get_content(metadata_mode=MetadataMode.LLM) for n in nodes
    ],
    **response_kwargs,
)

to:

response_str = self.get_response(
    query_str=query.query_str,
    text_chunks=[n.node.get_content() for n in nodes],
    **response_kwargs,
)

then the response to the same query is built from kg_rel_texts. So is this a bug?

dosubot[bot] commented 2 weeks ago

Hello, @BunnyRivennn! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together while we wait for a human maintainer.

The kg_query_engine defaults to using kg_schema because the KGTableRetriever class is configured to prioritize schema-based retrieval. To use kg_rel_texts instead, you need to adjust the retriever_mode to KGRetrieverMode.EMBEDDING or KGRetrieverMode.HYBRID. Here's how you can modify the initialization:

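from llama_index.core.indices.knowledge_graph.retrievers import (
    KGRetrieverMode,
    KGTableRetriever,
)

# The your_* names are placeholders for objects from your own setup.
retriever = KGTableRetriever(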
    index=your_index,
    query_keyword_extract_template=your_query_keyword_extract_template,
    max_keywords_per_query=10,
    num_chunks_per_query=10,
    include_text=True,
    retriever_mode=KGRetrieverMode.EMBEDDING,  # Change this to EMBEDDING or HYBRID
    similarity_top_k=2,
    graph_store_query_depth=2,
    use_global_node_triplets=False,
    max_knowledge_sequence=30,
    callback_manager=your_callback_manager,
    object_map=your_object_map,
    verbose=True
)

By setting retriever_mode to KGRetrieverMode.EMBEDDING or KGRetrieverMode.HYBRID, the retriever will use embeddings to find similar triplets, which includes kg_rel_texts, improving accuracy [1][2].

To continue talking to Dosu, mention @dosu.

BunnyRivennn commented 2 weeks ago

How do I get KGRetrieverMode?

dosubot[bot] commented 2 weeks ago

To get the KGRetrieverMode, you can use the KGRetrieverMode enum class defined in the llama_index.core.indices.knowledge_graph.retrievers module. Here is the relevant code snippet:

from llama_index.core.indices.knowledge_graph.retrievers import KGRetrieverMode

# Example usage
mode = KGRetrieverMode.KEYWORD
print(mode.value)  # Output: keyword

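The KGRetrieverMode enum has the following modes:

KEYWORD ("keyword"): uses keywords to find triplets.
EMBEDDING ("embedding"): uses embeddings to find similar triplets.
HYBRID ("hybrid"): combines keywords and embeddings to find relevant triplets.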

These modes map to the KGTableRetriever class in the Knowledge Graph Index [1][2][3].
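A minimal sketch of how a string-valued enum behaves, assuming KGRetrieverMode is declared as a str-based Enum with the values "keyword", "embedding" and "hybrid". The Mode class and resolve_mode helper below are hypothetical stand-ins, not the library code. Under that assumption, the plain string 'embedding' and the enum member select the same mode, so switching from one to the other should not change retrieval behavior.

```python
from enum import Enum


class Mode(str, Enum):
    """Stand-in for KGRetrieverMode (assumed to be a str-valued enum)."""

    KEYWORD = "keyword"
    EMBEDDING = "embedding"
    HYBRID = "hybrid"


def resolve_mode(raw):
    """Accept either a plain string or an enum member and return the member."""
    return Mode(raw)


# The plain string and the enum member resolve to the same member.
same_member = resolve_mode("embedding") is resolve_mode(Mode.EMBEDDING)

# Because Mode subclasses str, a member also compares equal to its value.
equal_to_string = Mode.EMBEDDING == "embedding"
```

Both same_member and equal_to_string evaluate to True in this sketch.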


BunnyRivennn commented 2 weeks ago

I tried this:

from llama_index.core.indices.knowledge_graph.retrievers import KGRetrieverMode

mode_emb = KGRetrieverMode.EMBEDDING

kg_query_engine = kg_index.as_query_engine(
    similarity_top_k=15,
    retriever_mode=mode_emb,
    verbose=True,
    max_keywords_per_query=3,
    include_text=False,
    graph_store_query_depth=5,
    response_mode="simple_summarize",
)
response = kg_query_engine.query("my questions................")

The response is still built from kg_schema, not kg_rel_texts.

Maybe this is a bug?

dosubot[bot] commented 2 weeks ago

It seems that setting retriever_mode to KGRetrieverMode.EMBEDDING still results in kg_schema being used because the _retrieve method always attaches the graph schema to the result when one is available. Specifically, it contains this check:

if self._graph_schema != "":
    rel_node_info["kg_schema"] = {"schema": self._graph_schema}

This ensures that the kg_schema is included in the rel_node_info if it is not empty, regardless of the retriever_mode setting [1][2].
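The effect can be illustrated with a small, library-free sketch. FakeNode and build_rel_node are hypothetical stand-ins, not llama_index code. The sketch assumes only what this thread states: the schema is attached to the node metadata whenever it is non-empty, and the response synthesizer renders nodes with their metadata included (MetadataMode.LLM). The exclusion list is modeled on the node attribute excluded_llm_metadata_keys; whether adding "kg_schema" to it is a suitable workaround is an assumption to verify against your installed version.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node: text plus metadata."""

    text: str
    metadata: dict = field(default_factory=dict)
    excluded_llm_metadata_keys: list = field(default_factory=list)

    def get_content(self, include_metadata: bool = False) -> str:
        """Return the text, optionally prefixed by the non-excluded metadata."""
        if not include_metadata:
            return self.text
        lines = [
            f"{key}: {value}"
            for key, value in self.metadata.items()
            if key not in self.excluded_llm_metadata_keys
        ]
        if not lines:
            return self.text
        return "\n".join(lines) + "\n\n" + self.text


def build_rel_node(rel_texts, graph_schema, mode):
    """Mimic the quoted check: `mode` is never consulted for the schema."""
    rel_node_info = {"kg_rel_texts": rel_texts}
    if graph_schema != "":
        rel_node_info["kg_schema"] = {"schema": graph_schema}
    return FakeNode(
        text="\n".join(rel_texts),
        metadata=rel_node_info,
        # Assumption: the relation texts already form the node text,
        # so they are hidden from the metadata view.
        excluded_llm_metadata_keys=["kg_rel_texts"],
    )


node = build_rel_node(["(Alice, knows, Bob)"], "Person-knows-Person", "embedding")

# Rendering with metadata, as the synthesizer does, pulls kg_schema in.
with_metadata = node.get_content(include_metadata=True)

# Rendering without metadata, as in the edited base.py, yields only the text.
text_only = node.get_content()

# Excluding the key per node hides the schema without editing library code.
node.excluded_llm_metadata_keys.append("kg_schema")
after_exclusion = node.get_content(include_metadata=True)
```

In this sketch, with_metadata contains the schema while text_only and after_exclusion contain only the relation text, which matches the behavior reported after the base.py change.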

This might be a bug or an unintended behavior. I've looped in a maintainer to help further investigate this issue. Hang tight!
