run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: When include_text=True, the retriever result is chunk documents and no graph result? Is the data recalled through similarity between the query and the chunk embeddings? Why are the embedding results not in Nebula? #16281

Open abc-w opened 1 month ago

abc-w commented 1 month ago


Question

from llama_index.core.indices.property_graph import PropertyGraphIndex

index = PropertyGraphIndex.from_documents(
    llama_index_docs,
    property_graph_store=graph_store,
    vector_store=vec_store,
    show_progress=True,
    embed_kg_nodes=True,
    use_async=False,
)
nodes = index.as_retriever(include_text=True).retrieve("What is raptor")

When include_text=True, the retriever result is chunk documents and no graph result? Is the data recalled through similarity between the query and the chunk embeddings? Why are the embedding results not in Nebula?

dosubot[bot] commented 1 month ago

When include_text=True, the retriever returns chunk documents rather than a graph result because it is designed to fetch the actual content of all the nodes that match the query, including nodes found through keyword matching, depth search, and top-k similarity. This is evident from the following code snippet:

# When include_text = True just get the actual content of all the nodes
# (Nodes with actual keyword match, Nodes which are found from the depth search and Nodes found from top_k similarity)
if self._include_text:
    keywords = self._extract_rel_text_keywords(
        rel_texts
    )  # rel_texts will have all the Triplets retrieved with respect to the Query
    nested_node_ids = [
        self._index_struct.search_node_by_keyword(keyword)
        for keyword in keywords
    ]
    node_ids = [_id for ids in nested_node_ids for _id in ids]
    for node_id in node_ids:
        chunk_indices_count[node_id] += 1
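
To see this from the caller's side, here is a minimal sketch (reusing the index built in the question; the attributes used are the standard llama_index NodeWithScore fields) that prints what the retriever actually returns:

retriever = index.as_retriever(include_text=True)
nodes = retriever.retrieve("What is raptor")
for node_with_score in nodes:
    # each result wraps a retrieved text chunk (or triplet string) plus a score
    print(node_with_score.score, node_with_score.node.get_content()[:200])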

The data is recalled through similarity between the query and the chunk embeddings when the retriever mode is set to use embeddings or a hybrid of keywords and embeddings. This is shown in the following part of the code:

if (
    self._retriever_mode != KGRetrieverMode.KEYWORD
    and len(self._index_struct.embedding_dict) > 0
):
    query_embedding = self._embed_model.get_text_embedding(
        query_bundle.query_str
    )
    all_rel_texts = list(self._index_struct.embedding_dict.keys())

    rel_text_embeddings = [
        self._index_struct.embedding_dict[_id] for _id in all_rel_texts
    ]
    similarities, top_rel_texts = get_top_k_embeddings(
        query_embedding,
        rel_text_embeddings,
        similarity_top_k=self.similarity_top_k,
        embedding_ids=all_rel_texts,
    )
    logger.debug(
        f"Found the following rel_texts+query similarites: {similarities!s}"
    )
    logger.debug(f"Found the following top_k rel_texts: {rel_texts!s}")
    rel_texts.extend(top_rel_texts)
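
For reference, get_top_k_embeddings can also be exercised on its own. The sketch below uses toy vectors just to show the inputs and outputs; the import path is assumed from llama_index.core and may differ between versions:

from llama_index.core.indices.query.embedding_utils import get_top_k_embeddings

query_embedding = [1.0, 0.0]
rel_text_embeddings = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
similarities, top_rel_texts = get_top_k_embeddings(
    query_embedding,
    rel_text_embeddings,
    similarity_top_k=2,
    embedding_ids=["('a', 'rel', 'b')", "('c', 'rel', 'd')", "('e', 'rel', 'f')"],
)
# cosine-style similarities and the ids of the two closest triplet texts
print(similarities, top_rel_texts)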

The embedding results are not in Nebula because this retriever has no integration with Nebula for storing or retrieving embeddings; instead, it manages embeddings in an internal embedding_dict on the index_struct [1][2][3].
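
For the PropertyGraphIndex in the original question, a hedged way to check where the kg-node embeddings end up is sketched below. It assumes a recent llama-index release where property graph stores expose a supports_vector_queries flag; when that flag is False, the index stores the kg-node embeddings in the vector_store you pass in rather than in the graph database itself:

# `graph_store` and `vec_store` are the objects from the question
print(graph_store.supports_vector_queries)  # False => embeddings cannot live in the graph store
# with embed_kg_nodes=True and supports_vector_queries=False, PropertyGraphIndex
# writes the kg-node embeddings into `vec_store`, not into Nebula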

To continue talking to Dosu, mention @dosu.

abc-w commented 1 month ago

query_engine = index.as_query_engine(include_text=False)
response = query_engine.query("What is raptor")
print(str(response))

How does this code retrieve documents for the LLM and produce an answer?

dosubot[bot] commented 1 month ago

With query_engine = index.as_query_engine(include_text=False), the query engine queries the property graph and sends only the structured data (triplets) to the LLM, without including the original text from which the triplets were extracted. The response is then generated from these triplets.

Here is the relevant code snippet:

query_engine = index.as_query_engine(
    include_text=False, response_mode="tree_summarize"
)
response = query_engine.query(
    "What is raptor",
)
print(str(response))

This setup ensures that the LLM processes the structured data (triplets) to generate a summarized response. The querying process involves three stages: retrieval, postprocessing, and response synthesis. The property graph can be stored in Amazon Neptune, which supports both Neptune Database and Neptune Analytics for scalable and efficient graph data management [1][2][3].
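
If it helps to see those stages separately, below is a hedged sketch that runs retrieval and response synthesis by hand using public llama_index.core helpers (names assumed from a recent release; the node-postprocessing stage is skipped):

from llama_index.core import get_response_synthesizer

retriever = index.as_retriever(include_text=False)
nodes = retriever.retrieve("What is raptor")  # stage 1: retrieval
synthesizer = get_response_synthesizer(response_mode="tree_summarize")
response = synthesizer.synthesize("What is raptor", nodes)  # stage 3: response synthesis
print(str(response))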

To continue talking to Dosu, mention @dosu.

abc-w commented 1 month ago

How do I combine the structured data (triplets) with the original text from which the triplets were extracted, and, after answering, print out the recalled content?

dosubot[bot] commented 1 month ago

To combine structured data (triplets) with the original text from which the triplets were extracted and print out the recalled content after answering, follow these steps:

  1. Install necessary packages:

    %pip install llama-index
    %pip install llama-index-llms-openai
    %pip install llama-index-graph-stores-kuzu
    %pip install pyvis
  2. Set up your OpenAI API key:

    import os
    os.environ["OPENAI_API_KEY"] = "API_KEY_HERE"
  3. Prepare for Kùzu:

    import shutil
    shutil.rmtree("./test1", ignore_errors=True)
    shutil.rmtree("./test2", ignore_errors=True)
    shutil.rmtree("./test3", ignore_errors=True)
  4. Initialize the Kùzu database:

    import kuzu
    db = kuzu.Database("test1")
  5. Initialize the KuzuGraphStore:

    from llama_index.graph_stores.kuzu import KuzuGraphStore
    graph_store = KuzuGraphStore(db)
  6. Load documents and define the LLM:

    from llama_index.core import SimpleDirectoryReader, KnowledgeGraphIndex
    from llama_index.llms.openai import OpenAI
    from llama_index.core import Settings
    from IPython.display import Markdown, display
    
    documents = SimpleDirectoryReader("../../../examples/data/paul_graham").load_data()
    
    llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
    Settings.llm = llm
    Settings.chunk_size = 512
  7. Build the Knowledge Graph:

    from llama_index.core import StorageContext
    storage_context = StorageContext.from_defaults(graph_store=graph_store)
    
    index = KnowledgeGraphIndex.from_documents(
        documents,
        max_triplets_per_chunk=2,
        storage_context=storage_context,
    )
  8. Query the Knowledge Graph and include the original text:

    query_engine = index.as_query_engine(
        include_text=True, response_mode="tree_summarize"
    )
    response = query_engine.query("Tell me more about Interleaf")
    display(Markdown(f"<b>{response}</b>"))

This setup will allow you to query the knowledge graph, include the original text from which the triplets were extracted, and, using the response's source nodes as sketched below, print out the recalled content after answering [1][2][3].
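
To print the recalled content after answering, one option is to iterate over the response's source nodes (a hedged sketch; source_nodes is the standard attribute on llama_index response objects):

# after running step 8 above
for source_node in response.source_nodes:
    # each source node is a NodeWithScore: the recalled text plus its score
    print(source_node.score, source_node.node.get_content()[:200])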

To continue talking to Dosu, mention @dosu.

abc-w commented 1 month ago

What is the meaning of max_triplets_per_chunk=2?