run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: KnowledgeGraphQueryEngine fails with AttributeError #11034

Open eercanayar opened 4 months ago

eercanayar commented 4 months ago

Bug Description

Both docs/examples/query_engine/knowledge_graph_query_engine.ipynb and docs/examples/query_engine/knowledge_graph_rag_query_engine.ipynb are failing due to the following error: AttributeError: 'NoneType' object has no attribute 'kwargs' on KnowledgeGraphQueryEngine().query().

Version

llama-index==0.10.7

Steps to Reproduce

First occurrence:

  1. Start running the notebook: docs/examples/query_engine/knowledge_graph_query_engine.ipynb
  2. You will experience AttributeError: 'NoneType' object has no attribute 'kwargs' when running KnowledgeGraphQueryEngine().query() with llama-index==0.10.7.

Second occurrence:

  1. Start running the notebook: docs/examples/query_engine/knowledge_graph_rag_query_engine.ipynb
  2. You will experience WARNING:llama_index.core.indices.knowledge_graph.retrievers:Error in retrieving from nl2graphquery: 'NoneType' object has no attribute 'kwargs' when running query_engine_with_nl2graphquery.query() with llama-index==0.10.7.
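
For reference, a minimal sketch of the first failing call, assuming the NebulaGraph setup used in the notebook (the space name, edge types, and tags are taken from that example; graph store credentials are elided):

from llama_index.core import StorageContext
from llama_index.core.query_engine import KnowledgeGraphQueryEngine
from llama_index.graph_stores.nebula import NebulaGraphStore

# Connect to an existing NebulaGraph space (values follow the notebook).
graph_store = NebulaGraphStore(
    space_name="llamaindex",
    edge_types=["relationship"],
    rel_prop_names=["relationship"],
    tags=["entity"],
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    verbose=True,
)

# Fails with: AttributeError: 'NoneType' object has no attribute 'kwargs'
response = query_engine.query("Tell me about Peter Quill?")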

Relevant Logs/Tracebacks

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[28], line 1
----> 1 response = query_engine.query(
      2     "Tell me about Peter Quill?",
      3 )
      4 display(Markdown(f"<b>{response}</b>"))

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py:40, in BaseQueryEngine.query(self, str_or_query_bundle)
     38 if isinstance(str_or_query_bundle, str):
     39     str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 40 return self._query(str_or_query_bundle)

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/query_engine/knowledge_graph_query_engine.py:199, in KnowledgeGraphQueryEngine._query(self, query_bundle)
    195 """Query the graph store."""
    196 with self.callback_manager.event(
    197     CBEventType.QUERY, payload={EventPayload.QUERY_STR: query_bundle.query_str}
    198 ) as query_event:
--> 199     nodes: List[NodeWithScore] = self._retrieve(query_bundle)
    201     response = self._response_synthesizer.synthesize(
    202         query=query_bundle,
    203         nodes=nodes,
    204     )
    206     if self._verbose:

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/query_engine/knowledge_graph_query_engine.py:154, in KnowledgeGraphQueryEngine._retrieve(self, query_bundle)
    152 def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
    153     """Get nodes for response."""
--> 154     graph_store_query = self.generate_query(query_bundle.query_str)
    155     if self._verbose:
    156         print_text(f"Graph Store Query:\n{graph_store_query}\n", color="yellow")

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/query_engine/knowledge_graph_query_engine.py:132, in KnowledgeGraphQueryEngine.generate_query(self, query_str)
    129 """Generate a Graph Store Query from a query bundle."""
    130 # Get the query engine query string
--> 132 graph_store_query: str = self._llm.predict(
    133     self._graph_query_synthesis_prompt,
    134     query_str=query_str,
    135     schema=self._graph_schema,
    136 )
    138 return graph_store_query

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/llms/llm.py:249, in LLM.predict(self, prompt, **prompt_args)
    243 def predict(
    244     self,
    245     prompt: BasePromptTemplate,
    246     **prompt_args: Any,
    247 ) -> str:
    248     """Predict."""
--> 249     self._log_template_data(prompt, **prompt_args)
    251     if self.metadata.is_chat_model:
    252         messages = self._get_messages(prompt, **prompt_args)

File ~/Experimental/jupyter-ws/.venv/lib/python3.11/site-packages/llama_index/core/llms/llm.py:170, in LLM._log_template_data(self, prompt, **prompt_args)
    165 def _log_template_data(
    166     self, prompt: BasePromptTemplate, **prompt_args: Any
    167 ) -> None:
    168     template_vars = {
    169         k: v
--> 170         for k, v in ChainMap(prompt.kwargs, prompt_args).items()
    171         if k in prompt.template_vars
    172     }
    173     with self.callback_manager.event(
    174         CBEventType.TEMPLATING,
    175         payload={
   (...)
    180         },
    181     ):
    182         pass

AttributeError: 'NoneType' object has no attribute 'kwargs'
dosubot[bot] commented 4 months ago

🤖

Hey @eercanayar, great to see you back! Hope you've been well.

Thank you for reporting this issue. It seems like the error is originating from the LLM.predict() method, specifically when it tries to log the template data. The error message AttributeError: 'NoneType' object has no attribute 'kwargs' suggests that the prompt object passed to the predict() method is None.

The predict() method is defined on the LLM class (see llms/llm.py in the traceback) and is used to predict the answer to a query. It takes a prompt of type BasePromptTemplate along with keyword prompt arguments. The method first logs the template data and then proceeds to generate a response based on whether the LLM is a chat model or not.

The prompt object is expected to have a kwargs attribute, which is used in the _log_template_data() method. If the prompt object is None, it would not have the kwargs attribute, leading to the AttributeError you're seeing.

The KnowledgeGraphQueryEngine._query() method, which is where the call chain into LLM.predict() begins, is used to query the graph store. It retrieves nodes for the response and synthesizes a response from the retrieved nodes. The prompt object passed to predict() comes from the generate_query() method of the KnowledgeGraphQueryEngine class.

Given this, there are a few things that could be causing this issue:

  1. The prompt object is not being correctly generated in the generate_query() method.
  2. The prompt object is being overwritten or set to None somewhere before the predict() method is called.
  3. There is a bug in the LLM.predict() method that is causing the prompt object to be None.

To further investigate this issue, you could add some debug logs in the generate_query() method and before the predict() method is called to check the value of the prompt object. This could help identify where the prompt object is being set to None.
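
For example, a quick sanity check before calling query() (this reads a private attribute, named per the engine source, purely for debugging):

# Debug sketch: inspect the prompt the engine will hand to LLM.predict().
# When this bug is present the attribute is None, which is exactly what
# later breaks the prompt.kwargs access inside _log_template_data().
print(query_engine._graph_query_synthesis_prompt)  # -> None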

I hope this helps in identifying the issue. If you have any further questions or need more assistance, feel free to ask.

logan-markewich commented 4 months ago

this error implies that the prompt being sent to the LLM is None --- which feels kind of impossible? I will try running the notebook and report back

logan-markewich commented 4 months ago

Ah, this is an issue specifically with text -> cypher (or text -> kg query, I should say)

AnujLahoty commented 4 months ago

@logan-markewich Is this issue currently being looked into?

logan-markewich commented 4 months ago

@AnujLahoty I started looking into it, and the fix is a tad complex. I don't have the bandwidth to prioritize it right now.

Using llama_index.legacy imports should be a workaround, though.
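
A minimal sketch of that workaround (the legacy engine still selects a default NL2Cypher prompt per graph store type; storage_context is assumed to exist from earlier setup):

# Workaround sketch: swap the core import for the legacy one.
from llama_index.legacy.query_engine import KnowledgeGraphQueryEngine

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    verbose=True,
)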

AnujLahoty commented 4 months ago

@logan-markewich I tried it with the code below. Now it says no OpenAI key found, but I have configured my local LLM and it works fine with other strategies.

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

from llama_index.core.settings import Settings
from llama_index.embeddings.langchain import LangchainEmbedding
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.legacy.query_engine import KnowledgeGraphQueryEngine

llm = HuggingFaceLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.2",
    tokenizer_name="mistralai/Mistral-7B-Instruct-v0.2",
    device_map="auto",
    generate_kwargs={"temperature": 0.001, "do_sample": True},
    max_new_tokens=1024,
)

embed_model = LangchainEmbedding(
  HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
)

Settings.llm = llm
Settings.embed_model = embed_model
cypher_query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    llm=llm,
    verbose=True,
)

logan-markewich commented 4 months ago

Looks right to me. A full traceback might help

viperby commented 4 months ago

When I tried to use data from Neo4j, I encountered a similar problem when applying the KnowledgeGraphQueryEngine method. The code looks like this:

from llama_index.core import StorageContext
from llama_index.core.query_engine import KnowledgeGraphQueryEngine
from llama_index.graph_stores.neo4j import Neo4jGraphStore
from llama_index.llms.openai import OpenAI

llm_model = OpenAI(temperature=0, model="gpt-3.5-turbo")
username = "neo4j"
password = "my password"
url = "bolt://ip:host"
database = "neo4j"

graph_store = Neo4jGraphStore(
    username=username,
    password=password,
    url=url,
    database=database,
)

storage_context = StorageContext.from_defaults(graph_store=graph_store)
query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    llm=llm_model,
    verbose=True,
)
response = query_engine.query("some of my questions")

The error prompted by the command line is as follows:

 in _log_template_data
    for k, v in ChainMap(prompt.kwargs, prompt_args).items()
AttributeError: 'NoneType' object has no attribute 'kwargs'
Exception ignored in: <function Driver.__del__ at 0x0000022994D98CA0>

LlamaIndex = v0.10.15

AnujLahoty commented 4 months ago

@logan-markewich Here is the full stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/llms/utils.py:29, in resolve_llm(llm)
     28     llm = OpenAI()
---> 29     validate_openai_api_key(llm.api_key)
     30 except ValueError as e:

File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/llms/openai_utils.py:383, in validate_openai_api_key(api_key)
    382 if not openai_api_key:
--> 383     raise ValueError(MISSING_API_KEY_ERROR_MESSAGE)

ValueError: No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[27], line 1
----> 1 cypher_query_engine = KnowledgeGraphQueryEngine(
      2     storage_context=storage_context,
      3     llm=llm,
      4     verbose=True,
      5 )

File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/query_engine/knowledge_graph_query_engine.py:144, in KnowledgeGraphQueryEngine.__init__(self, service_context, storage_context, graph_query_synthesis_prompt, graph_response_answer_prompt, refresh_schema, verbose, response_synthesizer, **kwargs)
    141 self._storage_context = storage_context
    142 self.graph_store = storage_context.graph_store
--> 144 self._service_context = service_context or ServiceContext.from_defaults()
    146 # Get Graph Store Type
    147 self._graph_store_type = GRAPH_STORE_CLASS_TO_GRAPH_STORE_TYPE[
    148     self.graph_store.__class__
    149 ]

File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/service_context.py:178, in ServiceContext.from_defaults(cls, llm_predictor, llm, prompt_helper, embed_model, node_parser, text_splitter, transformations, llama_logger, callback_manager, system_prompt, query_wrapper_prompt, pydantic_program_mode, chunk_size, chunk_overlap, context_window, num_output, chunk_size_limit)
    176 if llm_predictor is not None:
    177     print("LLMPredictor is deprecated, please use LLM instead.")
--> 178 llm_predictor = llm_predictor or LLMPredictor(
    179     llm=llm, pydantic_program_mode=pydantic_program_mode
    180 )
    181 if isinstance(llm_predictor, LLMPredictor):
    182     llm_predictor.llm.callback_manager = callback_manager

File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/llm_predictor/base.py:109, in LLMPredictor.__init__(self, llm, callback_manager, system_prompt, query_wrapper_prompt, pydantic_program_mode)
    100 def __init__(
    101     self,
    102     llm: Optional[LLMType] = "default",
   (...)
    106     pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
    107 ) -> None:
    108     """Initialize params."""
--> 109     self._llm = resolve_llm(llm)
    111     if callback_manager:
    112         self._llm.callback_manager = callback_manager

File ~/miniconda3/envs/tab_data/lib/python3.11/site-packages/llama_index/legacy/llms/utils.py:31, in resolve_llm(llm)
     29     validate_openai_api_key(llm.api_key)
     30 except ValueError as e:
---> 31     raise ValueError(
     32         "\n**\n"
     33         "Could not load OpenAI model. "
     34         "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
     35         "Original error:\n"
     36         f"{e!s}"
     37         "\nTo disable the LLM entirely, set llm=None."
     38         "\n**"
     39     )
     41 if isinstance(llm, str):
     42     splits = llm.split(":", 1)

ValueError:
**
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys
To disable the LLM entirely, set llm=None.
**


logan-markewich commented 4 months ago

Oh, I think one issue is that you can't mix legacy imports with non-legacy imports. Likely that's causing the LLM to not resolve correctly.

Should be from llama_index.core.query_engine import KnowledgeGraphQueryEngine I think

IliyanGochev commented 4 months ago

If you can't work with the legacy imports, you can try defining your own graph_query_synthesis_prompt, based on the legacy prompts defined in the legacy knowledge graph package (llama_index\legacy\query_engine\knowledge_graph_query_engine.py): DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT or DEFAULT_NEO4J_NL2CYPHER_PROMPT.

If you're not using OpenAI, you'd definitely need to override it to suit your LLM. I'm trying to make it work with Ollama & Mistral right now; a rough sketch of the override pattern follows.
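
A sketch under those assumptions, targeting Neo4j-style Cypher (the template body is illustrative rather than the library's default, and storage_context is assumed from earlier setup):

from llama_index.core import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
from llama_index.core.query_engine import KnowledgeGraphQueryEngine

# Illustrative NL2Cypher prompt, modeled loosely on the legacy defaults.
MY_NL2CYPHER_PROMPT = PromptTemplate(
    "Generate a Cypher query from natural language.\n"
    "Use only the relationship types and properties in this schema:\n"
    "{schema}\n"
    "Question: {query_str}\n"
    "Cypher query:",
    prompt_type=PromptType.TEXT_TO_GRAPH_QUERY,
)

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    graph_query_synthesis_prompt=MY_NL2CYPHER_PROMPT,
    verbose=True,
)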

NeroAziz commented 3 months ago

Hi all, we have the same issue! Have we got anywhere with this?

acriptis commented 3 months ago

I got stuck on the same error while reproducing this doc page: https://docs.llamaindex.ai/en/stable/examples/query_engine/knowledge_graph_query_engine.html

hebertrfreitas commented 3 months ago

I also face the same error, following the same examples. Is there an old version of llama-index where this problem doesn't happen?

hebertrfreitas commented 3 months ago

I dug through the code and found that the problem is that when we create a KnowledgeGraphQueryEngine, the _graph_query_synthesis_prompt parameter remains None, and it is used here (as pointed out by @IliyanGochev). In the legacy version of KnowledgeGraphQueryEngine we had a specific prompt for each type of graph store (Nebula, Neo4j, etc.). I suspect this approach changed over time, because now there is no way to identify the type of graph store in this class. I also discovered that if we create and populate the index and use the as_query_engine method, we are able to search, something like:

index = KnowledgeGraphIndex(
    nodes=nodes,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)
query_engine = index.as_query_engine(include_text=False, response_mode="tree_summarize")
query_engine.query("Your query")

Of course, the above approach doesn't work for use cases where you first create the index and then query the graph database at a different moment (like mine). I will keep digging for a solution; one option for that split is sketched below.
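
One way to cover that build-now, query-later split, assuming it is acceptable to persist the index metadata locally at build time (a sketch, not verified against this bug):

from llama_index.core import StorageContext, load_index_from_storage

# Build time: persist the index structures alongside the external graph store.
index.storage_context.persist(persist_dir="./storage")

# Query time (separate session): re-attach the same graph store and reload.
storage_context = StorageContext.from_defaults(
    persist_dir="./storage", graph_store=graph_store
)
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine(
    include_text=False, response_mode="tree_summarize"
)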

drm-addoptio commented 3 months ago

I have the same problem.

kuguadawang12138 commented 3 months ago

@hebertrfreitas I think you're right, and I'm having the same issue. Is there a solution?

pulkitmehtaworkmetacube commented 3 months ago

I am facing the same issue and read the whole conversation. I hardcoded DEFAULT_NEO4J_NL2CYPHER_PROMPT as suggested by @IliyanGochev, and it generates the graph query, but after that it gets stuck and there is no response even after 5 minutes. Any idea if we need to make further changes in the code?

adriacb commented 2 months ago

I had the same issue. Notice that DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT_TMPL is missing in the latest version of llama-index (0.10.29), so what I did is the following:

from llama_index.core import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType

DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT_TMPL = """
Generate NebulaGraph query from natural language.
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
---
{schema}
---
Note: NebulaGraph speaks a dialect of Cypher, comparing to standard Cypher:

1. it uses double equals sign for comparison: `==` rather than `=`
2. it needs explicit label specification when referring to node properties, i.e.
v is a variable of a node, and we know its label is Foo, v.`foo`.name is correct
while v.name is not.

For example, see this diff between standard and NebulaGraph Cypher dialect:
```diff
< MATCH (p:person)-[:directed]->(m:movie) WHERE m.name = 'The Godfather'
< RETURN p.name;
---
> MATCH (p:`person`)-[:directed]->(m:`movie`) WHERE m.`movie`.`name` == 'The Godfather'
> RETURN p.`person`.`name`;
```

Question: {query_str}

NebulaGraph Cypher dialect query:
"""
DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT = PromptTemplate(
    DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT_TMPL,
    prompt_type=PromptType.TEXT_TO_GRAPH_QUERY,
)

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    verbose=True,
    graph_query_synthesis_prompt=DEFAULT_NEBULAGRAPH_NL2CYPHER_PROMPT,
)

So basically I had to define the prompt manually. I grabbed it from here: https://github.com/run-llama/llama_index/blob/71919f9dfa09e9628af8b3a59d497ad02a7a82f8/llama_index/query_engine/knowledge_graph_query_engine.py#L24

mohamedhazem24 commented 2 months ago

I recently faced the same problem. After some searching, I found that I can run it like this:

from llama_index.core import KnowledgeGraphIndex

rebel_kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    kg_triplet_extract_fn=extract_triplets,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    service_context=service_context,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)
qeng = rebel_kg_index.as_query_engine(
    include_text=True,
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=10,
    verbose=True,
)

dew-maris-hugeman commented 2 months ago

My llama-index package is at 0.10.31, and passing the graph_query_synthesis_prompt argument worked for me.

Here is my code snippet.

# NOTE: the engine import below is assumed; the original snippet omitted it.
from llama_index.core.query_engine import KnowledgeGraphQueryEngine
from llama_index.legacy.query_engine.knowledge_graph_query_engine import DEFAULT_NEO4J_NL2CYPHER_PROMPT

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    verbose=True,
    refresh_schema=True,
    graph_query_synthesis_prompt=DEFAULT_NEO4J_NL2CYPHER_PROMPT
)

cqx931 commented 1 month ago

I tried to pass the graph_query_synthesis_prompt argument with DEFAULT_NEO4J_NL2CYPHER_PROMPT but got the following error:

Can't instantiate abstract class BasePromptTemplate with abstract methods format, format_messages, get_template, partial_format (type=type_error)
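
That error is likely pydantic validation rejecting the prompt object: DEFAULT_NEO4J_NL2CYPHER_PROMPT is a legacy-package prompt class, which the core engine does not recognize as an instance of its own BasePromptTemplate. A possible fix, assuming the legacy module also exports the raw template string, is to rewrap it with the core PromptTemplate:

from llama_index.core import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
from llama_index.legacy.query_engine.knowledge_graph_query_engine import (
    DEFAULT_NEO4J_NL2CYPHER_PROMPT_TMPL,
)

# Rebuild the legacy Neo4j NL2Cypher prompt as a *core* PromptTemplate so the
# core KnowledgeGraphQueryEngine accepts it as graph_query_synthesis_prompt.
graph_query_synthesis_prompt = PromptTemplate(
    DEFAULT_NEO4J_NL2CYPHER_PROMPT_TMPL,
    prompt_type=PromptType.TEXT_TO_GRAPH_QUERY,
)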