langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Raises ValueError: Missing some input keys: {'query'} every time I invoke the 'GraphCypherQAChain.from_llm' chain with 'query' present in the input keys #25476

Closed AkashBais closed 3 weeks ago

AkashBais commented 4 weeks ago


Example Code

from langchain_core.prompts import PromptTemplate
from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain

CYPHER_GENERATION_TEMPLATE = '''Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
Examples: Here are a few examples of generated Cypher statements for particular questions:

Which sections talk about medication?
MATCH (n2:ner_entity)-[new_present_in:PRESENT_IN]->(n1:Chunk)
WHERE n2.name = 'medication'
RETURN DISTINCT n1.section as section_list

The question is:
{query} '''

CYPHER_GENERATION_PROMPT = PromptTemplate(
    input_variables=['schema', 'query'],
    template=CYPHER_GENERATION_TEMPLATE
)

QA_CHAIN = GraphCypherQAChain.from_llm(
    graph=return_graph(), # This is an internal method that returns a graph object
    llm=READER_LLM,
    verbose=True,
    cypher_prompt=CYPHER_GENERATION_PROMPT,
)
QA_CHAIN.invoke(
    input = {'query': query},
    return_only_outputs=True,
)

query = 'Sample question?'

Error Message and Stack Trace (if applicable)

Entering new GraphCypherQAChain chain...

ValueError                                Traceback (most recent call last)
in <cell line: 1>()
----> 1 responce = rag.query(
      2     query = 'Which sections talk about medication?'
      3 )
      4 # self.input_key

10 frames
/content/RAG/KG_for_RAG/src/execute_rag.py in query(self, query)
    336 
    337     if (self.query_type is not None) and (self.query_type.lower() in ['self_genrate','prompt']):
--> 338       answer = self.QA_CHAIN.invoke(
    339           input = {'query': query},
    340           return_only_outputs=True,

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    164         except BaseException as e:
    165             run_manager.on_chain_error(e)
--> 166             raise e
    167         run_manager.on_chain_end(outputs)
    168 

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    154             self._validate_inputs(inputs)
    155             outputs = (
--> 156                 self._call(inputs, run_manager=run_manager)
    157                 if new_arg_supported
    158                 else self._call(inputs)

/usr/local/lib/python3.10/dist-packages/langchain_community/chains/graph_qa/cypher.py in _call(self, inputs, run_manager)
    251         intermediate_steps: List = []
    252 
--> 253         generated_cypher = self.cypher_generation_chain.run(
    254             {"question": question, "schema": self.graph_schema}, callbacks=callbacks
    255         )

/usr/local/lib/python3.10/dist-packages/langchain_core/_api/deprecation.py in warning_emitting_wrapper(*args, **kwargs)
    168                 warned = True
    169                 emit_warning()
--> 170             return wrapped(*args, **kwargs)
    171 
    172         async def awarning_emitting_wrapper(*args: Any, **kwargs: Any) -> Any:

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in run(self, callbacks, tags, metadata, *args, **kwargs)
    598             if len(args) != 1:
    599                 raise ValueError("run supports only one positional argument.")
--> 600             return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    601                 _output_key
    602             ]

/usr/local/lib/python3.10/dist-packages/langchain_core/_api/deprecation.py in warning_emitting_wrapper(*args, **kwargs)
    168                 warned = True
    169                 emit_warning()
--> 170             return wrapped(*args, **kwargs)
    171 
    172         async def awarning_emitting_wrapper(*args: Any, **kwargs: Any) -> Any:

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in __call__(self, inputs, return_only_outputs, callbacks, tags, metadata, run_name, include_run_info)
    381         }
    382 
--> 383         return self.invoke(
    384             inputs,
    385             cast(RunnableConfig, {k: v for k, v in config.items() if v is not None}),

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    164         except BaseException as e:
    165             run_manager.on_chain_error(e)
--> 166             raise e
    167         run_manager.on_chain_end(outputs)
    168 

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    152         )
    153         try:
--> 154             self._validate_inputs(inputs)
    155             outputs = (
    156                 self._call(inputs, run_manager=run_manager)

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in _validate_inputs(self, inputs)
    282         missing_keys = set(self.input_keys).difference(inputs)
    283         if missing_keys:
--> 284             raise ValueError(f"Missing some input keys: {missing_keys}")
    285 
    286     def _validate_outputs(self, outputs: Dict[str, Any]) -> None:

ValueError: Missing some input keys: {'query'}

Description

As shown, I am passing my input question to GraphCypherQAChain under the dictionary key 'query'. It still keeps raising the error "Missing some input keys: {'query'}".

System Info

System Information

OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
Python Version: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]

Package Information

langchain_core: 0.2.32
langchain: 0.2.11
langchain_community: 0.2.0
langsmith: 0.1.99
langchain_google_genai: 1.0.8
langchain_google_vertexai: 1.0.8
langchain_openai: 0.1.7
langchain_text_splitters: 0.2.2

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.2 aiosqlite: Installed. No version info available. aleph-alpha-client: Installed. No version info available. anthropic: Installed. No version info available. anthropic[vertexai]: Installed. No version info available. arxiv: Installed. No version info available. assemblyai: Installed. No version info available. async-timeout: 4.0.3 atlassian-python-api: Installed. No version info available. azure-ai-documentintelligence: Installed. No version info available. azure-identity: Installed. No version info available. azure-search-documents: Installed. No version info available. beautifulsoup4: 4.12.3 bibtexparser: Installed. No version info available. cassio: Installed. No version info available. chardet: 5.2.0 cloudpickle: 2.2.1 cohere: Installed. No version info available. databricks-vectorsearch: Installed. No version info available. dataclasses-json: 0.6.7 datasets: 2.20.0 dgml-utils: Installed. No version info available. elasticsearch: Installed. No version info available. esprima: Installed. No version info available. faiss-cpu: Installed. No version info available. feedparser: Installed. No version info available. fireworks-ai: Installed. No version info available. friendli-client: Installed. No version info available. geopandas: 0.14.4 gitpython: Installed. No version info available. google-cloud-aiplatform: 1.59.0 google-cloud-documentai: Installed. No version info available. google-cloud-storage: 2.18.2 google-generativeai: 0.7.2 gql: Installed. No version info available. gradientai: Installed. No version info available. hdbcli: Installed. No version info available. hologres-vector: Installed. No version info available. html2text: Installed. No version info available. httpx: 0.27.0 httpx-sse: Installed. No version info available. javelin-sdk: Installed. No version info available. jinja2: 3.1.4 jq: Installed. No version info available. jsonpatch: 1.33 jsonschema: 4.23.0 lxml: 4.9.4 markdownify: Installed. No version info available. motor: Installed. No version info available. msal: Installed. No version info available. mwparserfromhell: Installed. No version info available. mwxml: Installed. No version info available. newspaper3k: Installed. No version info available. numexpr: 2.10.1 numpy: 1.26.4 nvidia-riva-client: Installed. No version info available. oci: Installed. No version info available. openai: 1.40.8 openapi-pydantic: Installed. No version info available. oracle-ads: Installed. No version info available. oracledb: Installed. No version info available. orjson: 3.10.7 packaging: 24.1 pandas: 2.1.4 pdfminer-six: 20240706 pgvector: Installed. No version info available. pillow: 9.4.0 praw: Installed. No version info available. premai: Installed. No version info available. psychicapi: Installed. No version info available. py-trello: Installed. No version info available. pydantic: 2.8.2 pyjwt: 2.9.0 pymupdf: Installed. No version info available. pypdf: 4.2.0 pypdfium2: Installed. No version info available. pyspark: Installed. No version info available. PyYAML: 6.0.2 rank-bm25: Installed. No version info available. rapidfuzz: Installed. No version info available. rapidocr-onnxruntime: Installed. No version info available. rdflib: Installed. No version info available. requests: 2.32.3 requests-toolbelt: Installed. No version info available. rspace_client: Installed. No version info available. scikit-learn: 1.3.2 SQLAlchemy: 2.0.32 sqlite-vss: Installed. No version info available. streamlit: Installed. No version info available. sympy: 1.13.1 telethon: Installed. No version info available. tenacity: 8.5.0 tidb-vector: Installed. No version info available. tiktoken: 0.7.0 timescale-vector: Installed. No version info available. tqdm: 4.66.5 tree-sitter: Installed. No version info available. tree-sitter-languages: Installed. No version info available. typer: 0.12.3 typing-extensions: 4.12.2 upstash-redis: Installed. No version info available. vdms: Installed. No version info available. xata: Installed. No version info available. xmltodict: Installed. No version info available.

keenborder786 commented 3 weeks ago

@AkashBais In the prompt, use question instead of query, and it should work:


CYPHER_GENERATION_TEMPLATE = '''Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
Examples: Here are a few examples of generated Cypher statements for particular questions:

Which sections talk about medication?
MATCH (n2:ner_entity)-[new_present_in:PRESENT_IN]->(n1:Chunk)
WHERE n2.name = 'medication'
RETURN DISTINCT n1.section as section_list

The question is:
{question} '''

CYPHER_GENERATION_PROMPT = PromptTemplate(
    input_variables=['schema', 'question'],
    template=CYPHER_GENERATION_TEMPLATE
)
# print(CYPHER_GENERATION_PROMPT.input_variables)
QA_CHAIN = GraphCypherQAChain.from_llm(
    graph=return_graph(), # This is an internal method that returns a graph object
    llm=llm,
    verbose=True,
    cypher_prompt=CYPHER_GENERATION_PROMPT,
)
QA_CHAIN.invoke(
    input = {'question': query},
    return_only_outputs=True,
)

AkashBais commented 3 weeks ago

@keenborder786 I did try that, but I get the same error. I noticed that the implementation of GraphCypherQAChain has query hardcoded as its input key:

input_key: str = "query" #: :meta private:

This key is used to populate the question variable in the _call method, which is then passed to the cypher_generation_chain:

question = inputs[self.input_key]
'''some other code'''
generated_cypher = self.cypher_generation_chain.run(
{"question": question, "schema": self.graph_schema}, callbacks=callbacks
)
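The two levels of key checking described here can be sketched without LangChain. The classes below are simplified, hypothetical stand-ins (`ToyChain`, `ToyGraphQAChain`, and `try_invoke` are illustrative names, not library APIs); they only mimic the `_validate_inputs` check shown in the traceback and the `input_key` to `question` hand-off quoted above.

```python
# Simplified stand-ins for the two chains; NOT LangChain's actual classes.
class ToyChain:
    """Checks required keys the way _validate_inputs does in the traceback."""

    def __init__(self, name, input_keys):
        self.name = name
        self.input_keys = input_keys

    def validate(self, inputs):
        missing_keys = set(self.input_keys).difference(inputs)
        if missing_keys:
            raise ValueError(f"{self.name}: Missing some input keys: {missing_keys}")


class ToyGraphQAChain(ToyChain):
    input_key = "query"  # fixed on the outer chain, as quoted above

    def __init__(self, prompt_variables):
        super().__init__("outer", [self.input_key])
        # Assumption: the inner chain requires exactly the prompt's input variables.
        self.inner = ToyChain("inner", prompt_variables)

    def invoke(self, inputs):
        self.validate(inputs)                  # outer check: needs 'query'
        question = inputs[self.input_key]
        inner_inputs = {"question": question, "schema": "<graph schema>"}
        self.inner.validate(inner_inputs)      # inner check: needs the prompt variables
        return inner_inputs


def try_invoke(prompt_variables, inputs):
    """Return 'ok' or the error message, so the three cases can be compared."""
    try:
        ToyGraphQAChain(prompt_variables).invoke(inputs)
        return "ok"
    except ValueError as err:
        return str(err)
```

Under these assumptions, a prompt that declares `query` fails at the inner check, invoking with `question` fails at the outer check, and only the combination of invoking with `query` and prompting with `question` passes both.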

The stack trace is attached below. P.S.: I made the change in both the prompt and the PromptTemplate.

ValueError                                Traceback (most recent call last)
<ipython-input-7-21a65eed48c5> in <cell line: 1>()
----> 1 responce = rag.query(
      2     query = 'Which sections talk about medication in the document?'
      3 )

3 frames
/content/RAG/KG_for_RAG/src/execute_rag.py in query(self, query)
    337 
    338     if (self.query_type is not None) and (self.query_type.lower() in ['self_genrate','prompt']):
--> 339       answer = self.QA_CHAIN.invoke(
    340           {"question": query,},
    341           return_only_outputs=True,

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    164         except BaseException as e:
    165             run_manager.on_chain_error(e)
--> 166             raise e
    167         run_manager.on_chain_end(outputs)
    168 

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in invoke(self, input, config, **kwargs)
    152         )
    153         try:
--> 154             self._validate_inputs(inputs)
    155             outputs = (
    156                 self._call(inputs, run_manager=run_manager)

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in _validate_inputs(self, inputs)
    282         missing_keys = set(self.input_keys).difference(inputs)
    283         if missing_keys:
--> 284             raise ValueError(f"Missing some input keys: {missing_keys}")
    285 
    286     def _validate_outputs(self, outputs: Dict[str, Any]) -> None:

ValueError: Missing some input keys: {'query'}

AkashBais commented 3 weeks ago

@keenborder786 I would appreciate any help or direction on this, or even just a list of plausible causes for this error to start looking into. Let me know. Thanks.

AkashBais commented 3 weeks ago

For anyone looking for a solution. Step 1: In QA_CHAIN.invoke, pass the question under the key query, because that is the key hardcoded in GraphCypherQAChain, which checks for its existence. Passing any other key raises the error (see the comment above).

Step 2: Use question as the input variable in both the prompt template string and the PromptTemplate, because when QA_CHAIN.invoke calls its internal _call method, it does so using

generated_cypher = self.cypher_generation_chain.run(
{"question": question, "schema": self.graph_schema}, callbacks=callbacks
)

So the question passed in under the key query is passed further down as question.
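One way to catch this mismatch before building the chain is to compare the placeholders in the template string with the keys the chain supplies. The sketch below uses only the standard library; `template_variables` and `check_cypher_prompt` are hypothetical helper names, and the default `supplied` keys (`question`, `schema`) are taken from the `_call` snippet quoted in this thread rather than from any documented contract.

```python
from string import Formatter


def template_variables(template):
    """Return the set of {placeholder} names in a format-style template string."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def check_cypher_prompt(template, supplied=("question", "schema")):
    """Return placeholders that the chain would leave unfilled.

    Assumption: the Cypher generation step passes exactly the keys in
    `supplied`, as in the `_call` snippet quoted in this thread.
    """
    return template_variables(template) - set(supplied)
```

Calling `check_cypher_prompt(CYPHER_GENERATION_TEMPLATE)` on the original template from this issue would report `{'query'}`, while the corrected template yields an empty set. Note that a template containing literal Cypher braces (for example a property map) would need those braces doubled for this parser.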