langchain-ai / langchain

GraphCypherQAChain tries to create a query from a sensitive question. #20280

Closed ErikValle2 closed 3 months ago

ErikValle2 commented 6 months ago

Example Code

import time

from langchain.graphs import Neo4jGraph
from langchain_openai import AzureChatOpenAI

from langchain.prompts.prompt import PromptTemplate
from langchain.chains import GraphCypherQAChain

llm=AzureChatOpenAI(azure_deployment=MODEL_CHAT, model_name=MODEL_CHAT, azure_endpoint=API_ENDPOINT, openai_api_version=API_VERSION, openai_api_key=API_KEY, temperature=0, streaming=True)
neo4j_graph = Neo4jGraph(url=NEO4J_URI, username=NEO4J_USERNAME, password=NEO4J_PASSWORD)

CYPHER_GENERATION_TEMPLATE = """You are an expert Neo4j Cypher translator who understands the question in english and convert to Cypher strictly based on the Neo4j Schema provided and following the instructions below:
<instructions>
* Use aliases to refer the node or relationship in the generated Cypher query
* Generate Cypher query compatible ONLY for Neo4j Version 5
* Do not use EXISTS, SIZE keywords in the cypher. Use alias when using the WITH keyword
* Use only Nodes and relationships mentioned in the schema
* Always enclose the Cypher output inside 3 backticks (```)
* Always do a case-insensitive and fuzzy search for any properties related search. Eg: to search for a Person name use `toLower(p.name) contains 'neo4j'`
* Cypher is NOT SQL. So, do not mix and match the syntaxes

</instructions>

Strictly use this Schema for Cypher generation:
<schema>
{schema}
</schema>

The samples below follow the instructions and the schema mentioned above. So, please follow the same when you generate the cypher:
<samples>
Human:  Which manager manages most people directly? How many employees?
Assistant: ```MATCH (p:Person)-[r:IS_MANAGER_OF]->() WITH p, COUNT(r) AS NumberOfEmployees ORDER BY NumberOfEmployees DESC RETURN p.name, NumberOfEmployees LIMIT 1```

</samples>

Human: {question}
Assistant: 
"""

CYPHER_GENERATION_PROMPT = PromptTemplate(input_variables=['schema','question'], validate_template=True, template=CYPHER_GENERATION_TEMPLATE)

chain = GraphCypherQAChain.from_llm(
    llm,
    graph=neo4j_graph,
    cypher_prompt=CYPHER_GENERATION_PROMPT,
    validate_cypher=True,
    return_intermediate_steps=True
)

question="Who should we fire from the Example department?"
cypher_cmd=chain.invoke(question)

Error Message and Stack Trace (if applicable)

---------------------------------------------------------------------------
CypherSyntaxError                         Traceback (most recent call last)
File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/langchain_community/graphs/neo4j_graph.py:164, in Neo4jGraph.query(self, query, params)
    163 try:
--> 164     data = session.run(Query(text=query, timeout=self.timeout), params)
    165     json_data = [r.data() for r in data]

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/work/session.py:313, in Session.run(self, query, parameters, **kwargs)
    312 parameters = dict(parameters or {}, **kwargs)
--> 313 self._auto_result._run(
    314     query, parameters, self._config.database,
    315     self._config.impersonated_user, self._config.default_access_mode,
    316     bookmarks, self._config.notifications_min_severity,
    317     self._config.notifications_disabled_categories,
    318 )
    320 return self._auto_result

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/work/result.py:181, in Result._run(self, query, parameters, db, imp_user, access_mode, bookmarks, notifications_min_severity, notifications_disabled_categories)
    180 self._connection.send_all()
--> 181 self._attach()

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/work/result.py:301, in Result._attach(self)
    300 while self._attached is False:
--> 301     self._connection.fetch_message()

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/io/_common.py:178, in ConnectionErrorHandler.__getattr__.<locals>.outer.<locals>.inner(*args, **kwargs)
    177 try:
--> 178     func(*args, **kwargs)
    179 except (Neo4jError, ServiceUnavailable, SessionExpired) as exc:

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/io/_bolt.py:849, in Bolt.fetch_message(self)
    846 tag, fields = self.inbox.pop(
    847     hydration_hooks=self.responses[0].hydration_hooks
    848 )
--> 849 res = self._process_message(tag, fields)
    850 self.idle_since = monotonic()

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/io/_bolt5.py:369, in Bolt5x0._process_message(self, tag, fields)
    368 try:
--> 369     response.on_failure(summary_metadata or {})
    370 except (ServiceUnavailable, DatabaseUnavailable):

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/neo4j/_sync/io/_common.py:245, in Response.on_failure(self, metadata)
    244 Util.callback(handler)
--> 245 raise Neo4jError.hydrate(**metadata)

CypherSyntaxError: {code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input 'I': expected
  "ALTER"
  "CALL"
  "CREATE"
  "DEALLOCATE"
  "DELETE"
  "DENY"
  "DETACH"
  "DROP"
  "DRYRUN"
  "ENABLE"
  "FOREACH"
  "GRANT"
  "LOAD"
  "MATCH"
  "MERGE"
  "NODETACH"
  "OPTIONAL"
  "REALLOCATE"
  "REMOVE"
  "RENAME"
  "RETURN"
  "REVOKE"
  "SET"
  "SHOW"
  "START"
  "STOP"
  "TERMINATE"
  "UNWIND"
  "USE"
  "USING"
  "WITH" (line 1, column 1 (offset: 0))
"I'm sorry, I cannot generate a query for this question as it goes against ethical and moral principles. It is not appropriate to use data and technology to harm or discriminate against individuals."
 ^}

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[7], line 2
      1 question="Who should we fire from 91130 Veh Verif & Value Confirmation?"
----> 2 cypher_cmd=chain.invoke(question)

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)
--> 162     raise e
    163 run_manager.on_chain_end(outputs)
    164 final_outputs: Dict[str, Any] = self.prep_outputs(
    165     inputs, outputs, return_only_outputs
    166 )

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
    149 run_manager = callback_manager.on_chain_start(
    150     dumpd(self),
    151     inputs,
    152     name=run_name,
    153 )
    154 try:
    155     outputs = (
--> 156         self._call(inputs, run_manager=run_manager)
    157         if new_arg_supported
    158         else self._call(inputs)
    159     )
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/langchain/chains/graph_qa/cypher.py:267, in GraphCypherQAChain._call(self, inputs, run_manager)
    264 # Retrieve and limit the number of results
    265 # Generated Cypher be null if query corrector identifies invalid schema
    266 if generated_cypher:
--> 267     context = self.graph.query(generated_cypher)[: self.top_k]
    268 else:
    269     context = []

File ~/.pyenv/versions/3.11.7/lib/python3.11/site-packages/langchain_community/graphs/neo4j_graph.py:170, in Neo4jGraph.query(self, query, params)
    168     return json_data
    169 except CypherSyntaxError as e:
--> 170     raise ValueError(f"Generated Cypher Statement is not valid\n{e}")

ValueError: Generated Cypher Statement is not valid
{code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input 'I': expected
  "ALTER"
  "CALL"
  "CREATE"
  "DEALLOCATE"
  "DELETE"
  "DENY"
  "DETACH"
  "DROP"
  "DRYRUN"
  "ENABLE"
  "FOREACH"
  "GRANT"
  "LOAD"
  "MATCH"
  "MERGE"
  "NODETACH"
  "OPTIONAL"
  "REALLOCATE"
  "REMOVE"
  "RENAME"
  "RETURN"
  "REVOKE"
  "SET"
  "SHOW"
  "START"
  "STOP"
  "TERMINATE"
  "UNWIND"
  "USE"
  "USING"
  "WITH" (line 1, column 1 (offset: 0))
"I'm sorry, I cannot generate a query for this question as it goes against ethical and moral principles. It is not appropriate to use data and technology to harm or discriminate against individuals."
 ^}
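
Until the chain handles this case itself, the `ValueError` raised by `Neo4jGraph.query` in the traceback above can be caught at the call site. The following is only a sketch of that workaround, not the library's behavior: `invoke_with_fallback` and the shape of the fallback dictionary are hypothetical, and the only detail taken from the traceback is the message prefix "Generated Cypher Statement is not valid".

```python
from typing import Any, Callable, Dict

# Prefix of the ValueError message shown in the traceback above.
INVALID_CYPHER_PREFIX = "Generated Cypher Statement is not valid"


def invoke_with_fallback(
    invoke: Callable[[str], Dict[str, Any]], question: str
) -> Dict[str, Any]:
    """Call `invoke(question)`; on an invalid-Cypher ValueError return a fallback.

    `invoke` is any callable with the same shape as `chain.invoke`.
    The fallback dictionary layout is an assumption made for this sketch.
    """
    try:
        return invoke(question)
    except ValueError as exc:
        # Re-raise anything that is not the invalid-Cypher error.
        if not str(exc).startswith(INVALID_CYPHER_PREFIX):
            raise
        return {"query": question, "result": None, "error": str(exc)}
```

With the chain from the example, this would be called as `invoke_with_fallback(chain.invoke, question)`. It hides the symptom only: the refusal text is still sent to Neo4j before the error is caught.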

Description

GraphCypherQAChain treats the LLM's refusal message as the generated Cypher query and sends it to Neo4j: "I'm sorry, I cannot generate a query for this question as it goes against ethical and moral principles. It is not appropriate to use data and technology to harm or discriminate against individuals." The same code works for other prompts. It fails only for sensitive questions or for questions about information outside the provided schema.
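
One way to detect this situation before the text reaches the database is to check that the LLM output looks like Cypher at all. The sketch below is an assumption about how such a guard could look, not code from LangChain: `extract_cypher_or_none` is a hypothetical helper, and the keyword list is copied from the "expected" list in the Neo4j error message above.

```python
import re
from typing import Optional

# Keywords copied from the "expected" list in the Neo4j syntax error above.
CYPHER_START_KEYWORDS = frozenset({
    "ALTER", "CALL", "CREATE", "DEALLOCATE", "DELETE", "DENY", "DETACH",
    "DROP", "DRYRUN", "ENABLE", "FOREACH", "GRANT", "LOAD", "MATCH", "MERGE",
    "NODETACH", "OPTIONAL", "REALLOCATE", "REMOVE", "RENAME", "RETURN",
    "REVOKE", "SET", "SHOW", "START", "STOP", "TERMINATE", "UNWIND", "USE",
    "USING", "WITH",
})

# The prompt asks the model to wrap its query in three backticks.
FENCE = "`" * 3
_FENCED = re.compile(
    re.escape(FENCE) + r"(?:cypher)?\s*(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def extract_cypher_or_none(llm_output: str) -> Optional[str]:
    """Return the query text if the output plausibly starts a Cypher statement.

    Returns None for refusals or other free text. This is a coarse filter:
    prose that happens to begin with a keyword (e.g. "With ...") still passes.
    """
    match = _FENCED.search(llm_output)
    candidate = (match.group(1) if match else llm_output).strip()
    first_word = re.match(r"[A-Za-z]+", candidate)
    if first_word and first_word.group(0).upper() in CYPHER_START_KEYWORDS:
        return candidate
    return None
```

The `_call` source shown in the traceback already skips the database query when `generated_cypher` is empty, so a guard of this kind would fit naturally just before that check.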

System Info

System Information

OS: Linux
OS Version: #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024
Python Version: 3.11.7 (main, Feb 15 2024, 09:21:29) [Clang 14.0.0 ]

Package Information

langchain_core: 0.1.30
langchain: 0.1.7
langchain_community: 0.0.20
langsmith: 0.1.22
langchain_cli: 0.0.21
langchain_openai: 0.0.8
langserve: 0.0.41

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph

GeorgeKontsevik commented 6 months ago

same