Open bechbd opened 3 weeks ago
To fix the bug in the TextToCypherRetriever
so that the cypher_validator
runs independently of the allowed_query_fields
parameter in version 0.10.65, you need to ensure that the cypher_validator
is always called regardless of whether allowed_query_fields
is set or not.
Here is the corrected code snippet:
def retrieve_from_graph(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
schema = self._graph_store.get_schema_str()
question = query_bundle.query_str
response = self.llm.predict(
self.text_to_cypher_template,
schema=schema,
question=question,
)
parsed_cypher_query = self._parse_generated_cypher(response) # Always validate the cypher query
if self.cypher_validator:
self.cypher_validator(parsed_cypher_query) # Ensure cypher_validator is called
query_output = self._graph_store.structured_query(parsed_cypher_query)
cleaned_query_output = self._clean_query_output(query_output)
node_text = self.response_template.format(
query=parsed_cypher_query,
response=str(cleaned_query_output),
)
return [
NodeWithScore(
node=TextNode(
text=node_text,
),
score=1.0,
)
]
async def aretrieve_from_graph(
self, query_bundle: QueryBundle
) -> List[NodeWithScore]:
schema = await self._graph_store.aget_schema_str()
question = query_bundle.query_str
response = await self.llm.apredict(
self.text_to_cypher_template,
schema=schema,
question=question,
)
parsed_cypher_query = self._parse_generated_cypher(response) # Always validate the cypher query
if self.cypher_validator:
self.cypher_validator(parsed_cypher_query) # Ensure cypher_validator is called
query_output = await self._graph_store.astructured_query(parsed_cypher_query)
cleaned_query_output = self._clean_query_output(query_output)
node_text = self.response_template.format(
query=parsed_cypher_query,
response=str(cleaned_query_output),
)
return [
NodeWithScore(
node=TextNode(
text=node_text,
),
score=1.0,
)
]
In this fix, the self._parse_generated_cypher(response)
is called directly after generating the response from the LLM, and the cypher_validator
is explicitly called if it is set, ensuring that the cypher_validator
is always executed [1].
To continue talking to Dosu, mention @dosu.
Yea seems like a bug, easy enough to fix :)
This issue is already resolved in main branch so we can close this issue now.
Bug Description
The TextToCypher retriever currently takes two separate parameters with different intents:
There is an issue in the code where it only runs the
cypher_validator
when theallowed_query_fields
is set. This does not make sense ascypher_validator
corrects the query before it is run andallowed_query_fields
limits the results returned after the query is run. These should operate independently.The checks in these two lines should be removed: https://github.com/run-llama/llama_index/blob/729d5f2e76cca29284157c6fd8b55fb9953739dd/llama-index-core/llama_index/core/indices/property_graph/sub_retrievers/text_to_cypher.py#L105 https://github.com/run-llama/llama_index/blob/729d5f2e76cca29284157c6fd8b55fb9953739dd/llama-index-core/llama_index/core/indices/property_graph/sub_retrievers/text_to_cypher.py#L139
There is not a need to add a check as it is already done in
_parse_generated_cyher
(also mispelled cypher)The
allowed_query_output
check is still performed here https://github.com/run-llama/llama_index/blob/729d5f2e76cca29284157c6fd8b55fb9953739dd/llama-index-core/llama_index/core/indices/property_graph/sub_retrievers/text_to_cypher.py#L73Version
0.10.65
Steps to Reproduce
Run a TextToCypher retriever but specify a
cypher_validator
but notallows_query_outputs
Relevant Logs/Tracbacks
No response