Open LikhithRishi opened 1 month ago
To successfully delete nodes from the PropertyGraphIndex, ensure that you are using the correct instance of SimplePropertyGraphStore for both retrieving and deleting nodes. In your code, you are creating a new instance of SimplePropertyGraphStore when calling the delete method, which does not affect the persisted graph store. Instead, you should perform the deletion on the same instance that you used to retrieve the nodes. Here's how you can adjust your code:
from llama_index.core.graph_stores import SimplePropertyGraphStore

# Load the persisted graph store
graph_store = SimplePropertyGraphStore.from_persist_dir("./pg_storage")
# Retrieve nodes based on properties
nodes = graph_store.get(properties={"file_name": "1_pgs_doc.pdf"})
delete_from_pg = [node.id for node in nodes]
# Delete nodes using the same graph store instance
graph_store.delete(ids=delete_from_pg)
# Persist the changes (pg_index_chunk is the index from the original question;
# it must be backed by this same graph_store for the deletion to be saved)
pg_index_chunk.storage_context.persist(persist_dir="./pg_storage")
By using the same graph_store instance for both retrieval and deletion, you ensure that the changes are applied to the correct graph store. This should resolve the issue of nodes not being deleted as expected.
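A toy stand-in can show why the instance matters. The sketch below uses a hypothetical TinyStore class (not part of llama_index) that holds nodes in memory and reads or writes them as JSON. It only illustrates the general point: a deletion made on one in-memory instance does not reach disk until that same instance is persisted, and persisting a different instance writes the old state back.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical stand-in for an in-memory store persisted as JSON."""

    def __init__(self, nodes=None):
        self.nodes = dict(nodes or {})

    def delete(self, ids):
        # Remove the given IDs from this instance's memory only.
        for node_id in ids:
            self.nodes.pop(node_id, None)

    def persist(self, path):
        with open(path, "w") as f:
            json.dump(self.nodes, f)

    @classmethod
    def from_persist_path(cls, path):
        with open(path) as f:
            return cls(json.load(f))


path = os.path.join(tempfile.mkdtemp(), "store.json")
TinyStore({"a": "1_pgs_doc.pdf", "b": "other.pdf"}).persist(path)

index_store = TinyStore.from_persist_path(path)    # instance held by the "index"
scratch_store = TinyStore.from_persist_path(path)  # a second, unrelated instance

scratch_store.delete(["a"])  # the deletion lives only in scratch_store
index_store.persist(path)    # persisting the other instance rewrites the old state
after_wrong_persist = TinyStore.from_persist_path(path).nodes

scratch_store.persist(path)  # persisting the instance that was modified
after_right_persist = TinyStore.from_persist_path(path).nodes
```

Here after_wrong_persist still contains node "a", while after_right_persist does not.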
To continue talking to Dosu, mention @dosu.
@dosu After executing that code, we now get a KeyError. The logs are:
graph_store.delete(ids=delete_from_pg)
File "/home/likhithrishi/miniconda3/envs/gen/lib/python3.12/site-packages/llama_index/core/graph_stores/simple_labelled.py", line 149, in delete
triplets = self.get_triplets(
^^^^^^^^^^^^^^^^^^
File "/home/likhithrishi/miniconda3/envs/gen/lib/python3.12/site-packages/llama_index/core/graph_stores/simple_labelled.py", line 70, in get_triplets
triplets = self.graph.get_triplets()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/likhithrishi/miniconda3/envs/gen/lib/python3.12/site-packages/llama_index/core/graph_stores/types.py", line 152, in get_triplets
self.nodes[subj],
~~~~~~~~~~^^^^^^
KeyError: 'aa3b866f-f990-466f-9906-ddd0c1bf6584'
The KeyError you're encountering suggests that the node with the ID 'aa3b866f-f990-466f-9906-ddd0c1bf6584' is being referenced in a triplet, but it does not exist in the nodes dictionary of the graph. This can happen if the node was deleted or never existed in the first place, but the triplet still references it.
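The failure mode can be reproduced with plain Python data. This is a minimal sketch using toy dicts and tuples, not llama_index objects; the helper resolve_triplets is hypothetical and only mimics the kind of lookup shown in the traceback.

```python
# Toy data only: plain dicts and tuples, not llama_index objects.
nodes = {"n1": "Alice", "n2": "Bob"}
triplets = {("n1", "KNOWS", "n2"), ("n3", "KNOWS", "n1")}  # "n3" has no node entry


def resolve_triplets(nodes, triplets):
    """Look up subject and object for every triplet, like the failing lookup."""
    return [(nodes[subj], rel, nodes[obj]) for subj, rel, obj in triplets]


try:
    resolve_triplets(nodes, triplets)
    missing_id = None
except KeyError as exc:
    # The ID that a triplet references but the nodes dict lacks.
    missing_id = exc.args[0]
```

The lookup fails on the triplet whose subject "n3" was never stored as a node, which is the same pattern as the reported error.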
To resolve this issue, you can take the following steps:
1. Check Node Existence: Before attempting to delete nodes, ensure that all node IDs in delete_from_pg actually exist in the graph.nodes dictionary. You can do this by verifying the existence of each node ID in the dictionary.
2. Handle Missing Nodes Gracefully: Modify the delete method to handle cases where a node ID might not exist in the nodes dictionary. This can prevent the KeyError from being raised.
Here's an example of how you might adjust the code to check for node existence:
from llama_index.core.graph_stores import SimplePropertyGraphStore

# Load the persisted graph store
graph_store = SimplePropertyGraphStore.from_persist_dir("./pg_storage")
# Retrieve nodes based on properties
nodes = graph_store.get(properties={"file_name": "1_pgs_doc.pdf"})
delete_from_pg = [node.id for node in nodes]
# Check for node existence before deletion
existing_ids = set(graph_store.graph.nodes.keys())
delete_from_pg = [node_id for node_id in delete_from_pg if node_id in existing_ids]
# Delete nodes using the same graph store instance
graph_store.delete(ids=delete_from_pg)
# Persist the changes
pg_index_chunk.storage_context.persist(persist_dir="./pg_storage")
By ensuring that you only attempt to delete nodes that exist, you can avoid the KeyError. If the issue persists, you may need to investigate further to ensure that the graph's state is consistent and that no operations are inadvertently leaving orphaned references in the triplets.
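One way to look for orphaned references is to compare the IDs used in the triplets against the IDs in the nodes dictionary. The sketch below runs on toy data; find_dangling_ids is a hypothetical helper, and it assumes the nodes are a dict keyed by node ID and the triplets are an iterable of (subject_id, relation_id, object_id) tuples. Whether graph_store.graph.nodes and graph_store.graph.triplets have exactly these shapes in your installed version is an assumption to verify.

```python
def find_dangling_ids(nodes, triplets):
    """Return IDs used as subject or object in triplets but absent from nodes."""
    referenced = set()
    for subj, _rel, obj in triplets:
        referenced.update((subj, obj))
    return referenced - set(nodes)


# Toy data standing in for the graph's nodes and triplets (assumed shapes).
nodes = {"n1": "Alice", "n2": "Bob"}
triplets = {("n1", "KNOWS", "n2"), ("n3", "KNOWS", "n1"), ("n2", "LIKES", "n4")}

dangling = find_dangling_ids(nodes, triplets)
```

A non-empty result means at least one triplet points at a node that is not stored, so any code that resolves every triplet will raise a KeyError.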
@dosu Still same issue
triplets = self.get_triplets(
^^^^^^^^^^^^^^^^^^
File "/home/likhithrishi/miniconda3/envs/gen/lib/python3.12/site-packages/llama_index/core/graph_stores/simple_labelled.py", line 70, in get_triplets
triplets = self.graph.get_triplets()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/likhithrishi/miniconda3/envs/gen/lib/python3.12/site-packages/llama_index/core/graph_stores/types.py", line 152, in get_triplets
self.nodes[subj],
~~~~~~~~~~^^^^^^
KeyError: 'e7497956-d6be-49f8-aeb4-83978f78efd2'
The KeyError you're encountering indicates that the node with the ID 'e7497956-d6be-49f8-aeb4-83978f78efd2' is being referenced in a triplet, but it does not exist in the nodes dictionary. This can occur if the node was deleted or never existed, but the triplet still references it.
To address this issue, you can modify the get_triplets method to handle missing nodes more gracefully. Here's a potential approach: before accessing self.nodes[subj], check if subj exists in self.nodes. If it doesn't, skip that triplet. Here's an example of how you might adjust the get_triplets method:
def get_triplets(self) -> List[Triplet]:
    """Get all triplets."""
    valid_triplets = []
    for subj, rel, obj in self.triplets:
        # Skip triplets whose subject or object is missing from self.nodes
        if subj in self.nodes and obj in self.nodes:
            valid_triplets.append((
                self.nodes[subj],
                self.relations[
                    self._get_relation_key(obj_id=obj, subj_id=subj, rel_id=rel)
                ],
                self.nodes[obj],
            ))
    return valid_triplets
This modification ensures that only triplets with valid node references are processed, preventing KeyError from being raised. If the issue persists, you may need to further investigate the consistency of your graph data to ensure that all triplets have valid node references.
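If editing the installed library is not an option, the same filter could be applied to the data itself, so that no stored triplet points at a missing node. This is a rough sketch on toy data; prune_dangling_triplets is a hypothetical helper, and applying it to a real graph store (including how to write the cleaned triplets back) depends on internals not shown in this thread.

```python
def prune_dangling_triplets(nodes, triplets):
    """Return only the triplets whose subject and object both exist in nodes."""
    return {
        (subj, rel, obj)
        for subj, rel, obj in triplets
        if subj in nodes and obj in nodes
    }


# Toy data: "n3" is referenced by a triplet but has no node entry.
nodes = {"n1": "Alice", "n2": "Bob"}
triplets = {("n1", "KNOWS", "n2"), ("n3", "KNOWS", "n1")}

cleaned = prune_dangling_triplets(nodes, triplets)
```

Only the triplet between "n1" and "n2" survives, since both of its endpoints are present in nodes.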
Question
We are working with the PropertyGraphIndex and trying to delete nodes based on their properties. Below is a summary of the approach we're following:
1. Creating the Property Graph Index:
2. Loading the Saved Property Graph:
3. Retrieving and Deleting Nodes: We retrieve nodes based on properties and attempt to delete them:
Issue:
We are able to retrieve the nodes successfully using the get function, but the delete function doesn't seem to remove the nodes from the property graph. After attempting to delete the nodes and persisting the context again, the nodes still appear to exist in the property graph index.
Question: How can we successfully delete nodes from the PropertyGraphIndex? Are we missing any steps in the deletion process or in the get process?