risafj opened this issue 6 days ago
Hi @risafj,
Indeed, this behavior is quite annoying; we'll take a closer look.
In the meantime, you can control this prefix by setting it in a `LexicalGraphConfig`, which is a run parameter of the entity and relation extractor.
So your code will look like this:
```python
from neo4j_graphrag.experimental.components.types import LexicalGraphConfig

config = LexicalGraphConfig(
    id_prefix="myPrefix",
)

await pipe.run(data={
    # ...
    "extractor": {
        # ...
        "lexical_graph_config": config,
    }
})
```
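If you are calling the pipeline in a loop over several documents, a minimal sketch of the same idea could look like this (assuming your existing `pipe` and component names; the `run_all` helper and deriving the prefix from the file name are just illustrations, not library API):

```python
from pathlib import Path

from neo4j_graphrag.experimental.components.types import LexicalGraphConfig


async def run_all(pipe, document_paths):
    for doc_path in document_paths:
        # One prefix per document (here taken from the file name)
        # keeps chunk ids unique across runs.
        config = LexicalGraphConfig(id_prefix=Path(doc_path).stem)
        await pipe.run(data={
            # ... inputs for your other components (loader, splitter, ...) go here
            "extractor": {
                # ...
                "lexical_graph_config": config,
            },
        })
```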
Let me know if you need more assistance.
Are you using a custom entity and relation extractor?
Hi @stellasia,
Thank you so much for the quick turnaround and helpful response! Your solution worked perfectly!
> Are you using a custom entity and relation extractor?
No, I'm using the one defined in this library:
```python
from neo4j_graphrag.experimental.components.entity_relation_extractor import (
    LLMEntityRelationExtractor,
    OnError,
)

extractor = LLMEntityRelationExtractor(
    llm=llm,
    on_error=OnError.RAISE,
    prompt_template=custom_prompt,
)
```
Thank you for raising the issue and for the additional information; we will investigate shortly.
When I run the `Pipeline()` in a loop with multiple documents, a Chunk node with an id property of `":1"` and an index of `1` is created for each run. This causes problems, since the ids are no longer unique.

For example, when the lexical graph gets created, a Chunk node with an id of `":1"` has a NEXT_NODE relation to every Chunk node that has an id of `":2"`. After running the pipeline with 4 documents, it looks like this:
The same issue is occurring with FROM_CHUNK, where an entity that's supposed to have a relation like `(n:Entity)-[:FROM_CHUNK]->(c:Chunk {id: ":1", index: "1"})` actually has that relation to all documents' chunks with an index of 1.

Is there any workaround for this? I'm guessing this issue would be solved if I could somehow pass a document-specific `id_prefix` so that each chunk gets a unique id?
https://github.com/neo4j/neo4j-graphrag-python/blob/bc6dd9c7b3f8fcfffb9ed360648ea80c6cbb17dc/src/neo4j_graphrag/experimental/components/lexical_graph.py#L78-L79
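Based on the linked lines and the ids above, my understanding is that the chunk id is composed roughly like this (a simplified sketch, not the library's actual code):

```python
# Simplified guess: with the default empty prefix, every run yields ":0", ":1", ":2", ...,
# so chunks from different documents collide on the same id.
def make_chunk_id(prefix: str, index: int) -> str:
    return f"{prefix}:{index}"

print(make_chunk_id("", 1))       # ':1'      -> identical for every document
print(make_chunk_id("doc_a", 1))  # 'doc_a:1' -> unique once a per-document prefix is set
```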
Additional info: I'm using v1.2.0, with a standard pipeline setup that has these components.