neo4j / apoc

single node handling in apoc.refactor.mergeNodes #644

Closed TomVanWemmel closed 1 month ago

TomVanWemmel commented 1 month ago

In apoc.refactor.mergeNodes there is no check for equality between the given nodes. Even when only a single node is passed, all of its relationships are iterated to check whether they refer to the node itself. It might be beneficial to return early in that case, especially when the node has many relationships.

gem-neo4j commented 1 month ago

Hi! The nodes are turned into a linked set before being processed, so they should be unique and none of them should match the first node :)
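To illustrate the de-duplication described above, here is a minimal sketch using a plain `java.util.LinkedHashSet`. It is not the APOC implementation; the class name `LinkedSetDedup`, the helper `toLinkedSet`, and the use of string ids in place of nodes are assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedSetDedup {
    // Hypothetical helper: drops duplicate ids while keeping first-seen order.
    static Set<String> toLinkedSet(List<String> ids) {
        return new LinkedHashSet<>(ids);
    }

    public static void main(String[] args) {
        // "n1" is passed twice, as if the same node were given more than once.
        Set<String> nodes = toLinkedSet(List.of("n1", "n2", "n1", "n3"));

        // Take the first element as the merge target and remove it from the set.
        String first = nodes.iterator().next();
        nodes.remove(first);

        // The remaining elements can no longer contain the first one.
        System.out.println(first + " <- " + nodes); // prints "n1 <- [n2, n3]"
    }
}
```

With a single input id, the set left over after removing the first element is empty, which is the case the rest of this thread is about.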

TomVanWemmel commented 3 weeks ago

Hi @gem-neo4j

All relationships of the first node are evaluated. This piece of code is executed regardless of the size of the set.

final List<String> existingSelfRelIds = conf.isPreservingExistingSelfRels()
        ? StreamSupport.stream(first.getRelationships().spliterator(), false)
                .filter(Util::isSelfRel)
                .map(Entity::getElementId)
                .collect(Collectors.toList())
        : Collections.emptyList();

I'm using this procedure in a setting where almost every call merges just a single node, and over time it becomes very slow for nodes with a lot of relationships (> 1000).
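The early return suggested in this issue could look roughly like the sketch below. It assumes the scan is only needed when there is more than one node to merge, and it uses plain Java stand-ins rather than the Neo4j API: the class `SingleNodeGuard`, the type `Rel`, and the `nodeCount` parameter are hypothetical and not part of APOC.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class SingleNodeGuard {
    // Hypothetical stand-in for a relationship: an id plus start and end node ids.
    static final class Rel {
        final String id;
        final String startId;
        final String endId;

        Rel(String id, String startId, String endId) {
            this.id = id;
            this.startId = startId;
            this.endId = endId;
        }

        boolean isSelfRel() {
            return startId.equals(endId);
        }
    }

    // Returns the ids of self-relationships on the first node, but skips the
    // scan entirely when preservation is off or there is nothing to merge.
    static List<String> existingSelfRelIds(List<Rel> relsOfFirst, int nodeCount, boolean preserve) {
        if (!preserve || nodeCount <= 1) {
            return Collections.emptyList(); // early return: no relationship iteration
        }
        return relsOfFirst.stream()
                .filter(Rel::isSelfRel)
                .map(rel -> rel.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Rel> rels = List.of(new Rel("r1", "a", "a"), new Rel("r2", "a", "b"));
        System.out.println(existingSelfRelIds(rels, 1, true)); // prints "[]"
        System.out.println(existingSelfRelIds(rels, 2, true)); // prints "[r1]"
    }
}
```

The guard keeps the cost of a single-node call constant, independent of how many relationships the node has.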