neo4j / apoc

Apache License 2.0

Exception while adding/removing apoc.trigger in neo4j causal cluster #19

Closed neo-technology-build-agent closed 1 year ago

neo-technology-build-agent commented 2 years ago

Issue by SergeyPlatonov, Thursday Jul 21, 2022 at 17:41 GMT
Originally opened as https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/3073


Guidelines

In cluster mode (Neo4j Causal Cluster), APOC triggers can only be added or removed over the bolt scheme, connected to the IP of the server that is currently the leader for the system database. I think they should work like indexes and constraints: we can apply those using the neo4j scheme with SSR (Server Side Routing) enabled, without having to know which server is the leader at the moment.

Expected Behavior (Mandatory)

Add and remove triggers in a Neo4j causal cluster using the neo4j scheme against any server (any IP).

Actual Behavior (Mandatory)

Triggers can only be applied over the bolt scheme, connected to the IP of the server that holds the leader role for the system database. Using the neo4j scheme, I see an exception: No longer possible to write to server at 10.62.62.180:7687.

Using the bolt scheme against a server that is not the leader for the system database: Neo.ClientError.Cluster.NotALeader No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER

How to Reproduce the Problem

Connect to a Neo4j causal cluster using the neo4j scheme, then remove any trigger (even a non-existent one): CALL apoc.trigger.remove('testTrigger');

Simple Dataset (where it's possible)


Steps (Mandatory)

  1. CALL apoc.trigger.remove('testTrigger');

Screenshots (where it's possible)

Specifications (Mandatory)

Currently used versions

Versions

gwvandesteeg commented 1 year ago

Fun fact (tested on Neo4j 4.0.7)

Adding a trigger can only be done on the cluster member that is the LEADER of both the DB you are adding the trigger to AND the system database (it might need to be the leader of the neo4j DB as well; I wasn't sure, since we don't use it).
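
As a rough illustration of that condition, here is a minimal Python sketch that picks out the member leading every required database from rows shaped like the `databases` and `addresses` columns of `dbms.cluster.overview()`. The helper name `servers_leading_all` and the dict layout are assumptions made for this example, not part of Neo4j or APOC, and the rule itself is only what was observed on 4.0.7.

```python
# Sketch only: each row mirrors the "addresses" and "databases" columns
# printed by dbms.cluster.overview() in this thread.
# `servers_leading_all` is a hypothetical helper, not a Neo4j/APOC API.

def servers_leading_all(overview, databases):
    """Return bolt addresses of members that are LEADER for every listed database."""
    matches = []
    for row in overview:
        roles = row["databases"]
        if all(roles.get(db) == "LEADER" for db in databases):
            # Keep only the bolt:// address, since that is what we connect to.
            matches.extend(a for a in row["addresses"] if a.startswith("bolt://"))
    return matches


HOST = "neo4j-core-{}.neo4j.default.svc.cluster.local"


def member(index, nextvoice, neo4j, system):
    """Build one overview row for core member `index` with the given roles."""
    host = HOST.format(index)
    return {
        "addresses": [f"bolt://{host}:7687", f"http://{host}:7474"],
        "databases": {"nextvoice": nextvoice, "neo4j": neo4j, "system": system},
    }


# Roles as reported before the leadership change (first overview in this comment).
overview_before = [
    member(2, "LEADER", "FOLLOWER", "FOLLOWER"),
    member(0, "FOLLOWER", "LEADER", "LEADER"),
    member(1, "FOLLOWER", "FOLLOWER", "FOLLOWER"),
]

# Roles as reported after the leadership change (second overview in this comment).
overview_after = [
    member(2, "FOLLOWER", "FOLLOWER", "FOLLOWER"),
    member(0, "LEADER", "LEADER", "LEADER"),
]

required = ["nextvoice", "system"]
print(servers_leading_all(overview_before, required))
print(servers_leading_all(overview_after, required))
```

With the first set of roles no member qualifies, which matches the FOLLOWER error; with the second set only neo4j-core-0 qualifies.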

The example below shows me trying to add a trigger while connected to neo4j-core-2 via the bolt connector:

neo4j@nextvoice> call dbms.cluster.overview();
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                     | addresses                                                                                                                | databases                                                      | groups |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "53f95bdf-0c86-4826-8244-4ad4f7963592" | ["bolt://neo4j-core-2.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-2.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "LEADER", neo4j: "FOLLOWER", system: "FOLLOWER"}   | []     |
| "6b74a7fa-626d-4994-af32-1432b9e8b0c4" | ["bolt://neo4j-core-0.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-0.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "LEADER", system: "LEADER"}     | []     |
| "775b45fe-3ae3-466d-9ad2-7b8e5ae82e0b" | ["bolt://neo4j-core-1.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-1.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "FOLLOWER", system: "FOLLOWER"} | []     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

3 rows available after 6 ms, consumed after another 1 ms
neo4j@nextvoice> CALL apoc.trigger.add(
                 "assertExtensionNumberValidNumericalString",
                 "WITH '^([0-9]{2,5})$' AS extNumStrRegex
                 MATCH (e:Extension)
                 CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
                 RETURN NULL",
                 { phase: 'before' }
                 );
No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER

After killing nodes a number of times and waiting for the cluster to settle into the desired state, I connected to neo4j-core-0 via the bolt connector and tried again:

neo4j@nextvoice> call dbms.cluster.overview();
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                     | addresses                                                                                                                | databases                                                      | groups |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "53f95bdf-0c86-4826-8244-4ad4f7963592" | ["bolt://neo4j-core-2.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-2.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "FOLLOWER", neo4j: "FOLLOWER", system: "FOLLOWER"} | []     |
| "6b74a7fa-626d-4994-af32-1432b9e8b0c4" | ["bolt://neo4j-core-0.neo4j.default.svc.cluster.local:7687", "http://neo4j-core-0.neo4j.default.svc.cluster.local:7474"] | {nextvoice: "LEADER", neo4j: "LEADER", system: "LEADER"}       | []     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

2 rows available after 0 ms, consumed after another 1 ms
neo4j@nextvoice> CALL apoc.trigger.add(
                 "assertExtensionNumberValidNumericalString",
                 "WITH '^([0-9]{2,5})$' AS extNumStrRegex
                 MATCH (e:Extension)
                 CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
                 RETURN NULL",
                 { phase: 'before' }
                 );
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name                                        | query                                                                                                                                                                              | selector          | params | installed | paused |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "assertExtensionNumberValidNumericalString" | "WITH '^([0-9]{2,5})$' AS extNumStrRegex
MATCH (e:Extension)
CALL apoc.util.validate((NOT e.number =~ extNumStrRegex), '%s not a valid extension number', [e.number])
RETURN NULL" | {phase: "before"} | {}     | TRUE      | FALSE  |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row available after 10 ms, consumed after another 30 ms

gem-neo4j commented 1 year ago

APOC triggers have been updated to be more supportive of cluster environments. Please refer to our documentation for more info :)