Closed sapalli2989 closed 7 months ago
Hi @sapalli2989,
isn't the purpose of WITH to separate query parts, so there isn't any read operation directly following after update/delete?
My understanding of WITH is that it pipes the result of the previous query part to the next query part. In your case, MATCH (t2:T {id: 2}) DELETE t2 will always be executed, regardless of whether MATCH (t:T {id: 1}) DELETE t deletes a t or not.
So I think the query is equivalent to the two statements MATCH (t:T {id: 1}) DELETE t; and MATCH (t2:T {id: 2}) DELETE t2;
On the other hand, if you have a case like MATCH (t:T) DELETE t WITH t.id as ID MATCH (t2) WHERE t2.ID > ID DELETE t2, then the two query parts are indeed dependent and you cannot write them as two statements.
A corresponding query structure with multiple WITH and MERGE (instead of DELETE) can be found here (page bottom) for reference.
MERGE(y:age{name:"LessThanFifty"});//Create an Age group node
MERGE(o:age{name:"MoreThanFifty"});//Create an Age group node
MATCH(m:Movie{title:'A Few Good Men'}) //Match the few Good Men Movie
WITH m
MATCH (p1:Person)-[:ACTED_IN]->(m) WHERE p1.born <= 1972
MERGE(p1)-[:OF_AGE]->(o)
WITH m
MATCH (p2:Person)-[:ACTED_IN]->(m) WHERE p2.born > 1972
MERGE(p2)-[:OF_AGE]->(y)
In this example, the last two query parts are actually dependent: if MATCH (p1:Person)-[:ACTED_IN]->(m) WHERE p1.born <= 1972 finds no tuple, MATCH (p2:Person)-[:ACTED_IN]->(m) WHERE p2.born > 1972 will get zero input rows and thus will not be executed.
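To make the row-piping behaviour concrete, here is a minimal Python sketch of it. It is a toy model under stated assumptions, not how Neo4j or Kùzu execute queries: each query part is a function from a list of rows to a list of rows, the actor names and birth years in `ACTED_IN` are made up, and the helper names (`match_actors`, `merge_of_age`, `with_m`, `run`) are hypothetical.

```python
# Toy model of query parts chained with WITH (not a real query engine).
# Each part consumes the rows produced by the previous part. A plain MATCH
# emits nothing for an input row without a match, so later parts get no input.

# Made-up sample data: movie title -> list of (actor name, birth year).
ACTED_IN = {"A Few Good Men": [("older_actor", 1960), ("younger_actor", 1980)]}


def match_actors(rows, predicate):
    """MATCH (p:Person)-[:ACTED_IN]->(m) WHERE predicate(p.born)."""
    out = []
    for row in rows:
        for name, born in ACTED_IN.get(row["m"], []):
            if predicate(born):
                out.append({**row, "p": name})
    return out


def merge_of_age(rows, group, created):
    """MERGE (p)-[:OF_AGE]->(group): executed once per incoming row."""
    for row in rows:
        created.add((row["p"], group))
    return rows


def with_m(rows):
    """WITH m: project every row down to the variable m."""
    return [{"m": row["m"]} for row in rows]


def run(cutoff):
    """Run both query parts and return the set of created OF_AGE edges."""
    created = set()
    rows = [{"m": "A Few Good Men"}]
    rows = match_actors(rows, lambda born: born <= cutoff)
    rows = merge_of_age(rows, "MoreThanFifty", created)
    rows = with_m(rows)
    rows = match_actors(rows, lambda born: born > cutoff)
    merge_of_age(rows, "LessThanFifty", created)
    return created
```

With `run(1972)` both parts produce an edge. With `run(1950)` the first MATCH finds nobody, so the second part receives zero rows and creates nothing, even though both actors satisfy its WHERE condition.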
Would be curious to hear if this is going to be supported.
The answer is yes. We didn't support read after update for simplicity, because the SQL standard also doesn't allow read after update in one statement. So we thought this is not an important use case. Since now there is user demand, we certainly can implement this. Unlike a bug fix, though, this is a relatively big feature, and it will take us a few weeks to implement and test.
Let me know if you have further questions.
Hi @andyfengHKU,
On the other hand, if you have some cases like
MATCH (t:T) DELETE t WITH t.id as ID MATCH (t2) WHERE t2.ID > ID DELETE t2
then the two query parts are indeed dependent and you cannot write them as two statements.
Yes, exactly. Your example and the one from the Neo4j docs are better than mine, as they express the dependencies I have between query parts. The dependent parts are why I cannot easily split the query up into multiple statements.
Let me give you an example, which tries to reflect my case a bit more:
Imagine a strict hierarchical relation named :AFTER for nodes T as part of a bigger graph. I think it's actually helpful to have the model of a linked list from programming in mind here (but this list does not need to be strictly continuous). Now we want to select a specific node and move it AFTER some target node. In addition, the affected parts need to be reconnected, so no "holes" are created.
Minimal example: Extract t2 and move it after t6, so there is t2-[:AFTER]->t6.
We need to drop e1, e2, e4 and "reconnect" t1 with t3, t5 with t2, and lastly t2 with t6, which is our actual goal.
This is quite an error-prone operation without graph features. But Cypher seems to be able to solve this operation very elegantly. I used the following query in Neo4j successfully - it doesn't work with Kuzu yet:
WITH "t2" as selected_id, "t6" as move_after_id
MATCH (s:T {id: selected_id})
OPTIONAL MATCH (a:T {id: move_after_id})
OPTIONAL MATCH (t1:T)-[e1:AFTER]->(s)
OPTIONAL MATCH (s)-[e2:AFTER]->(t3:T)
OPTIONAL MATCH (t5:T)-[e4:AFTER]->(a)
DELETE e1,e2,e4
CREATE (s)-[:AFTER]->(a)
WITH t1,t3,t5,s MATCH(t1) MATCH(t3) CREATE (t1)-[:AFTER]->(t3)
WITH t5,s MATCH(t5) CREATE (t5)-[:AFTER]->(s);
(still in my early learning phase with Cypher, there might be a better solution)
OPTIONAL MATCH ... DELETE wasn't possible due to #2293. Tried to work around it via #2300 and ultimately landed here 🙂
WITH...MATCH is used for CREATE operations, as described in the linked Neo4j article (not using any internal flag alternatives).
OPTIONAL MATCH is needed, because some of the relations might not exist. For example, t2 could be the very start of a linked list, so there is no relation (t1)-[:AFTER]->(t2). But we want to continue here, just ignoring non-existent relations as a no-op.
a might be null from the query input perspective, hence OPTIONAL MATCH is used here as well.
If I feed the query with the above sample data:
CREATE (t1:T {id: "t1"})-[e1:AFTER {id: "e1"}]->(t2:T {id: "t2"})-[e2:AFTER {id: "e2"}]->(t3:T {id: "t3"}), (t4:T {id: "t4"})-[e3:AFTER {id: "e3"}]->(t5:T {id: "t5"})-[e4:AFTER {id: "e4"}]->(t6:T {id: "t6"})-[e5:AFTER {id: "e5"}]->(t7:T {id: "t7"})
the result is as expected (not tested thoroughly, though):
The answer is yes.
Not 42? 😂 Joking aside, happy to hear this is on the roadmap! From my own learning experience, I mostly tested queries with Neo4j web explorer first and then tried to execute it with Kuzu CLI. Assuming many folks do this, it probably makes sense for Neo4j -> Kuzu to be "copy and paste" query-wise.
So we thought this is not an important use case. Since now there is user demand, we certainly can implement this.
Hope to have given a suitable example case that actually makes sense. Thanks much for your time!
@sapalli2989 The example you gave makes a lot of sense to me. I'll assign this issue to one of my colleagues to make sure Kùzu can run the example properly. I don't have an ETA for the fix now, but we will certainly make it available in our November release.
From my own learning experience, I mostly tested queries with Neo4j web explorer first and then tried to execute it with Kuzu CLI.
FYI, we are releasing a web explorer this week as well. We will announce it on Slack when it is released.
it probably makes sense for Neo4j -> Kuzu to be "copy and paste" query-wise
Yes copy-paste is our ultimate goal. Currently there are still a few features that are missing but we should cover the majority by the end of this year. Though Kùzu will always ask for DDL because we believe this is the best practice for query performance and data integrity.
Wonderful. Thank you, sir!
Hi @andyfengHKU. Is there already any development branch or patch, where I could help with testing?
Hi @sapalli2989, we actually need a bit more time to handle read after update due to some edge cases. Though for your use case, I have added a test case that does not require read after update.
-CASE 2303
-STATEMENT CREATE NODE TABLE T (id STRING, PRIMARY KEY(id));
---- ok
-STATEMENT CREATE REL TABLE AFTER (FROM T TO T, id STRING);
---- ok
-STATEMENT CREATE (t1:T {id: "t1"})-[e1:AFTER {id: "e1"}]->(t2:T {id: "t2"})-[e2:AFTER {id: "e2"}]->(t3:T {id: "t3"}),
(t4:T {id: "t4"})-[e3:AFTER {id: "e3"}]->(t5:T {id: "t5"})-[e4:AFTER {id: "e4"}]->(t6:T {id: "t6"})-[e5:AFTER {id: "e5"}]->(t7:T {id: "t7"})
---- ok
-STATEMENT MATCH p = (a:T {id:"t4"})-[e*]->(b:T {id:"t7"}) RETURN COUNT(*);
---- 1
1
-STATEMENT MATCH (s:T {id: "t2"}), (a:T {id: "t6"}),
(t1:T)-[e1:AFTER]->(s),
(s)-[e2:AFTER]->(t3:T),
(t5:T)-[e4:AFTER]->(a)
DELETE e1,e2,e4
CREATE (s)-[e7:AFTER]->(a)
CREATE (t1)-[e8:AFTER]->(t3)
CREATE (t5)-[e9:AFTER]->(s);
---- ok
-STATEMENT MATCH (a)-[e]->(b) RETURN COUNT(*);
---- 1
5
-STATEMENT MATCH p = (a:T {id:"t4"})-[e*]->(b:T {id:"t7"}) RETURN COUNT(*);
---- 1
1
Seems that you just need multiple updates rather than read after update. Let me know if the above solution works.
I'll also give a bit more details regarding read after update. We are solving an edge case where an update creates a dependency between query parts.
E.g. MATCH (a:Person) SET a.age = 21 WITH * MATCH (b:Person) RETURN b.age
Normally we assume MATCH (a:Person) and MATCH (b:Person) can be executed in arbitrary order. But with the update, MATCH (a:Person) needs to be executed first so that MATCH (b:Person) can read the updated value.
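The ordering problem can be illustrated with a small Python sketch. This is only a toy model under simplifying assumptions: it ignores that the real query returns one row per (a, b) pair, and the function names (`set_age`, `read_ages`, `update_then_read`, `read_then_update`) are hypothetical, not part of Kùzu.

```python
# Toy model: a read that follows an update cannot be reordered freely.
# Persons are plain dicts; the two "query parts" are ordinary functions.


def set_age(people, value):
    """MATCH (a:Person) SET a.age = value."""
    for person in people:
        person["age"] = value


def read_ages(people):
    """MATCH (b:Person) RETURN b.age."""
    return [person["age"] for person in people]


def update_then_read(people):
    """The order the query text asks for: update first, then read."""
    set_age(people, 21)
    return read_ages(people)


def read_then_update(people):
    """What an engine would observe if it ran the second scan first."""
    ages = read_ages(people)
    set_age(people, 21)
    return ages
```

Starting from two persons aged 30 and 40, `update_then_read` returns the updated ages, while `read_then_update` returns the stale ones. That difference is why the two scans are no longer interchangeable once an update sits between them.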
we actually need a bit more time to handle read after update due to some edge cases.
No problem, take your time.
Though for your use case, I have added a test case that does not require read after update.
Let me know if the above solution works.
OK, let's modify the example slightly.
Given the original start data, I'd now like to select t1 (instead of t2) and move it :AFTER t6 like before. The result should look like:
Your query wouldn't do anything here, because the whole query is aborted by MATCH if one condition isn't fulfilled.
Here the condition (t1:T)-[e1:AFTER]->(s) isn't fulfilled, as t1 is right at the start and there isn't any node before it.
Assuming the actual location of the selected node is unknown, I'd like to keep the query as generic as possible, so that we don't need to care about its location! For now let's revert to OPTIONAL MATCH:
MATCH (s:T {id: "t1"}), (a:T {id: "t6"})
OPTIONAL MATCH (t0:T)-[e0:AFTER]->(s)
OPTIONAL MATCH (s)-[e1:AFTER]->(t2:T)
OPTIONAL MATCH (t5:T)-[e4:AFTER]->(a)
DELETE e0,e1,e4
CREATE (s)-[:AFTER]->(a)
WITH t0,t2,t5,s MATCH(t0) MATCH(t2) CREATE (t0)-[:AFTER]->(t2)
WITH t5,s MATCH(t5) CREATE (t5)-[:AFTER]->(s);
Neo4j is almost there:
But the relation t5-[:AFTER]->t1 is still missing. By looking at:
WITH t0,t2,t5,s MATCH(t0) MATCH(t2) CREATE (t0)-[:AFTER]->(t2) // (1)
WITH t5,s MATCH(t5) CREATE (t5)-[:AFTER]->(s); // (2)
MATCH again seems to abort the query eagerly in (1), as t0 does not exist. So WITH ... in (2) is not executed anymore.
I guess the ideal query can use OPTIONAL MATCH in combination with CREATE, as follows:
MATCH (s:T {id: "t1"}), (a:T {id: "t6"})
OPTIONAL MATCH (t0:T)-[e0:AFTER]->(s)
OPTIONAL MATCH (s)-[e1:AFTER]->(t2:T)
OPTIONAL MATCH (t5:T)-[e4:AFTER]->(a)
DELETE e0,e1,e4
CREATE (s)-[:AFTER]->(a), (t0)-[:AFTER]->(t2), (t5)-[:AFTER]->(s);
OPTIONAL MATCH, CREATE and WITH
(Below is my previous idea, but I find it less than ideal)
Neo4j currently doesn't support OPTIONAL MATCH ... CREATE, as reported in the above Neo4j article, and errors with:
Failed to create relationship UNNAMED1, node t0 is missing. If you prefer to simply ignore rows where a relationship node is missing, set 'dbms.cypher.lenient_create_relationship = true' in neo4j.conf
For me, being able to query exactly these complex graph cases (imagine model of linked list again) would be one of the main advantages compared to SQL. Is there a chance to get this supported in Kuzu?
Sorry, if this feature request should go in yet another direction. Sometimes ideas need time to grow.
I'll also give a bit more details regarding read after update.
Thanks! I stay tuned :-)
I see your point. Though I think you can achieve what you want with just OPTIONAL MATCH + UPDATE. I suggest the following:
MATCH (s:T {id: <input>}), (a:T {id: <target>})
OPTIONAL MATCH (t1:T)-[e1:AFTER]->(s)
OPTIONAL MATCH (s)-[e2:AFTER]->(t3:T)
OPTIONAL MATCH (t5:T)-[e4:AFTER]->(a)
DELETE e1,e2,e4
CREATE (s)-[e7:AFTER]->(a)
CREATE (t1)-[e8:AFTER]->(t3)
CREATE (t5)-[e9:AFTER]->(s);
OPTIONAL MATCH guarantees that the query does not abort during matching. If a certain pattern is not matched, e.g. (t1:T)-[e1:AFTER]->(s), then t1 and e1 will be NULL, and DELETE e1 & CREATE (t1)-[e8:AFTER]->(t3) will automatically be ignored in Kùzu.
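As a rough illustration of this NULL-skipping behaviour, the move can be sketched in Python over a plain set of (source, target) edges. This is a toy model, not Kùzu's implementation: `None` stands in for NULL, `move_after` and `SAMPLE_EDGES` are hypothetical names, and the sketch assumes each node has at most one incoming and one outgoing :AFTER edge.

```python
# Toy model of the OPTIONAL MATCH + DELETE + CREATE query on a set of edges.
# None plays the role of NULL: deletes and creates touching None are skipped.

SAMPLE_EDGES = {("t1", "t2"), ("t2", "t3"), ("t4", "t5"), ("t5", "t6"), ("t6", "t7")}


def move_after(edges, selected, target):
    """Return a new edge set in which (selected)-[:AFTER]->(target) holds."""
    # OPTIONAL MATCH: each lookup yields None when the pattern has no match.
    pred = next((src for src, dst in edges if dst == selected), None)        # t1 in the query
    succ = next((dst for src, dst in edges if src == selected), None)        # t3 in the query
    target_pred = next((src for src, dst in edges if dst == target), None)   # t5 in the query

    # DELETE e1, e2, e4: a pair containing None is never in the set, so
    # "deleting" it is a no-op, mirroring DELETE on a NULL relationship.
    removed = {(pred, selected), (selected, succ), (target_pred, target)}
    result = {edge for edge in edges if edge not in removed}

    # CREATE: skip every relationship that has a None endpoint.
    for src, dst in [(selected, target), (pred, succ), (target_pred, selected)]:
        if src is not None and dst is not None:
            result.add((src, dst))
    return result
```

Calling `move_after(SAMPLE_EDGES, "t2", "t6")` reconnects t1 with t3 and t5 with t2, while `move_after(SAMPLE_EDGES, "t1", "t6")` simply skips the missing predecessor. The sketch does not special-case a selected node that already sits directly before the target.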
I'll attach the full statements I ran in KuzuExplorer.
-STATEMENT CREATE NODE TABLE T (id STRING, PRIMARY KEY(id));
-STATEMENT CREATE REL TABLE AFTER (FROM T TO T, id STRING);
-STATEMENT CREATE (t1:T {id: "t1"})-[e1:AFTER {id: "e1"}]->(t2:T {id: "t2"})-[e2:AFTER {id: "e2"}]->(t3:T {id: "t3"}),
(t4:T {id: "t4"})-[e3:AFTER {id: "e3"}]->(t5:T {id: "t5"})-[e4:AFTER {id: "e4"}]->(t6:T {id: "t6"})-[e5:AFTER {id: "e5"}]->(t7:T {id: "t7"})
-STATEMENT MATCH (s:T {id: "t1"}), (a:T {id: "t6"})
OPTIONAL MATCH (t1:T)-[e1:AFTER]->(s)
OPTIONAL MATCH (s)-[e2:AFTER]->(t3:T)
OPTIONAL MATCH (t5:T)-[e4:AFTER]->(a)
DELETE e1,e2,e4
CREATE (s)-[e7:AFTER]->(a)
CREATE (t1)-[e8:AFTER]->(t3)
CREATE (t5)-[e9:AFTER]->(s);
Before updating
After updating
Ah great. My bad, I had assumed the above query wouldn't run in Kuzu yet. But you're right, I used the latest Kuzu DB + explorer and indeed this works - better than Neo4j!
Read after update should now be supported in #3126
(Follow-up of https://github.com/kuzudb/kuzu/issues/2300)
Contrived example:
This query works as is with Neo4j.
Not sure about the error message - isn't the purpose of WITH to separate query parts, so there isn't any read operation directly following after update/delete? Would be curious to hear if this is going to be supported.
To add some context: In my case there are multiple optional, somehow dependent nodes or relations, which are to be deleted, if they exist. I used multiple MATCH ... DELETE constructs for this purpose (also to omit error in #2300). These deletes are connected by WITH to keep logic in one query, as there are dependent parts - I don't want to use multiple statements here. A corresponding query structure with multiple WITH and MERGE (instead of DELETE) can be found here (page bottom) for reference.