kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License
1.37k stars 96 forks source link

Error: Binder exception: Cannot bind <var> as node pattern #2604

Open sapalli2989 opened 10 months ago

sapalli2989 commented 10 months ago

This issue continues https://github.com/kuzudb/kuzu/issues/2589.

CREATE NODE TABLE T (id STRING,PRIMARY KEY(id));
CREATE REL TABLE E (FROM T TO T);
CREATE (t1:T{id:"t1"})-[:E]->(t2:T{id:"t2"});

// user input from prepared statement
WITH "node_after" as selected 
OPTIONAL MATCH (t1:T{id: "t1"})-[:E]->(t2:T{id: "t2"}) 
WITH CASE WHEN selected = "node_after" THEN t2 ELSE t1 END as ret
OPTIONAL MATCH(ret) 
RETURN ret;

Error:

Error: Binder exception: Cannot bind ret as node pattern.

Expected: t2 for node_after, t1 for node_before Actual: See error Works with Neo4j: yes Kuzu version: Nightly

Description: Apparently ret cannot be used as pattern in MATCH/OPTIONAL MATCH if previously bound in CASE construct. It doesn't make any difference, if MATCH(ret) or OPTIONAL MATCH(ret) is used. I tried to make above example as minimal as possible, of course this is only contrived. Real case uses above CASE to process user input and then is combined with logic from https://github.com/kuzudb/kuzu/issues/2303#issuecomment-1833191732 (ret called a there).

andyfengHKU commented 10 months ago

Hi @sapalli2989, this is interesting bug (or un-support feature). It's a bit tricky to fix it quickly because ret is treated as an expression that is constructed at runtime rather than a query node that is known in compile time. I'll need to think a bit to figure out a long term solution.

I wonder if the following workaround works for u

OPTIONAL MATCH (a) WHERE a.id = ret.id RETURN a;
sapalli2989 commented 10 months ago

Thanks for the explanation. Hm when executing

WITH "node_after" as selected 
OPTIONAL MATCH (t1:T{id: "t1"})-[:E]->(t2:T{id: "t2"}) 
WITH CASE WHEN selected = "node_after" THEN t2 ELSE t1 END as ret
OPTIONAL MATCH (a) WHERE a.id = ret.id RETURN a;

, I get following error:

Error: Runtime exception: Join hash table cannot hash data type STRUCT


To give more background: https://github.com/kuzudb/kuzu/issues/2303#issuecomment-1833191732 offers a way to move some node s :AFTER some other node a. DB relation is named :AFTER. But program API also provides a method to move s before a. And that basically is moving s :AFTER node a+1, where (a)-[:AFTER]->(a+1); (a+1) might not exist, but this is unknown to client (OPTIONAL MATCH). Hence we can reuse the whole moving logic. To do that I'd like to simply check user input for the movement mode (either before or after) and then conditionally choose according target location for :AFTER via CASE.

andyfengHKU commented 9 months ago

Hi @sapalli2989, sorry for the late reply (somehow missed the message).

I wonder if the following change would help

WITH "node_after" as selected
OPTIONAL MATCH (t1:T{id: "t1"})-[:E]->(t2:T{id: "t2"})
WITH CASE WHEN selected = "node_after" THEN t2.id ELSE t1.id END as ret
OPTIONAL MATCH (a) WHERE a.id = ret RETURN a;

The change is mainly about projecting id directly as ret as used in subsequent query parts.

We have assigned related issue #2637 and should be able to fix it over the week. Meanwhile I would suggest use the alternative above as a workaround.

Feel free to ping us in slack channel if you didn't get a reply on GitHub in time.

Hope you enjoyed the holidays!

sapalli2989 commented 9 months ago

Hey @andyfengHKU, Thanks, hope you had nice holidays as well.

Regarding proposed workaround: Great! This indeed seems to work and definitely helps. If I have understood that issue correctly, using of node variables instead of projecting the id eagerly will be possible after hashing and comparisons for (nested) structs have been implemented, is that right? In other words, query from my above example will work again after https://github.com/kuzudb/kuzu/issues/2637 has been fixed:

...
WITH CASE WHEN selected = "node_after" THEN t2 ELSE t1 END as ret
OPTIONAL MATCH (a) WHERE a.id = ret.id RETURN a;

I also would be very curious, if you have found time to think about the original issue in the meanwhile? Having this feature implemented as opposed to marking it as unsupported would correspond much more to the declarative nature of openCypher (e.g. order of CASE and MATCH does not matter in query), so users don't need to bother about tricky constellations and edge cases. My personal criterion for new features is, if Neo4j as kind of reference database can execute the query, which it indeed is capable of.

ret is treated as an expression that is constructed at runtime rather than a query node that is known in compile time.

I am not able to make an informed decision about underlying implementation, but about how much performance impact are we speaking of, when "downgrading" affected expressions to be resolved at runtime instead if this constellation is given?

If this is a harder issue to implement, what about a quick fix for now to automatically and transparently expanding the user query with your suggested workaround (introduce one level of indirection with id comparison) instead of triggering an error? Goal would be that end users don't need to care.

Also a bit more user-friendly message would tremendously help :-). For example something like "Expressions from CASE cannot be reused in subsequent MATCH patterns" as short term solution.

sapalli2989 commented 4 months ago

Got this error with a similar query again. @andyfengHKU I think this can happen in general with WITH, not just with CASE right? Here is the example:

// Schema
CREATE NODE TABLE V (id INT64, name STRING, PRIMARY KEY(id));
CREATE (:V {id: 1, name: "v1"}),(:V {id: 2, name: "v2"}),(:V {id: 3, name: "v3"});
// Data
MATCH (v:V) where v.id < 3 // just some random condition to partition results as union parts
WITH collect(v) as lv1 
MATCH (v:V) where v.id >= 3
WITH collect(v) as lv2, lv1 
UNWIND lv1 + lv2 as v_union 
MATCH (workaround:V)  WHERE workaround.id = v_union.id // current workaround
RETURN workaround.name; // this would be some kind of union post-processing.

Ideally this should work without MATCH (workaround:V) WHERE workaround.id = v_union.id.

Background for above query is to substitute UNION by UNWIND, collect and WITH to do some kind of union-postprocessing. UNION equivalent would be:

// union-part processing here is just projecting to .name
// and needs to be done for both branches.
// (substitute by any arbitrarily complex mapping)
MATCH (v:V) where v.id < 3 RETURN v.name 
UNION
MATCH (v:V) WHERE v.id >= 3 RETURN v.name;

This currently is not possible in Kuzu, as UNION needs to be top-level statement and there are no sub-queries. I think Neoj4 v4+ uses CALL { ... } to enable the sort of processing.

andyfengHKU commented 4 months ago

Got this error with a similar query again. @andyfengHKU I think this can happen in general with WITH, not just with CASE right? Here is the example:

Yeah this is a general problem that cannot bind an expression as node pattern after projection. Looking at our roadmap I don't think I'll have time to address this until our v0.5.0 release (early July). I'll try to get on this (or get someone on this) shortly after the release.

as UNION needs to be top-level statement and there are no sub-queries. I think Neoj4 v4+ uses CALL { ... } to enable the sort of processing

UNION being the top-level statement was the openCypher standard and I think Neo4j has deviated from openCypher. I'll check the new GQL standard to see what's the best way to solve this.

sapalli2989 commented 2 months ago

Here is another example, that tries to conditionally CREATE something via CASE:

CREATE NODE TABLE V (id INT, PRIMARY KEY(id));
CREATE REL TABLE E (FROM V TO V);
CREATE (v1:V {id: 1}), (v2:V {id: 2}), (v3:V {id: 3});

MATCH (v1:V {id: 1}), (v2:V {id: 2}), (v3:V {id: 3}) 
WITH CASE WHEN true = true THEN v1 ELSE v2 END as t // placeholder for any conditional logic
CREATE (t)-[:E]->(v3);

Error message for last statement:

Error: Binder exception: Cannot bind t as node pattern.