kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License
1.38k stars 97 forks

SHORTEST path restriction #2855

Open parsecgames opened 8 months ago

parsecgames commented 8 months ago

Using shortest on a multi rel query such as

MATCH (f:Tab)<-[:Rel* SHORTEST 0..]-(t:Tab)-[:Err*1]->(i:Tab) 
WHERE EXISTS {MATCH (t)<-[:Rel* SHORTEST 0..]-(ti:Tab) WHERE ti.name = $tname } RETURN *

raises a runtime error: RuntimeError: Binder exception: Lower bound of shortest/all_shortest path must be 1.

For my use case, shortest makes perfect sense.

From the documentation, under "Further notes on shortest path": "We force the lower bound of shortest path to be 1 to avoid ambiguity."

The request is to remove that limit.

semihsalihoglu-uw commented 8 months ago

I don't fully understand the question, but the error is just saying to change the 0 in [:Rel* SHORTEST 0..] to 1. A path of length 0 is not a valid path, so there needs to be at least 1 edge in the path.

parsecgames commented 8 months ago

In this case, I want f or t, whichever is the shortest and only that. How else can that be achieved?

Also I want $tname to match t or ti whichever is closest.

semihsalihoglu-uw commented 8 months ago

I would try a CASE expression that compares the lengths of the two variable-length paths. I probably would not put the variable-length path to ti in WHERE EXISTS. Instead, maybe put it directly into MATCH. Something like this:

MATCH (f:Tab)<-[e1:Rel* SHORTEST 1..]-(t:Tab)-[:Err*1]->(i:Tab), (t)<-[e2:Rel* SHORTEST 1..]-(ti:Tab) 
WHERE ti.name = $tname
RETURN 
  CASE 
     WHEN length(e1) < length(e2) THEN t
     ELSE ti
  END AS s;

You can project more than t and ti.
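
As a rough illustration of projecting more, the query above could also return both path lengths next to the chosen node. The aliases len_f, len_ti, and closest are made up for this sketch, and it assumes length() can be projected for the recursive relationship variables e1 and e2 the same way it is used inside the CASE above:

```cypher
MATCH (f:Tab)<-[e1:Rel* SHORTEST 1..]-(t:Tab)-[:Err*1]->(i:Tab),
      (t)<-[e2:Rel* SHORTEST 1..]-(ti:Tab)
WHERE ti.name = $tname
RETURN
  length(e1) AS len_f,    // number of Rel edges on the shortest path between t and f
  length(e2) AS len_ti,   // number of Rel edges on the shortest path between ti and t
  CASE
    WHEN length(e1) < length(e2) THEN t
    ELSE ti
  END AS closest;         // hypothetical alias; same choice as "s" above
```

Seeing the two lengths side by side makes it easier to check whether the CASE picks the node you expect.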

parsecgames commented 8 months ago

thanks! I'll try that

parsecgames commented 8 months ago

This doesn't do what the og query does. The only alternative I can think of is to recompile kuzu and comment out the check.

semihsalihoglu-uw commented 8 months ago

Can you say a bit more? What do you mean by the "og query"? What is the query trying to compute? I wouldn't edit Kuzu to run a query. It will likely not work or compute the right thing. It's better for us to understand exactly what you're trying to do and what the right query in Cypher would be.

semihsalihoglu-uw commented 8 months ago

Also, it would be much more efficient to iterate on this if you came to Discord: https://discord.gg/VtX2gw9Rug.

parsecgames commented 8 months ago

Consider this graph as a simplified version of what this query is supposed to do: [image: a drawio]

Here I want to be able to find the shortest path around Rel2, which may or may not have Rel1 on either side.

For start node node1 and end node node6, this should return node1 as the start base node and node4 as the end base node. For start node node2 and end node node6, this should return node5 as the start base node and node4 as the end base node.

The example query that I could come up with is something like this:

MATCH (sn:Node)<-[:Rel1* SHORTEST 0..]-(sbn:Node)-[:Rel2*1]->(ebn:Node)-[:Rel1* SHORTEST 0..]->(en:Node)
WHERE sn.id = $start  AND en.id = $end
RETURN sbn.id, ebn.id

semihsalihoglu-uw commented 8 months ago

I can think of two options. But before that: I think you need to be careful about whether you want the edge direction in the (sn:Node)<-[:Rel1* SHORTEST 0..]-(sbn:Node) pattern to go from sn to sbn or vice versa. For the 2 examples you gave, you seem to want different things. In the start=1, end=6 example, you seem to want the edge to go from sbn to sn. In the start=2, end=6 example, you want it to go from sn to sbn. I'll give my answer based on wanting these to go from sn to sbn.

Option 1: 4 separate queries. What you are asking for is a type of regular path query, where you seem to have a regex in mind on the labels of the edges in the paths you want to match. Regular path queries are not supported in Cypher, so you can expand the query into the 4 possible ways it can match (2 possibilities for whether sbn has an optional [Rel1] pattern to its left × 2 possibilities for whether ebn has an optional [Rel1] pattern to its right):

MATCH (sn:Nodes)-[:Rel2*1]->(en:Nodes)
WHERE sn.id = $start AND en.id = $end
RETURN sn.id, en.id

MATCH (sn:Nodes)-[:Rel1* SHORTEST 1..]->(sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)
WHERE sn.id = $start  AND ebn.id = $end
RETURN sbn.id, ebn.id

MATCH (sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)-[:Rel1* SHORTEST 1..]->(en:Nodes)
WHERE sbn.id = $start  AND en.id = $end
RETURN sbn.id, ebn.id

MATCH (sn:Nodes)-[:Rel1* SHORTEST 1..]->(sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)-[:Rel1* SHORTEST 1..]->(en:Nodes)
WHERE sn.id = $start AND en.id = $end
RETURN sbn.id, ebn.id

Again this won't return exactly the answers in your examples. Specifically, this won't return 5, 4 when start=2, and end=6 but I think it's because your answers are not accurate. If you change the direction of the sbn to sn path, then it will return 5, 4.
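
If running four statements is inconvenient, one possible way to submit them together is to chain them with UNION. This is an untested sketch: it assumes Kuzu's UNION clause accepts these branches as written, and the aliases start_base and end_base are invented here so that every branch returns the same column names:

```cypher
// Case 1: no Rel1 path on either side of Rel2.
MATCH (sn:Nodes)-[:Rel2*1]->(en:Nodes)
WHERE sn.id = $start AND en.id = $end
RETURN sn.id AS start_base, en.id AS end_base
UNION
// Case 2: Rel1 path on the left side only.
MATCH (sn:Nodes)-[:Rel1* SHORTEST 1..]->(sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)
WHERE sn.id = $start AND ebn.id = $end
RETURN sbn.id AS start_base, ebn.id AS end_base
UNION
// Case 3: Rel1 path on the right side only.
MATCH (sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)-[:Rel1* SHORTEST 1..]->(en:Nodes)
WHERE sbn.id = $start AND en.id = $end
RETURN sbn.id AS start_base, ebn.id AS end_base
UNION
// Case 4: Rel1 paths on both sides.
MATCH (sn:Nodes)-[:Rel1* SHORTEST 1..]->(sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)-[:Rel1* SHORTEST 1..]->(en:Nodes)
WHERE sn.id = $start AND en.id = $end
RETURN sbn.id AS start_base, ebn.id AS end_base;
```

UNION removes duplicate rows, so a pair matched by more than one case appears once. Any pair matched by any case is returned, so this does not pick only the overall shortest match.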

Option 2: Single query. Alternatively, you can use WHERE EXISTS to merge those 4 queries into a single query:

MATCH (sbn:Nodes)-[:Rel2*1]->(ebn:Nodes)
WHERE (sbn.id = $start OR EXISTS { MATCH (sn:Nodes {id: $start})-[:Rel1* 1..3]->(sbn) })
  AND (ebn.id = $end OR EXISTS { MATCH (ebn:Nodes)-[:Rel1* SHORTEST 1..3]->(en:Nodes {id: $end}) })
RETURN sbn.id, ebn.id

Hope this helps. Here are the create statements I used to test these queries:

create node table Nodes(id INT64, primary key (id));
create rel table Rel1(from Nodes to Nodes);
create rel table Rel2(from Nodes to Nodes);
create (n1:Nodes {id: 1})-[r1:Rel1]->(n2:Nodes {id: 2})-[r2:Rel1]->(n5:Nodes {id: 5});
match (n1:Nodes {id: 1}) create (n1)-[r2:Rel2]->(n4:Nodes {id: 4});
match (n5:Nodes {id: 5}), (n4:Nodes {id: 4}) create (n5)-[r2:Rel2]->(n4);
match (n4:Nodes {id: 4}) create (n3:Nodes {id: 3})-[r1:Rel1]->(n4);
match (n4:Nodes {id: 4}) create (n4)-[r1:Rel1]->(n6:Nodes {id: 6});

parsecgames commented 8 months ago

Thanks for pointing out the mistake. I made the diagram in a hurry so direction is reversed for the last Rel1.

Also huge thanks for the time and effort spent on the suggestions! Will try them and get back.