mgaare / cumulusrdf

Automatically exported from code.google.com/p/cumulusrdf
0 stars 0 forks source link

Evaluate entity queries via a single scan #27

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, entity queries, e.g,. 

?x knows ?y .
?x name "x" .
?x age "18" .

are evaluated via joins along their subject (in the above example: ?x). That 
is, one would need to compute bindings for each triple patten, and join them 
using two equi-joins. 

However, this (probably) could be done much more efficiently with a single 
scan. That is, one would start with a scan of the pattern with the least 
matches (e.g., ?x age "18"):

x1 age "18" --> scan for x1 ?p ?o
x2 age "18" --> scan for x2 ?p ?o
x3 age "18" --> scan for x3 ?p ?o
...

Each such scan (x1 ?p ?o) would result in additional property/object pairs - 
these could be pushed to subsequent triple pattern accesses. For instance,  

"x1 ?p ?o" could find "x1 knows y1", "x1 knows y2", "x1 name "x"" ... The 
former two triples could be pushed to access ?x knows ?y, the latter triple 
("x1 name "x") to pattern access for ?x name "x".

The key advantage is really that scans (sorted accesses) are fairly cheap, in 
comparison to random access probes. Thus, when finding the first potential 
result entity (e.g, x1), we could just scan over (all) its associated triples 
...

- Andreas

Original issue reported on code.google.com by andreas.josef.wagner on 29 Jan 2014 at 9:33

GoogleCodeExporter commented 9 years ago
Note, actually this applies to any star-shaped query - not just entity queries.

- Andreas

Original comment by andreas.josef.wagner on 29 Jan 2014 at 9:39