kjetilk / p5-atteanx-query-cache

Experimental prefetching SPARQL query cacher, take 2

Cartesian joins sent to remote endpoints #1

Closed kjetilk closed 8 years ago

kjetilk commented 8 years ago

The 5 TP test in the test suite currently results in the following plan:

- Hash Join { s }
-   Hash Join { s }
-     SPARQLBGP
-       Quad { ?a, <b>, <c>, <http://test.invalid/graph> }
-       Quad { ?s, <q>, ?.blank-0, <http://test.invalid/graph> }
-     Hash Join { s }
-       Table (?s, ?o)
-         {o=<http://example.org/bar>, s=<http://example.org/foo>}
-         {o=<http://example.org/baz>, s=<http://example.com/foo>}
-         {o=<http://example.org/foobar>, s=<http://example.com/foo>}
-       SPARQLBGP
-         Quad { ?s, <q>, <a>, <http://test.invalid/graph> }
-   Table (?s)
-     {s=<http://example.org/foo>}
-     {s=<http://example.org/bar>}

The Cartesian join is inevitable, since that is how the query was written, but the problem here is that the heavy join is sent to the remote endpoint for processing: the first SPARQLBGP contains two quad patterns that share no variables. That is not very nice of us. Since the result will be larger than the results of the individual triple patterns, and the computational cost on the endpoint will be large, it makes sense to avoid this situation if we can.
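One way to spot this situation is to check whether the patterns in a BGP fall apart into groups that share no variables. The following is a minimal Python sketch of that check, not code from Attean or this module: patterns are plain tuples of strings, anything starting with `?` is treated as a variable (including the `?.blank-0` placeholder), and the helper names `variables`, `connected_components` and `is_cartesian` are made up for illustration.

```python
from collections import defaultdict


def variables(pattern):
    """Return the set of terms in a pattern that look like variables."""
    return {term for term in pattern if term.startswith("?")}


def connected_components(patterns):
    """Group patterns so that patterns in one group are linked by shared variables."""
    parent = list(range(len(patterns)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}
    for i, pattern in enumerate(patterns):
        for var in variables(pattern):
            if var in first_seen:
                # This pattern shares a variable with an earlier one: merge groups.
                parent[find(i)] = find(first_seen[var])
            else:
                first_seen[var] = i

    groups = defaultdict(list)
    for i, pattern in enumerate(patterns):
        groups[find(i)].append(pattern)
    return list(groups.values())


def is_cartesian(patterns):
    """A BGP is a Cartesian product if it has more than one component."""
    return len(connected_components(patterns)) > 1


# The two quads of the first SPARQLBGP in the plan above.
bgp = [
    ("?a", "<b>", "<c>", "<http://test.invalid/graph>"),
    ("?s", "<q>", "?.blank-0", "<http://test.invalid/graph>"),
]

if is_cartesian(bgp):
    # Each component could be sent to the endpoint as its own BGP,
    # leaving the Cartesian join to be computed locally.
    for component in connected_components(bgp):
        print(component)
```

Under these assumptions, each component could become its own remote BGP, so the endpoint only evaluates the individual patterns and the expensive product is computed on our side.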

kjetilk commented 8 years ago

The TAP output gives:

Probably a bug! RHS child plans were Attean::Plan::HashJoin and AtteanX::Store::SPARQL::Plan::BGP at /home/kjetil/dev/p5-atteanx-query-cache/lib/AtteanX/IDPQueryPlanner/Cache.pm line 149.

        Right: - Hash Join { s }
-   SPARQLBGP
-     Quad { ?a, <b>, <c>, <http://test.invalid/graph> }
-     Quad { ?s, <q>, ?.blank-0, <http://test.invalid/graph> }
-   Hash Join { s }
-     SPARQLBGP
-       Quad { ?s, <q>, <a>, <http://test.invalid/graph> }
-     Table (?s, ?o)
-       {o=<http://example.org/baz>, s=<http://example.com/foo>}
-       {o=<http://example.org/foobar>, s=<http://example.com/foo>}
-       {o=<http://example.org/bar>, s=<http://example.org/foo>}

So we could possibly merge the child BGP into the parent, but I'll wait and see if @kasei finds a general solution for this in Attean.

kasei commented 8 years ago

Considering adding something like this AtteanX::API::JoinRotatingPlanner role to allow this.
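To illustrate the general idea of rotating joins so that a nested BGP ends up next to its sibling BGP and the two can be merged, here is a toy Python sketch. It is not the AtteanX::API::JoinRotatingPlanner code and does not use Attean's API: the `BGP`, `Table` and `Join` classes and the `rotate_and_merge` function are hypothetical stand-ins, join variables are ignored, and two BGPs are merged only when they share a variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BGP:
    quads: tuple  # tuple of quad patterns, each a tuple of strings


@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class Join:
    lhs: object
    rhs: object


def bgp_variables(bgp):
    """Collect all terms that look like variables in a BGP."""
    return {term for quad in bgp.quads for term in quad if term.startswith("?")}


def rotate_and_merge(plan):
    """Rewrite Join(BGP, Join(BGP, X)) into Join(merged BGP, X) where possible."""
    if not isinstance(plan, Join):
        return plan
    lhs, rhs = rotate_and_merge(plan.lhs), rotate_and_merge(plan.rhs)
    # Normalise so that a BGP child, if there is one, sits on the left.
    if isinstance(rhs, BGP) and isinstance(lhs, Join):
        lhs, rhs = rhs, lhs
    if isinstance(lhs, BGP) and isinstance(rhs, Join):
        # Look for a BGP on either side of the nested join.
        for inner, other in ((rhs.lhs, rhs.rhs), (rhs.rhs, rhs.lhs)):
            if isinstance(inner, BGP) and bgp_variables(lhs) & bgp_variables(inner):
                merged = BGP(lhs.quads + inner.quads)
                return Join(merged, other)
    return Join(lhs, rhs)


graph = "<http://test.invalid/graph>"
q1 = ("?a", "<b>", "<c>", graph)
q2 = ("?s", "<q>", "?.blank-0", graph)
q3 = ("?s", "<q>", "<a>", graph)

# Shape of the "Right" plan from the TAP output above.
right = Join(BGP((q1, q2)), Join(BGP((q3,)), Table("(?s, ?o)")))
print(rotate_and_merge(right))
```

In this sketch the rewritten plan has a single three-quad BGP joined with the table. Whether the merged BGP should then be sent to the endpoint as is remains a separate question, since it still contains the pattern on `?a` that shares no variable with the others.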

kjetilk commented 8 years ago

This is largely solved now.