Closed by kjetilk 8 years ago
The current code can generate plans that contain both the full SPARQL query and a broken-up query. Thus, it is now entirely up to the cost model in #4 to address this problem.
The plan shown here doesn't have a cartesian product...(?) Is the issue that it has two different Quads that can't be joined in a single BGP?
Yeah, but wouldn't
SPARQLBGP
- Quad { ?o, <b>, "2", <http://test.invalid/graph> }
- Quad { ?a, <c>, ?s, <http://test.invalid/graph> }
mean that the remote endpoint has to evaluate a cartesian?
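To make the question concrete, here is a minimal sketch (not the project's code) that checks whether the two quads above share any variable. The tuple representation and the `variables` helper are assumptions made for illustration only; terms starting with `?` are treated as variables.

```python
def variables(quad):
    """Return the set of variable terms (those starting with '?') in a quad."""
    return {term for term in quad if term.startswith("?")}

# The two quads from the plan above, written as plain tuples (an assumed encoding).
q1 = ("?o", "<b>", '"2"', "<http://test.invalid/graph>")
q2 = ("?a", "<c>", "?s", "<http://test.invalid/graph>")

# With no shared variable there is no join condition, so evaluating
# both quads in one BGP amounts to a cartesian product of their solutions.
shared = variables(q1) & variables(q2)
is_cartesian = not shared
```

Under these assumptions `shared` is empty, which is what the question is getting at: the endpoint would have nothing to join the two quads on.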
Given that the SPARQLES survey found that we are unlikely to get complete results for many single-quad BGPs, and that we're not committing to a very elaborate cost model, I think we can close this. The current code will pass such a BGP to a remote endpoint, even though that means it doesn't use a cached result. Also, a better solution to this problem would probably build on Maribel's SHEPHERD work, which hasn't been published in full yet, so this is reasonable future work.
In #1, we argued that cartesian joins should not be evaluated by the remote endpoint. In the test "3-triple BGP where cache breaks the join to cartesian", the situation is that there is a chain-shaped query, and the middle TP is cached:
In this case, the reason why the cartesian arises is the presence of a cached TP result. This could be a really bad thing for the remote endpoint, and it would possibly be better to evaluate the entire query remotely.
Presently, we assume that if a cached result is present for a TP, that part is always evaluated locally.
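The chain case can be sketched roughly as follows. This is only an illustration under stated assumptions: the predicates `<p>`, `<q>`, `<r>` and the variable names are hypothetical (they are not taken from the test), and `count_components` is a helper invented here, not part of the planner. It groups triple patterns that share variables; more than one group means the patterns can only be combined by a cartesian product.

```python
def variables(pattern):
    """Return the set of variable terms (those starting with '?') in a pattern."""
    return {term for term in pattern if term.startswith("?")}

def count_components(patterns):
    """Count groups of patterns connected through shared variables."""
    groups = []  # each entry is the variable set of one connected group
    for pattern in patterns:
        merged = variables(pattern)
        untouched = []
        for group in groups:
            if group & variables(pattern):
                merged = merged | group   # pattern links into this group
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return len(groups)

# Hypothetical chain-shaped query: each TP shares one variable with the next.
chain = [
    ("?a", "<p>", "?b"),
    ("?b", "<q>", "?c"),   # assume this middle TP is the cached one
    ("?c", "<r>", "?d"),
]
cached = {chain[1]}
remote = [tp for tp in chain if tp not in cached]

whole_query_components = count_components(chain)    # full chain sent remotely
remote_only_components = count_components(remote)   # what is left after using the cache
```

In this sketch the full chain forms one connected group, while the two TPs left over after taking out the cached middle one fall into two groups, which is the cartesian the issue describes.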