kjetilk / p5-atteanx-query-cache

Experimental prefetching SPARQL query cacher, take 2

Create a custom planner for BGPs #27

Open kjetilk opened 8 years ago

kjetilk commented 8 years ago

IRC chat log:

[03:25] <kasei> hmmm... in thinking more about this, I'm worried that the join rotation feature might be incompatible with IDP planning
[03:26] <kasei> incompatible with any cost-based planning that does early pruning, actually.
[03:27] <kasei> because you'll end up pruning plans that could have been rotated after the pruning to produce the lowest cost plan.
[03:28] <kasei> not always, but it's likely enough that it's a problem.
[03:29] <kasei> however, i'm not sure IDP really makes sense for what you're doing.
[03:30] <kasei> your code is mostly focused on taking a BGP and breaking it up into three things: SPARQLBGPs, LDFs, and Tables
[03:30] <kasei> so you don't need the complexity of determining the "best" order of triples within each SPARQLBGP, because the remote endpoint is going to do that for you.
[03:32] <kasei> so I think my recommendation would be to look at implementing a custom planner just for BGPs that:
[03:32] <kasei> * first pulls out all the triples that you can make tables for (assumed to be cheaper than anything else, right?)
[03:33] <kasei> * finds connected components in the remaining triples
[03:33] <kasei> * any component of size 1 is turned into a LDF plan
[03:34] <kasei> * remaining components are turned into SPARQLBGP plans
[03:34] <kasei> then let the existing IDP (or whatever) planner take those components and figure out what join algorithms to use to produce results for the entire BGP.
[03:38] <kasei> hope that helps.
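
The steps kasei lists above could be sketched roughly as follows. This is only an illustration in Python, not the Perl code in this repository: triples are plain tuples of strings, variables are strings starting with "?", and `partition_bgp`, `connected_components` and the `can_make_table` callback are hypothetical names made up for the example. Two triples are treated as connected when they share a variable.

```python
def variables(triple):
    """Return the set of variables (terms starting with '?') in a triple."""
    return {term for term in triple if term.startswith("?")}


def connected_components(triples):
    """Group triples that are linked, directly or transitively, by shared variables."""
    remaining = list(triples)
    components = []
    while remaining:
        component = [remaining.pop(0)]
        seen_vars = set(variables(component[0]))
        grew = True
        while grew:
            grew = False
            for triple in remaining[:]:
                if variables(triple) & seen_vars:
                    component.append(triple)
                    seen_vars |= variables(triple)
                    remaining.remove(triple)
                    grew = True
        components.append(component)
    return components


def partition_bgp(triples, can_make_table):
    """Split a BGP into (kind, triples) sub-plans: Table, LDF or SPARQLBGP.

    can_make_table is an assumed callback that says whether a triple
    can be answered from a local table.
    """
    # Step 1: pull out every triple a table can be made for.
    plans = [("Table", [t]) for t in triples if can_make_table(t)]
    rest = [t for t in triples if not can_make_table(t)]
    # Steps 2-4: components of size 1 become LDF plans, the others SPARQLBGP plans.
    for component in connected_components(rest):
        kind = "LDF" if len(component) == 1 else "SPARQLBGP"
        plans.append((kind, component))
    return plans


bgp = [
    ("?s", "rdf:type", "foaf:Person"),
    ("?s", "foaf:name", "?name"),
    ("?s", "foaf:knows", "?o"),
    ("?x", "dct:title", "?t"),
]
plans = partition_bgp(bgp, lambda t: t[1] == "rdf:type")
for kind, component in plans:
    print(kind, component)
```

The resulting sub-plans would then be handed to the existing cost-based planner, which only has to pick join algorithms between them rather than order individual triples.
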
[21:08] <KjetilK> kasei, thanks for the advice
[21:08] <KjetilK> I agree with the idea that I should implement a planner for BGPs, but not with the assumption that a Table is always the best; it isn't if it results in a cartesian product
[21:09] <KjetilK> but I suppose I could just return two plans in that case and let the cost decide
[21:18] <KjetilK> however, I think I'll implement the surrounding infrastructure now, and see how far it takes me
[21:46] <kasei> KjetilK: yeah, that's a good point.
[21:46] <kasei> I think you could still do that sensibly, though.
[21:47] <kasei> For every triple you can turn into a table, check if removing it from the SPARQLBGP would cause a cartesian. If not, go ahead and turn it into a table...
[21:49] <KjetilK> yup
[21:49] <KjetilK> and if yes, return both plans
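
One possible reading of the check agreed on at the end of the log, again as a Python sketch rather than the repository's Perl: removing a triple "causes a cartesian" when the remaining triples fall into more connected components (by shared variables) than before. The names `count_components`, `classify_table_candidates` and `can_make_table` are hypothetical, and the greedy left-to-right order is an assumption; the log does not say in which order candidates should be tried.

```python
def variables(triple):
    """Return the set of variables (terms starting with '?') in a triple."""
    return {term for term in triple if term.startswith("?")}


def count_components(triples):
    """Count groups of triples connected through shared variables."""
    groups = []  # one set of variables per component found so far
    for triple in triples:
        vs = variables(triple)
        merged = set(vs)
        untouched = []
        for group in groups:
            if group & vs:
                merged |= group  # this triple links into the group
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return len(groups)


def classify_table_candidates(triples, can_make_table):
    """Decide, per table-capable triple, between 'table' and 'both'.

    'table': removing the triple does not disconnect the rest, so it is
             simply turned into a table.
    'both':  removing it would split the rest (a cartesian product), so
             both alternatives are kept for the cost model to choose from.
    Returns the decisions and the triples left in the BGP.
    """
    remaining = list(triples)
    decisions = {}
    for triple in triples:
        if not can_make_table(triple):
            continue
        without = [t for t in remaining if t != triple]
        if count_components(without) > count_components(remaining):
            decisions[triple] = "both"
        else:
            decisions[triple] = "table"
            remaining = without
    return decisions, remaining


bgp = [
    ("?a", "ex:p", "?b"),
    ("?b", "ex:q", "?c"),
    ("?c", "ex:r", "?d"),
]
decisions, remaining = classify_table_candidates(
    bgp, lambda t: t[1] in {"ex:q", "ex:r"}
)
print(decisions)
print(remaining)
```

In this chain-shaped example the middle triple is the only link between its neighbours, so it is marked "both", while the triple at the end of the chain can be moved to a table without disconnecting anything.
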