disabling some optimizations (for performance reasons)

w3c / sparql-dev

SPARQL dev Community Group

https://w3c.github.io/sparql-dev/

disabling some optimizations (for performance reasons) #105

Open eroux opened 4 years ago

eroux commented 4 years ago

Some SPARQL optimizations sometimes hurt performance. It would be nice to be able to disable them when a pattern has been identified as wrongly optimized (by the standard optimizer).

Why?

Here's an example:

PREFIX ex: <http://example.com/>

construct {
    ?res ?resp ?reso .
}
where {
    ex:G844 ?rel ?res .
    ?res a ex:Place .
    ?res ?resp ?reso .
    FILTER (?resp IN(ex:altLabel, ex:prefLabel, ex:placeEvent, ex:placeLat, ex:placeLong, ex:placeType, ex:placeLocatedIn, ex:sameAs, ex:entityScore))
}

Here, optimizing the FILTER into a union gives a query that is slower than applying the filter to the result set. This is probably the case for many FILTERs with long IN lists.
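For illustration, here is one plausible shape of the union form of the query above. The actual plan produced by the optimizer is not shown in this issue, so the use of BIND to carry the predicate into `?resp` is an assumption made to keep the CONSTRUCT output the same as in the original query.

```sparql
PREFIX ex: <http://example.com/>

CONSTRUCT {
    ?res ?resp ?reso .
}
WHERE {
    ex:G844 ?rel ?res .
    ?res a ex:Place .
    # Assumed rewrite: one UNION branch per value of the IN list,
    # each branch a separate lookup with a fixed predicate.
    { ?res ex:altLabel ?reso . BIND(ex:altLabel AS ?resp) }
    UNION { ?res ex:prefLabel ?reso . BIND(ex:prefLabel AS ?resp) }
    UNION { ?res ex:placeEvent ?reso . BIND(ex:placeEvent AS ?resp) }
    UNION { ?res ex:placeLat ?reso . BIND(ex:placeLat AS ?resp) }
    UNION { ?res ex:placeLong ?reso . BIND(ex:placeLong AS ?resp) }
    UNION { ?res ex:placeType ?reso . BIND(ex:placeType AS ?resp) }
    UNION { ?res ex:placeLocatedIn ?reso . BIND(ex:placeLocatedIn AS ?resp) }
    UNION { ?res ex:sameAs ?reso . BIND(ex:sameAs AS ?resp) }
    UNION { ?res ex:entityScore ?reso . BIND(ex:entityScore AS ?resp) }
}
```

With nine values this means nine lookups per `?res`, whereas the FILTER form scans the properties of each `?res` once and discards the ones not in the list. Which is faster depends on the engine and the data.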

Previous work

https://wiki.blazegraph.com/wiki/index.php/QueryHints

Proposed solution

I think the Blazegraph solution is fine; maybe a per-optimization switch would be better.
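As a rough sketch of what the Blazegraph approach looks like: hints are written as triple patterns in a reserved `hint:` namespace. The example below uses the `hint:optimizer` hint from the linked page, which, as far as I recall, controls the join-order optimizer. Whether it also suppresses the FILTER-to-union rewrite is not established here, so treat this as an illustration of the syntax only.

```sparql
PREFIX ex: <http://example.com/>
PREFIX hint: <http://www.bigdata.com/queryHints#>

CONSTRUCT {
    ?res ?resp ?reso .
}
WHERE {
    # Blazegraph-specific query hint, written as a triple pattern.
    # An engine that does not know the hint: namespace would treat it as
    # an ordinary pattern to match against the data.
    hint:Query hint:optimizer "None" .
    ex:G844 ?rel ?res .
    ?res a ex:Place .
    ?res ?resp ?reso .
    FILTER (?resp IN(ex:altLabel, ex:prefLabel, ex:placeEvent, ex:placeLat, ex:placeLong, ex:placeType, ex:placeLocatedIn, ex:sameAs, ex:entityScore))
}
```

A per-optimization switch would presumably need one such hint per rewrite, which is where portability across engines becomes a question.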

Considerations for backward compatibility

none?

kasei commented 4 years ago

If pursued, I'd prefer a solution that didn't overload existing syntax for triple patterns and basic graph patterns.

@eroux this seems very implementation specific. Not all systems will have the same idea about what an "optimized" and "non-optimized" query are. Are you just looking for a way to generically tell the system to avoid doing any query rewriting or plan optimization? In some systems that might be possible, while in others, that might be equivalent to saying "pick a query plan at random".

VladimirAlexiev commented 2 years ago

See #71, #21

TallTed commented 2 years ago

I concur with @kasei. @VladimirAlexiev has pointed to relevant related issues.

Virtuoso, for instance, has a built-in query optimizer which generally tests multiple query execution plans before settling on one.

Virtuoso also has some pragmas that allow sophisticated query authors to provide execution hints (akin to #71) which may force, but usually guide, choice of execution plan, and/or to partially specify execution sequence (akin to #21) different than SPARQL's specified inside-out pattern.

ericprud commented 2 years ago

IMO, the best optimization would be in standardizing a left-deep semantics for SPARQL. Without optimization, bottom-up semantics make SPARQL work extra hard to produce unintuitive results. Blaze's "Understanding SPARQL's Bottom-up Semantics" illustrates a few of these screw cases. Bottom-up semantics are supposed to make it easier to perform some local optimizations, but I doubt that such scoped cleverness approaches the performance of a simple left-deep execution. I think the most impactful optimizations analyze the execution plan to detect when it's safe to push variables down (i.e. a simulation of left-deep).
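A minimal sketch of the kind of case meant here, assuming hypothetical data in which people have an `ex:age` property (neither the property nor the data come from this thread):

```sparql
PREFIX ex: <http://example.com/>

SELECT ?person ?age
WHERE {
    ?person ex:age ?age .
    {
        # Bottom-up: this inner group is evaluated on its own, so ?age is
        # unbound inside it. The FILTER expression raises an error, which
        # counts as false, and the group yields no solutions.
        ?person a ex:Person .
        FILTER (?age > 18)
    }
}
```

Under the standard bottom-up semantics the query returns nothing, because the empty inner group is then joined with the outer pattern. Read left to right, one would expect `?age` to be bound by the first pattern and the filter to apply per row, which is what a left-deep evaluation would do.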