eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License

FedX: Missing limit pushing of ASK queries with single statement pattern causes poor query performance #5033

Closed aschwarte10 closed 3 months ago

aschwarte10 commented 3 months ago

Current Behavior

Currently, for ASK queries with a single statement pattern, the LIMIT 1 is not pushed into the sub-query, although this is already implemented for SELECT queries.

The effect is poor query performance.

Example:

ASK { ?person a foaf:Person }

Currently the federation fetches all persons available in the federation members and checks locally that there is at least one binding, although we are only interested in whether a match exists.

Note: for SELECT queries FedX already has an optimizer that pushes the limit into the sub-query for such trivial cases.
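To illustrate why the missing limit matters, here is a minimal, self-contained sketch. It does not use any FedX or RDF4J classes: `Member`, `askFetchAll` and `askWithUpperLimit` are hypothetical names invented for this illustration, and the members are simulated in memory and queried sequentially. It only contrasts the number of bindings pulled from the members with and without an upper limit of 1.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a federation; none of these names are FedX classes.
class AskLimitSketch {

    /** A simulated federation member that counts how many bindings it hands out. */
    static final class Member {
        final long size;
        final AtomicLong fetched = new AtomicLong();

        Member(long size) {
            this.size = size;
        }

        /** Returns at most upperLimit bindings, one per call to next(). */
        Iterator<String> evaluate(long upperLimit) {
            final long n = Math.min(size, upperLimit);
            return new Iterator<String>() {
                private long pos = 0;

                @Override
                public boolean hasNext() {
                    return pos < n;
                }

                @Override
                public String next() {
                    if (pos >= n) {
                        throw new NoSuchElementException();
                    }
                    fetched.incrementAndGet();
                    return "person" + pos++;
                }
            };
        }
    }

    /** No limit pushed: collect every binding, then check for at least one. */
    static boolean askFetchAll(List<Member> members) {
        List<String> all = new ArrayList<>();
        for (Member m : members) {
            Iterator<String> it = m.evaluate(Long.MAX_VALUE);
            while (it.hasNext()) {
                all.add(it.next());
            }
        }
        return !all.isEmpty();
    }

    /** Limit pushed: ask each member for at most one binding, stop at the first hit. */
    static boolean askWithUpperLimit(List<Member> members) {
        for (Member m : members) {
            Iterator<String> it = m.evaluate(1);
            if (it.hasNext()) {
                it.next();
                return true;
            }
        }
        return false;
    }

    /** Total number of bindings handed out by all members so far. */
    static long totalFetched(List<Member> members) {
        long total = 0;
        for (Member m : members) {
            total += m.fetched.get();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Member> a = List.of(new Member(1_000_000), new Member(1_000_000));
        System.out.println(askFetchAll(a) + " after fetching " + totalFetched(a));

        List<Member> b = List.of(new Member(1_000_000), new Member(1_000_000));
        System.out.println(askWithUpperLimit(b) + " after fetching " + totalFetched(b));
    }
}
```

Both strategies return the same answer; only the amount of data pulled from the members differs. In a real federation the members are remote endpoints, so the unbounded variant also pays for transferring every binding over the network.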

Expected Behavior

The limit is pushed, and the same optimizations as for SELECT queries with a single statement pattern are applied.

Steps To Reproduce

To see the poor performance, run an ASK query on a large database with millions of instances.

The difference in the query plan is the "Upper Limit: N" line attached to the statement pattern:

QueryRoot
   Slice (limit=1)
      StatementSourcePattern
         Var (name=person)
         Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
         Var (name=_const_e1df31e0_uri, value=http://xmlns.com/foaf/0.1/Person, anonymous)
         StatementSource (id=sparql_localhost:18080_repositories_endpoint1, type=REMOTE)
         StatementSource (id=sparql_localhost:18080_repositories_endpoint2, type=REMOTE)
         Upper Limit: 1

Version

4.3.12

Are you interested in contributing a solution yourself?

Yes
