w3c / sparql-dev

SPARQL dev Community Group
https://w3c.github.io/sparql-dev/

Specify the RDF dataset for a query based on a query #178

Open jaw111 opened 1 year ago

jaw111 commented 1 year ago

Why?

In a dataset, named graphs are often used to represent different aspects of resources (see the Graph Per Aspect design pattern). To query a certain 'slice' of the dataset (specifically for patterns and property paths that cross graph boundaries), the FROM and FROM NAMED clauses can be used to enumerate the IRIs of the graphs over which the query will operate. This becomes impractical for datasets containing hundreds of thousands of named graphs.

Currently this involves running a first query to identify the graph IRIs of interest, and then using those results to either:

Previous work

Unknown

Proposed solution

Extend the grammar for FROM and FROM NAMED to allow SubSelect:

[16]    SourceSelector    ::=   [iri](https://www.w3.org/TR/sparql11-query/#riri) | '{' [SubSelect](https://www.w3.org/TR/sparql11-query/#rSubSelect) '}'

The subselect should project a single variable, whose values are used as the graph IRIs.

For example:

SELECT *
FROM {
  SELECT DISTINCT ?g {
    GRAPH ?g {
      ?s dc:created ?created .
      FILTER (?created > "2022-12-23"^^xsd:date)
    }
  }
}
WHERE {
  ?s a ?type .
}
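
For comparison, the two-step workaround described under "Why?" could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper name `with_from_clauses`, the `FORBIDDEN` character set and the sample graph IRIs are all hypothetical, and the first query is assumed to have already been run and to have returned a plain list of graph IRIs.

```python
# Characters that cannot appear inside a SPARQL <...> IRI reference.
FORBIDDEN = set('<>"{}|^`\\')


def with_from_clauses(graph_iris, where_block, named=False):
    """Build a SELECT query whose dataset is the given list of graph IRIs.

    Hypothetical helper: graph_iris is assumed to be the result of a first
    query such as SELECT DISTINCT ?g { GRAPH ?g { ... } }.
    """
    keyword = "FROM NAMED" if named else "FROM"
    clauses = []
    # One clause per distinct graph IRI, sorted so the output is stable.
    for iri in sorted(set(graph_iris)):
        # Reject values that would break out of the <...> delimiters.
        if any(ch in FORBIDDEN or ch <= " " for ch in iri):
            raise ValueError(f"not usable as an IRI reference: {iri!r}")
        clauses.append(f"{keyword} <{iri}>")
    return "\n".join(["SELECT *", *clauses, "WHERE {", where_block, "}"])


# Made-up IRIs standing in for the results of the first query.
graphs = [
    "http://example.org/g2",
    "http://example.org/g1",
    "http://example.org/g2",
]
query = with_from_clauses(graphs, "  ?s a ?type .")
print(query)
```

The generated query text grows by one clause per graph, so with hundreds of thousands of named graphs it becomes very large, which is the impracticality the proposed syntax is meant to avoid.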

Considerations for backward compatibility

This is an extension of the specification, so no backward-compatibility issues are foreseen.