w3c / sparql-dev

SPARQL dev Community Group
https://w3c.github.io/sparql-dev/

Property paths over graph boundaries #74

Open namedgraph opened 5 years ago

namedgraph commented 5 years ago

Why?

In SPARQL 1.1 property paths work only in a single graph: https://www.w3.org/TR/sparql11-query/#sparqlPropertyPaths

A property path does not span multiple graphs in a dataset.

Previous work

Usually addressed with a union graph that merges all named graphs.

Proposed solution

I don't have any.

Considerations for backward compatibility

N/A

cygri commented 5 years ago

If the graphs involved can be enumerated before evaluation of the path, then one solution would be ad-hoc union graphs:

```sparql
SELECT * {
    ...
    GRAPH <g1> <g2> <g3> {
        ?x :p+ ?y
    }
}
```

The GRAPH clause with multiple arguments would evaluate the pattern against the union of the graphs named by the arguments.
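
The intended semantics can be sketched outside SPARQL. The Python below is only an illustration under stated assumptions: the dataset is modelled as a dict from graph name to a set of `(s, p, o)` tuples, and `one_or_more` and `eval_graph_union` are hypothetical helper names, not part of any SPARQL engine or of the proposal itself.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Dataset = Dict[str, Set[Triple]]  # assumed model: graph name -> triples


def one_or_more(graph: Set[Triple], pred: str) -> Set[Tuple[str, str]]:
    """Evaluate ?x pred+ ?y against a single set of triples."""
    # Index the outgoing edges for the predicate of interest.
    edges: Dict[str, Set[str]] = {}
    for s, p, o in graph:
        if p == pred:
            edges.setdefault(s, set()).add(o)
    result: Set[Tuple[str, str]] = set()
    # Depth-first walk from every subject; `seen` guards against cycles.
    for start in edges:
        stack = list(edges[start])
        seen: Set[str] = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            result.add((start, node))
            stack.extend(edges.get(node, ()))
    return result


def eval_graph_union(dataset: Dataset, names: Iterable[str], pred: str):
    """Hypothetical GRAPH <g1> <g2> ... { ?x pred+ ?y }: union first, then match."""
    merged: Set[Triple] = set()
    for name in names:
        merged |= dataset.get(name, set())
    return one_or_more(merged, pred)


# Illustrative data: one :p edge per named graph, forming a chain a -> b -> c -> d.
dataset = {
    "g1": {("a", "p", "b")},
    "g2": {("b", "p", "c")},
    "g3": {("c", "p", "d")},
}
pairs = eval_graph_union(dataset, ["g1", "g2", "g3"], "p")
print(sorted(pairs))
```

The key design point is the order of operations: the named graphs are merged before the path is matched, so the path itself still runs against exactly one graph.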

namedgraph commented 5 years ago

@cygri off topic - how did you highlight the SPARQL syntax? :)

cygri commented 5 years ago

@namedgraph

```sparql
SELECT ...
```

jaw111 commented 2 years ago

How about allowing the FROM clause to be used in a subselect?

```sparql
SELECT * {
    ...
    {
        SELECT *
        FROM <g1>
        FROM <g2>
        FROM <g3>
        WHERE {
            ?x :p+ ?y
        }
    }
}
```

That would also be handy when making a federated request (as would @cygri's suggestion).

daniel-hugo commented 1 year ago

@namedgraph, this looks to me like a contradiction in the SPARQL 1.1 specification... yes, 18.1.7 says "A property path does not span multiple graphs in a dataset", but 18.4 also says "All evaluation is carried out by matching the active graph at that point in the overall query evaluation." Where the active graph is the merge of multiple graphs, this would actually require property paths to span multiple graphs. Experimenting with Blazegraph, it looks like they have obeyed 18.4 and disregarded 18.1.7...

```sparql
# assume quads mode is enabled
INSERT DATA {
  GRAPH <g1:> { <a:> <p:> <b:> }
  GRAPH <g2:> { <b:> <p:> <c:> }
}
SELECT DISTINCT * WHERE { ?x <p:>+ ?y }
```

...produces...

 ?x   ?y
<a:> <b:>
<b:> <c:>
<a:> <c:>

I would not expect to see that third result if property paths do not span multiple graphs. You also get one result, <a:> <c:>, for the (un-quantified) property path ?x <p:>/<p:> ?y. How many other implementations behave the same way? Because this behaviour is useful, SPARQL 1.2 should delete the "property path does not span multiple graphs" statement (all the more so if most implementations already behave like Blazegraph).
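
To make the difference concrete, here is a small Python sketch that contrasts matching the path inside each named graph with matching it against their merge. It is a model under assumptions, not Blazegraph's implementation: triples are plain tuples, and `plus` is a hypothetical helper that computes the pairs connected by one or more edges.

```python
def plus(triples, pred):
    """Pairs (x, y) connected by one or more `pred` edges within `triples`."""
    pairs = {(s, o) for s, p, o in triples if p == pred}
    while True:
        # Join the current pairs with themselves to extend paths by composition.
        step = {(x, z) for x, y in pairs for y2, z in pairs if y == y2}
        if step <= pairs:
            return pairs  # fixed point reached: nothing new was found
        pairs |= step


# The two quads from the INSERT DATA request, keyed by graph name.
dataset = {
    "g1:": {("a:", "p:", "b:")},
    "g2:": {("b:", "p:", "c:")},
}

# Path matched separately inside each named graph, results combined afterwards.
per_graph = set().union(*(plus(g, "p:") for g in dataset.values()))

# Path matched once against the merge of all named graphs.
merged = plus(set().union(*dataset.values()), "p:")

print("per graph:", sorted(per_graph))
print("merged:   ", sorted(merged))
print("only in merged:", sorted(merged - per_graph))
```

In this model the pair `("a:", "c:")` appears only when the graphs are merged before matching, which corresponds to the third row in the result table.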

However, I'm not saying there is no issue to address here! Changing grammar rule 58 (GraphGraphPattern) from 'GRAPH' VarOrIri GroupGraphPattern to 'GRAPH' VarOrIri+ GroupGraphPattern to permit queries to specify multiple graph IRIs for GRAPH blocks would be very useful for many reasons, not only for specifying the active graph evaluated by a property path.

afs commented 1 year ago

... WHERE { ?x <p:>+ ?y }

That's a query on the default graph.

A SPARQL evaluation of a path match acts on a graph in the dataset given to the evaluation. Where that dataset came from isn't part of SPARQL evaluation. (FROM ... FROM ... does not change that, because it is a "dataset description" used to establish the dataset for the query evaluation.)

So in evaluation, the spec deals in "graphs", each a set of triples. Whether the default graph was built from other graphs is not detectable during query evaluation. It could be a dynamic merge or a copy merge, have inference applied, be the intersection of two other graphs, have security restrictions limiting the view, or ...
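
A rough Python sketch of that point, under assumptions of my own: `DynamicMerge` is a hypothetical live view over several graphs, and `match_plus` is a hypothetical evaluator that only iterates over triples. Neither name comes from the spec or from any engine.

```python
from itertools import chain
from typing import Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]


class DynamicMerge:
    """Assumed 'dynamic merge': a live view over several graphs, no copying."""

    def __init__(self, *graphs: Set[Triple]) -> None:
        self._graphs = graphs

    def __iter__(self) -> Iterator[Triple]:
        return chain.from_iterable(self._graphs)


def match_plus(graph: Iterable[Triple], pred: str) -> Set[Tuple[str, str]]:
    """Evaluate ?x pred+ ?y; the evaluator only ever sees triples."""
    pairs = {(s, o) for s, p, o in graph if p == pred}
    while True:
        new = {(x, z) for x, y in pairs for y2, z in pairs if y == y2} - pairs
        if not new:
            return pairs
        pairs |= new


g1 = {("a", "p", "b")}
g2 = {("b", "p", "c")}

copy_merged = g1 | g2                # 'copy merge': triples materialised once
dynamic_view = DynamicMerge(g1, g2)  # 'dynamic merge': triples read on demand

same = match_plus(copy_merged, "p") == match_plus(dynamic_view, "p")
print(same)
```

Because `match_plus` receives nothing but triples, it has no way to tell which construction produced the graph it was handed.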

daniel-hugo commented 1 year ago

Yes, I realised afterwards the paradox in my sentence "Where the active graph is the merge of multiple graphs...": the merged graph is itself a single graph, so there isn't a contradiction in the spec.

I also realised that what I said about GraphGraphPattern would only work if every variable used to name a graph were already bound at that point in query evaluation. That is, it couldn't bind newly mentioned variables, even though binding them is part of how the GRAPH keyword is currently intended to work.