Closed hartig closed 3 years ago
I am done with extending the spec to cover the annotation syntax for SPARQL*.
Preview: https://pr-preview.s3.amazonaws.com/w3c/rdf-star/pull/65.html
The parts that I have extended are: the ObjectList production and a new production for annotation patterns.
Please take a look.
/cc @pchampin @afs @gkellogg
Note that, in the existing grammar, EmbTP is really not strict enough:
[174] EmbTP ::= '<<' EmbSubjectOrObject Verb EmbSubjectOrObject '>>'
[175] EmbSubjectOrObject ::= Var | BlankNode | iri | RDFLiteral | NumericLiteral |
BooleanLiteral | EmbTP
An EmbSubjectOrObject includes literals, which can't exist as the subject of any triple. I think we previously had VarOrBlankNodeOrIriOrEmbTP and VarOrTermOrEmbTP for subject and object, which have the appropriate restrictions.
[107s] VarOrBlankNodeOrIriOrEmbTP ::= Var | BlankNode | iri | EmbTP
[176] VarOrTermOrEmbTP ::= Var | GraphTerm | EmbTP
SPARQL allows literals as subjects. They just never match.
They arise naturally - most clearly, with reverse paths.
A triple pattern is "(RDF-T ∪ V) x (I ∪ V) x (RDF-T ∪ V)"
https://www.w3.org/TR/sparql11-query/#sparqlTriplePatterns
VarOrTerm seems the place to add them because GraphTerm is without variables.
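To illustrate how a literal can end up in subject position, here is a minimal Python sketch. The tuple encoding and the function name `eliminate_inverse` are assumptions made for illustration only; the sketch just applies the SPARQL 1.1 rule that an inverse path pattern `X ^P Y` is evaluated as `Y P X`.

```python
def eliminate_inverse(subject, path, obj):
    """Rewrite `subject ^p obj` to `obj p subject`; leave other patterns alone.

    Hypothetical helper: terms and paths are plain strings here.
    """
    if path.startswith("^"):
        return (obj, path[1:], subject)
    return (subject, path, obj)


# A perfectly ordinary-looking pattern with a literal in object position ...
written = ("?person", "^foaf:name", '"Bob"')
# ... becomes a triple pattern whose subject is a literal, which can never match.
rewritten = eliminate_inverse(*written)
print(rewritten)
```

The grammar therefore has to accept literals in subject position even though such a pattern yields no solutions.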
Thanks, @afs. If I was ever aware of that, I've since forgotten.
I think we need a change to ObjectListPath as well:
[86] ObjectListPath ::= ObjectPath AnnotationPattern? ( ',' ObjectPath AnnotationPattern? )*
When I try the following example, that branch is hit in my parser, at least:
PREFIX : <http://bigdata.com/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/>
SELECT ?age ?c WHERE {
?bob foaf:name "Bob" {| ex:certainty ?c |}.
}
@gkellogg what do you mean by "hit in my parser"? What's wrong with the query? (I don't think anything.)
What I meant was that, when I parsed that example, the parser took the path including ObjectListPath instead of ObjectList. There are parallel paths through the grammar with and without Path. In this case, it seems to be because the path through the parser is the following:
Query
SelectQuery
WhereClause
GroupGraphPattern
GroupGraphPatternSub
TriplesBlock
TriplesSameSubjectPath
PropertyListPathNotEmpty
ObjectListPath
I believe the ObjectList production is used in CONSTRUCT, and ObjectListPath in WHERE.
Thanks, Gregg! In fact, the PropertyListPathNotEmpty production in the original SPARQL 1.1 grammar uses both ObjectList and ObjectListPath.
[83] PropertyListPathNotEmpty ::= ( VerbPath | VerbSimple ) ObjectListPath ( ';' ( ( VerbPath | VerbSimple ) ObjectList )? )*
Hence, in addition to extending the ObjectList production (as done in my PR so far), we also need to extend the production ObjectListPath as follows:
[86] ObjectListPath ::= ObjectPath AnnotationPattern? ( ',' ObjectPath AnnotationPattern? )*
In this context, I have also discovered another error in the current SPARQL* grammar: the production ObjectPath has to be extended as well!
[87] ObjectPath ::= GraphNodePath | EmbTP
I will add these two extensions to the grammar to this PR.
In this context, I have also discovered another error in the current SPARQL* grammar: the production ObjectPath has to be extended as well!
[87] ObjectPath ::= GraphNodePath | EmbTP
Actually, this isn't required and causes a First/First conflict in my parser generator: GraphNodePath is defined as the following:
[105] GraphNodePath ::= VarOrTermOrEmbTP | TriplesNodePath |
(Note the extra | at the end, which is an error.) So, VarOrTermOrEmbTP already covers the EmbTP case.
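The conflict can be reproduced with a small FIRST-set computation. The following Python sketch is only an illustration: it encodes a simplified fragment of the productions quoted above, and the token names (`VAR`, `IRI`, `LITERAL`, `BLANK_NODE`), the reduced bodies of `GraphTerm` and `TriplesNodePath`, and the helper names are assumptions rather than the real grammar or any parser generator's API.

```python
# Simplified grammar fragment. Keys are nonterminals; every other symbol
# is treated as a terminal. Only the first symbol of each alternative is
# inspected, which is sufficient because nothing here is nullable.
GRAMMAR = {
    "ObjectPath": [["GraphNodePath"], ["EmbTP"]],  # the proposed (faulty) extension
    "GraphNodePath": [["VarOrTermOrEmbTP"], ["TriplesNodePath"]],
    "VarOrTermOrEmbTP": [["VAR"], ["GraphTerm"], ["EmbTP"]],
    "GraphTerm": [["IRI"], ["LITERAL"], ["BLANK_NODE"]],  # reduced for the sketch
    "TriplesNodePath": [["'('"], ["'['"]],                # reduced for the sketch
    "EmbTP": [["'<<'", "EmbSubjectOrObject", "Verb", "EmbSubjectOrObject", "'>>'"]],
}


def first(symbol):
    """Return the FIRST set of a symbol in the fragment above."""
    if symbol not in GRAMMAR:
        return {symbol}  # terminal
    result = set()
    for alternative in GRAMMAR[symbol]:
        result |= first(alternative[0])
    return result


def first_first_conflicts(nonterminal):
    """List pairs of alternatives whose FIRST sets overlap."""
    alternatives = GRAMMAR[nonterminal]
    conflicts = []
    for i in range(len(alternatives)):
        for j in range(i + 1, len(alternatives)):
            overlap = first(alternatives[i][0]) & first(alternatives[j][0])
            if overlap:
                conflicts.append((alternatives[i], alternatives[j], overlap))
    return conflicts


print(first_first_conflicts("ObjectPath"))
print(first_first_conflicts("GraphNodePath"))
```

Both alternatives of the extended ObjectPath can start with `<<`, so an LL(1) parser cannot choose between them, whereas GraphNodePath on its own has no overlap.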
Sorry, Gregg. My bad.
I have fixed the issue in the grammar now (see commit https://github.com/w3c/rdf-star/pull/65/commits/ecac9c4c6869ac0a7ab3c8ede65d64de2a29351d).
Annotations and paths:
ObjectList (used in the template for CONSTRUCT and in SPARQL Update) is fine.
(Aside: unlike Turtle, it is possible to add to Object and ObjectPath because Collection uses GraphNode, not Object, but for the moment, let's stick to ObjectList*.)
For ObjectListPath, some forms cannot be handled as a syntactic rewrite to << >> and would need a change to evaluation: the {| |} can only be applied once the triple matched by the path is known.
:s :p* :o {| :pp :oo |}
:o ^:p :s {| :pp :oo |}
:o !:p :s {| :pp :oo |}
:s :p/:q :o {| :pp :oo |}
:s (:p|:q) :o {| :pp :oo |}
The grammar is quite dependent on Path being recursive and including a single term as a path element.
One option is a text note saying "If annotation, must be simple path" or, slightly more ambitiously, to include the trailing / case.
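As a rough illustration of the "must be simple path" option, the following Python sketch expands an annotation into the equivalent `<< >>` form and rejects anything that is not a single-link verb. Everything in it is an assumption made for illustration: the regular expression is a crude approximation of a variable, IRI, prefixed name or `a`, and `expand_annotation` is a hypothetical helper, not how any existing parser implements the restriction.

```python
import re

# Assumption: a "simple" verb is `a`, a variable, an IRI in angle brackets,
# or a prefixed name. Anything containing path operators is not simple.
SIMPLE_VERB = re.compile(r"a|[?$]\w+|<[^<>\s]*>|(?:[A-Za-z][\w-]*)?:[\w-]+")


def expand_annotation(s, p, o, annotations):
    """Rewrite `s p o {| ap ao |}` into plain triple patterns using << >>."""
    if not SIMPLE_VERB.fullmatch(p):
        raise ValueError(f"annotation requires a simple path, got: {p}")
    lines = [f"{s} {p} {o} ."]
    for ap, ao in annotations:
        lines.append(f"<<{s} {p} {o}>> {ap} {ao} .")
    return "\n".join(lines)


print(expand_annotation(":s", ":p", ":o", [(":pp", ":oo")]))

for path in [":p*", "^:p", "!:p", ":p/:q", "(:p|:q)"]:
    try:
        expand_annotation(":s", path, ":o", [(":pp", ":oo")])
    except ValueError as err:
        print(err)
```

A check of this kind runs after parsing, so the grammar itself can stay LL(1) and the restriction lives in a note.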
Andy, you are right. I did not consider property path patterns. That's a problem.
Now that you point out this problem, I would even say that it is a bad idea in general to mix property path patterns and the annotation syntax. The idea of property path patterns is to match paths (including their respective endpoints). RDF* is not about annotations of such paths but about annotations of single triples. In this sense, combining the annotation syntax with property path patterns does not seem to make much sense at all.
So, the question is whether there is an easy way to modify and extend the grammar such that the resulting grammar forbids combining property path patterns with the annotation syntax? If not, we may have to add an explicit note in the text.
A lookahead on paths to distinguish property and path cases may be possible. Investigation required. SPARQL is designed to be parser-simple - it's plain LL(1) (and LALR(1)) so that the widest range of compiler tools can be easily used.
I'm keen to make the changes localised to keep the barrier to adoption low.
There is another implication with
:s :p :o {| :pp :oo |}
The embedded triple term is not available in a variable. Probably have to live with that; some things will require << >> usage.
SPARQL is designed to be parser-simple [...] I'm keen to make the changes localised to keep the barrier to adoption low.
Yes, that's what I actually meant by "an easy way."
There is another implication with
:s :p :o {| :pp :oo |}
The embedded triple term is not available in a variable. Probably have to live with that; some things will require << >> usage.
Right. In fact, for this purpose, just using << ... >> instead of the annotation syntax is not sufficient either. You would have to use the SPARQL* version of BIND instead. For instance, by assuming the original PG-mode-based evaluation semantics of BIND (as defined in my original paper), the corresponding query would be:
SELECT ?t WHERE {
:s :p :o .
BIND( <<:s :p :o>> AS ?t )
?t :pp :oo .
}
...and by assuming the evaluation semantics of BIND as defined in our spec now, the query would be:
SELECT ?t WHERE {
BIND( <<:s :p :o>> AS ?t )
?t :pp :oo .
}
This was discussed during today's call: https://w3c.github.io/rdf-star/Minutes/2020-12-18.html#item02
I have tried to find a simple solution to extend the grammar in a way such that it permits the annotation syntax only in triple patterns but not in property path patterns, where "simple" means something that does not require either changing major parts of the existing grammar or parsers that can look ahead more steps than what is needed with the existing grammar. After looking again at the existing grammar in detail, I don't think that such a solution exists :-(
Therefore, my proposal is to keep the grammar extension as specified in this PR and add a note that specifies the restriction in text form (similar to the notes in Section 19.8 of the SPARQL 1.1 spec).
https://github.com/apache/jena/blob/main/jena-arq/Grammar/main.jj has the annotation extension added for object and objectpath and, yes, it uses a grammar note to limit the use for paths to simple links.
The alternatives look complicated: either additional lookahead on the path production (I haven't checked that this works, because I assume it impacts which parser generators can be used) or splitting path into compound and simple cases, which becomes a widespread change in the grammar.
(I may even be able to produce a complete grammar if the toolchain for producing HTML still works after all this time).
@hartig if you rebase the PR branch on main (might not be pretty), the preview stuff should work again. You'll need to rebase in any case to resolve the conflicts.
Gregg, is this rebasing something that can be done automatically or do I have to do it manually?
I’m afraid rebasing is manual. However, you should be able to just merge main into your branch, which may be less clean, but will get the job done.
Rebasing is one of the most difficult and unintuitive parts of Git, IMO.
Thanks. I have never done such a rebasing before. Hence, my question.
Perhaps, in this case, it will be easier and less time consuming if I simply create a new branch from main, copy the changes over, and generate a new PR (as I had done with the SPARQL-star Update PR).
I have copied these changes into a new PR that I have created from the current main branch. See #106
I am closing this PR here.
This PR is meant to address the SPARQL* part of #9