Open rat10 opened 3 years ago
Maybe "uncharted territory" is too pessimistic. It is well defined that an RDF graph is a set of RDF triples. We could provide two properties:
occurrenceOf
inGraph
Even in the absence of a proper term in the RDF vocabulary that denotes a graph we could informally advise that the range of inGraph is an IRI pointing to a set of triples e.g. in a document or a named graph. We are just addressing a set of triples via a containment relation, so no need to get into discussions about what that set means or entails or about what a dataset that eventually contains it means or entails etc etc. What could possibly go wrong?!
I am in favor of adding such a vocabulary.
However, @rat10, you end your description of this issue with the question: "What could possibly go wrong?!" Now, this makes me wonder what the purpose of adding this question is. Is this meant to be a rhetorical question? Do you foresee anything that might "go wrong" if we define such a vocabulary?
@hartig The sentence "What could possibly go wrong?!" is rhetorical and meant to express that "I'm not sure if I have thought this through sufficiently. Right now I don't see any problems but this area is notorious for non-obvious problems."
For example, as I come to think of it: how do we address a triple in the default graph of a dataset? Maybe:
_:x rdfx:occurrenceOf << :a :b :c >>;
rdfx:inGraph :SomeDatasetIRI .
Would that be correct?
As this is likely a common need: << :a :b :c >> rdfx:occursIn :SomeGraphIRI .
@afs Renaming the property rdfx:inGraph to rdfx:occursIn we get
_:x rdfx:occurrenceOf << :a :b :c >>;
rdfx:occursIn :SomeGraphIRI .
and
<< :a :b :c >> rdfx:occursIn :SomeGraphIRI .
So the rdfs:domain of rdfx:occursIn can be a triple term as well as an IRI or blank node. Wouldn't that cover both use cases?
[EDIT:] However this could lead to misunderstandings as
_:x rdfx:occurrenceOf << :a :b :c >> .
<< :a :b :c >> rdfx:occursIn :SomeGraphIRI .
might give the impression that _:x occurs in :SomeGraphIRI although it neither confirms nor refutes such an assumption.
Multiple rdfs:domain
are combined as "and", not "or".
A nuisance (and schema.org differs by design.)
We could leave it unstated otherwise what is the domain of rdfx:occurrenceOf
? Does it include the type of triple terms or is it "usage of"?
My suggestion is not to replace the pair - it is to have a way of directly stating a common case without the blank node being needed.
@afs
Multiple
rdfs:domain
are combined as "and", not "or". A nuisance (and schema.org differs by design.)
Hm, didn't know that... So we couldn't properly define the domain anyway as it is blank node or IRI.
We could leave it unstated otherwise what is the domain of
rdfx:occurrenceOf
? Does it include the type of triple terms or is it "usage of"?
I don't understand. In my understanding the domain of rdfx:occurrenceOf
can only be a triple term. What do you mean with "or is it 'usage of'"?
My suggestion is not to replace the pair - it is to have a way of directly stating a common case without the blank node being needed.
This argument I don't get. Your use case is still covered by my modification and still doesn't need a blank node. In fact I didn't touch your use case at all but changed only the other use case of defining an identifier for the occurrence. That doesn't necessarily need a blank node but some sort of identifier (obviously, as defining such identifier to be able to say things about such an occurrence is the whole purpose).
So we couldn't properly define the domain anyway as it is blank node or IRI.
It does not matter about blank nodes or IRIs.
:p rdfs:domain :A .
:p rdfs:domain :B .
then the subject of :p
must be both an A and a B not because of rdfs:domain
but because it's two asserted statements.
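The "and, not or" behaviour can be checked with a few lines of Python. This is only a minimal sketch of RDFS entailment rule rdfs2, assuming triples are modelled as plain string tuples; the function name `infer_domain_types` and the instance triple `:x :p :y` are made up for illustration and are not part of any proposal in this thread.

```python
# Triples are modelled as plain (subject, predicate, object) string tuples.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(graph):
    """Apply RDFS rule rdfs2 once.

    For every `p rdfs:domain C` and every triple `s p o` in the graph,
    infer `s rdf:type C`. Each domain statement fires independently,
    which is why several domains accumulate (intersection, not union).
    """
    domains = {}
    for s, p, o in graph:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in graph:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


graph = {
    (":p", RDFS_DOMAIN, ":A"),
    (":p", RDFS_DOMAIN, ":B"),
    (":x", ":p", ":y"),  # hypothetical instance data
}
types = infer_domain_types(graph)
```

Here `types` contains both `:x rdf:type :A` and `:x rdf:type :B`: neither domain statement knows about the other, so the subject ends up in both classes.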
Multiple
rdfs:domain
are combined as "and", not "or". A nuisance (and schema.org differs by design.)
This can be addressed with OWL, although it seems increasingly left to the pedantic to do so. Still, I think it's appropriate to have predicates that define their domain/range as being some kind of embedded triple. (RDF should have created a type to allow a resource used as a graph name to have a range of Graph, too, IMHO).
We could leave it unstated otherwise what is the domain of
rdfx:occurrenceOf
? Does it include the type of triple terms or is it "usage of"?
My suggestion is not to replace the pair - it is to have a way of directly stating a common case without the blank node being needed.
No opinion.
So we couldn't properly define the domain anyway as it is blank node or IRI.
It does not matter about blank nodes or IRIs.
:p rdfs:domain :A . :p rdfs:domain :B .
then the subject of
:p
must be both an A and a B not because of rdfs:domain
but because it's two asserted statements.
:p rdfs:domain [ a owl:Class ; owl:unionOf (:A :B) ] .
This could be used for some private class, but in this case, if you had :p rdfs:domain :A you could extend the type of a given value to be a union of :A and some other class. But this modeling could be missed by people creating graphs using the property. Using schema:domainIncludes avoids these problems, at the loss of some inference. Maybe the RDFS vocabulary needs such properties.
First of all, I am in favour of introducing such a vocabulary.
However, the example above is flawed. A graph does not contain occurrences. It is itself a mathematical abstraction, and contains (abstract) triples. Two graphs containing a triple in common contain the same triple; two graphs containing exactly the same triples are actually one and the same graph.
The RDF 1.1 Concepts spec has a dedicated section where it defines the notion of RDF source, which is, in my view, a better candidate for containing triple occurrences.
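The difference between the two notions can be sketched in Python. This is a rough illustration only, assuming ground triples (no blank nodes) modelled as string tuples; the `Source` class and the file names are hypothetical stand-ins for the spec's notion of an RDF source.

```python
from dataclasses import dataclass

triple = (":a", ":b", ":c")
other = (":d", ":e", ":f")

# A graph is a set of triples, so "two" graphs with the same members
# are one and the same value, whatever order they were written in.
g1 = frozenset({triple, other})
g2 = frozenset({other, triple})


@dataclass(eq=False)  # eq=False keeps identity-based equality
class Source:
    """Hypothetical stand-in for an RDF source: something with its own
    identity whose current state is a graph."""
    name: str
    state: frozenset


file1 = Source("file1.ttl", g1)
file2 = Source("file2.ttl", g2)

same_graph = g1 == g2          # True: equal sets of triples
same_source = file1 == file2   # False: two distinct sources
```

Under this reading, "the triple as it occurs in file1.ttl" and "the triple as it occurs in file2.ttl" can be told apart only by going through the source, not through the graph.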
Actually, maybe we could take this opportunity to mint IRIs for these concepts as well. Something like:
x:Graph a rdfs:Class.
x:Source a rdfs:Class.
x:Triple a rdfs:Class.
x:TripleOccurrence a rdfs:Class.
x:inGraph a rdf:Property;
rdfs:domain x:Triple;
rdfs:range x:Graph.
x:inSource a rdf:Property;
rdfs:domain x:TripleOccurrence;
rdfs:range x:Source.
x:hasState a rdf:Property;
rdfs:domain x:Source;
rdfs:range x:Graph.
x:hasOccurrence a rdf:Property;
rdfs:domain x:Triple;
rdfs:range x:TripleOccurrence.
@pchampin I had thought that these are exactly the kind of semantic ratholes that we don't want to go into. In this context I couldn't care less if a graph is understood as a mathematical abstraction or as a snippet of RDF in some Turtle file. If I can refer to it by an IRI or a blank node it is up for grabs and I can describe that it contains a given triple, as an occurrence.
I wonder if I can close the Pandora's box again that I opened with my careless talk of domain and range. Fact is I made a basic mistake anyway: there are no terms in the RDF vocabulary for blank nodes nor IRIs. So I think we should just leave the domain and range formally undefined and be done with it. The informal description is: every set of triples that is addressable by blank node or IRI is fair game.
Regarding the vocabulary that you propose: we wouldn't want to define so many terms related to the RDF core in the x namespace, and we also wouldn't want to define them in the RDF namespace as that would seem rather encroaching. I propose to leave this extension very low key: two properties, an informal description, and be done with it for this round.
As suggested by @hartig I'm moving the following discussion here: in #209 @hartig introduces a concrete proposal for how users can indicate that a property is a so-called transparency-enforcing property (i.e., quoted triples are meant to be referentially transparent when used in nested triples with such a property; see example in the new text). That proposal works on properties. However, as evidenced e.g. by the use cases, most of the time we need referentially transparent occurrences. Having to work on both occurrences and properties in order to define a reference to a referentially transparent occurrence seems like mixing two orthogonal approaches. It's not impossible but it seems twisted. It also introduces the possibility of undesired effects if one wants to use said property on both referentially opaque and transparent occurrences. IMO it would at least in some (probably most) use cases be better if referential transparency could be defined per occurrence. A property (please ignore for now the clumsy wording)
:referentiallyTransparentOccurrenceOf
could define an occurrence as being referentially transparent if, per Olaf's suggestion, the property was declared as transparency enabling:
:referentiallyTransparentOccurrenceOf rdf:type rdf-star:TransparencyEnablingProperty .
Extending the occurrence vocabulary in this sense and adding the type declaration to the axiomatic triples of RDF-star seems like a good idea to me.
While the example
_:a :occurrenceOf << :s :p :o >> ;
:in <file1.ttl> ;
dct:creator :alice.
refers to an occurrence of the quoted triple the following example would refer to the interpreted representation of the quoted triple:
_:b :referentiallyTransparentOccurrenceOf << :s :p :o >> ;
:in <file1.ttl> ;
dct:creator :alice.
Under OWL entailment and in presence of another statement
:s owl:sameAs :s2
we would be able to entail that
_:b :referentiallyTransparentOccurrenceOf << :s :p :o >> ;
:referentiallyTransparentOccurrenceOf << :s2 :p :o >> ;
:in <file1.ttl> ;
dct:creator :alice.
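The entailment described here can be imitated in a few lines of Python. This is a simplified sketch, not an OWL reasoner: quoted triples are modelled as nested tuples, owl:sameAs is given as a set of pairs, and the helper names `same_as_closure` and `entail` are invented for this illustration. Only properties listed as transparency-enabling get their quoted triples rewritten.

```python
from itertools import product


def same_as_closure(term, same_as):
    """Return every term linked to `term` by owl:sameAs pairs,
    treating sameAs as symmetric and transitive."""
    seen = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for a, b in same_as:
            if a == current:
                neighbour = b
            elif b == current:
                neighbour = a
            else:
                continue
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append(neighbour)
    return seen


def entail(graph, same_as, transparent_props):
    """Add co-denoting variants of quoted triples, but only where the
    quoted triple is the object of a transparency-enabling property."""
    result = set(graph)
    for s, p, o in graph:
        if p in transparent_props and isinstance(o, tuple):
            variants = product(*(same_as_closure(t, same_as) for t in o))
            result.update((s, p, v) for v in variants)
    return result


quoted = (":s", ":p", ":o")
graph = {
    ("_:a", ":occurrenceOf", quoted),
    ("_:b", ":referentiallyTransparentOccurrenceOf", quoted),
}
entailed = entail(
    graph,
    same_as={(":s", ":s2")},
    transparent_props={":referentiallyTransparentOccurrenceOf"},
)
```

With these inputs `_:b` additionally gets `<< :s2 :p :o >>` as an object, while `_:a`, which uses the opaque `:occurrenceOf`, is left untouched.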
@rat10 when carrying over your comment from PR #209 (i.e., https://github.com/w3c/rdf-star/pull/209#issuecomment-926141392) to here, you forgot to include the proposal to extend the vocabulary with the following statement, which I agree would be a natural thing to state.
:referentiallyTransparentOccurrenceOf rdfs:subPropertyOf :occurrenceOf .
However, one thing that is still missing in your proposal is a definition of the semantics of the :occurrenceOf
property. Do you have a proposal for that one or do you suggest we leave it undefined?
I didn't forget but found it premature. My first goal was to establish if and how individual occurrences can be declared as referentially transparent. We seem to agree on a mechanism to achieve that. Now the fine tuning begins.
The term occurrence is used in RDF to refer to a referentially transparent statement as described by the standard reification vocabulary. As this IMO is also a plausible semantics I'd like to leave it that way, introducing as little disruption as possible. But a reference to quoted occurrences seems desirable too. A possible solution would be to define :occurrenceOf
as TEP and introduce a further property :quoteOf
to refer to referentially opaque occurrences, both as per EXAMPLE 8 to be used together with a second statement using the :in
(or maybe :inSource
, but that's a further discussion) property to describe their location. Another possibility would be to leave the referential semantics of :occurrenceOf
unspecified and define two subproperties :quoteOf
and :interpretationOf
but that seems a bit much...
The term occurrence is used in RDF to refer to a referentially transparent statement as described by the standard reification vocabulary. As this IMO is also a plausible semantics I'd like to leave it that way, introducing as little disruption as possible. But a reference to quoted occurrences seems desirable too. A possible solution would be to define :occurrenceOf as TEP and introduce a further property :quoteOf to refer to referentially opaque occurrences, both as per EXAMPLE 8 to be used together with a second statement using the :in (or maybe :inSource, but that's a further discussion) property to describe their location.
Can you make a concrete proposal for these definitions?
FWIW, I would be in favor of leaving the semantics of :occurrenceOf unspecified (I can't think of any semantic constraint that should be imposed on it).
I find the naming :quoteOf a little odd, since _:x :quoteOf <<:s :p :o>> would mean "_:x is the quote of a quoted triple"...
I don't mind defining a transparency-enabling version of :occurrenceOf, which I would propose to call :statingOf [1]. I am not sure this should be a subproperty of :occurrenceOf, though, as statings and triple occurrences are different beasts IMO.
[1] https://lists.w3.org/Archives/Public/www-rdf-interest/1999Dec/0068.html
FWIW, I would be in favor of leaving the semantics of :occurrenceOf unspecified (I can't think of any semantic constraint that should be imposed on it).
I find the property relatively useless if its semantics had such a gaping hole and it would be a shame to waste the term on it.
I find the naming :quoteOf a little odd, since _:x :quoteOf <<:s :p :o>> would mean "_:x is the quote of a quoted triple"...
Yes, that's right. The naming is not yet perfect. <<:s :p :o>> is a quoted triple and the property name :quoteOf
doesn't capture the transformation from triple to occurrence.
I don't mind defining a transparency-enabling version of :occurrenceOf, which I would propose to call :statingOf [1]. I am not sure this should be a subproperty of :occurrenceOf, though, as statings and triple occurrences are different beasts IMO.
The term stating
does in my intuition have the connotation of asserting (as in asserted vs unasserted statements) and thus could easily be confused with that orthogonal aspect.
[1] https://lists.w3.org/Archives/Public/www-rdf-interest/1999Dec/0068.html
That is quite some hair splitting going on there ;-) IIUC this discusses yet another aspect - the speech act of asserting an assertion as an event in its own right - and I don't think we should go there.
Let's see what we have so far: we have triples (as types) and occurrences, and we have referential transparency and opacity. The embedded triple per the proposed semantics is a referentially opaque triple (as type). The occurrences we talk about are either referentially transparent or opaque. Instead of the technical terms referential opacity and referential transparency we can also use the more figurative terms quoted and interpreted. As a result we could define two semantically well specified subproperties of :occurrenceOf
:
:quotedOccurrenceOf
:interpretedOccurrenceOf
These are not yet nicely succinct names but at least they seem to capture with enough precision what we are talking about.
As I said above I don't see the need for a semantically underspecified :occurrenceOf
property. Instead we could define the semantics of :occurrenceOf
as interpreted, referentially transparent. Thus it would:
Maybe we should leave it at that as actually I'm not sure I see the need for quoted occurrences. OTOH I'm not sure about their uselessness either ;-) So I'm still trying to come up with a better term. What about
:citationOf
as a reference to the quoted, referentially opaque occurrence? Citing something captures both that it actually happened (otherwise it would not be citation but a newly created assertion) and that it is represented verbatim. Looks good to me...
About "stating"... I agree that it might seem to imply some form of assertion, which of course is not intended. Note however that the same could be argued about rdf:type rdf:Statement
in standard reification -- and that's, I guess, the reason for Dan using that verb in the first place.
I could live with "occurrence" (ref-transparent) and "citation" (ref-opaque).
I would prefer, however, to have the properties in the opposite direction, i.e. from the quoted triple to the occurrence ("hasOccurrence", "hasCitation"...), because that makes it easier to use with the annotation syntax:
:lizTaylor :marriedTo :richardBurton {| rdf-star:hasOccurrence
[ :in "1964"^^xsd:gYear ],
[ :in "1975"^^xsd:gYear ]
|}.
About "stating"... I agree that it might seem to imply some form of assertion, which of course is not intended. Note however that the same could be argued about
rdf:type rdf:Statement
in standard reification -- and that's, I guess, the reason for Dan using that verb in the first place.
I could leave with "occurrence" (ref-transparent) and "citation" (ref-opaque).
I assume you meant "live" (and then call it a day ;-) But: great!
I would prefer, however, to have the properties in the opposite direction, i.e. from the quoted triple to the occurrence ("hasOccurrence", "hasCitation"...), because that makes it easier to use with the annotation syntax:
:lizTaylor :marriedTo :richardBurton {| rdf-star:hasOccurrence [ :in "1964"^^xsd:gYear ], [ :in "1975"^^xsd:gYear ] |}.
Good point. But that doesn't make the standard (non-annotation) syntax obsolete. What about having them both?
EDIT: your :hasOccurrence property lacks the :inGraph aspect. One could however let that default to the local graph, like I proposed in my latest comment on #170 w. r. t. an identifier syntax.
I assume you meant "live" (and then call it a day ;-) But: great!
yes, I meant "live". I'm not leaving anywhere :-)
I would prefer, however, to have the properties in the opposite direction, i.e. from the quoted triple to the occurrence ("hasOccurrence", "hasCitation"...), because that makes it easier to use with the annotation syntax: (..)
Good point. But that doesn't make the standard (non-annotation) syntax obsolete.
of course not
What about having them both?
Of course, defining an owl:inverseOf
of rdf-star:hasOccurrence
and friends is always possible, if we don't mind having a larger vocabulary with some redundancy. My point was: if we keep only one direction, there is a practical argument for keeping that one.
EDIT: your :hasOccurrence property lacks the :in aspect. One could however let it default to the local graph, like I proposed in my comment #170 w. r. t. an identifier syntax.
cf. my proposal above where x:inGraph
and x:inSource
play exactly that role.
Maybe I'm missing something but IIUC your proposal for a property optimized for the annotation syntax only works when the referent to the occurrence is defined via one sole property. Therefore, if that was to be an occurrence, the definition would either be incomplete or could refer to the local graph.
Thinking of it: why not define the annotation syntax as referring to the referentially transparent occurrence in the local graph?
Regarding the following example:
:lizTaylor :marriedTo :richardBurton {| rdf-star:hasOccurrence [ :in "1964"^^xsd:gYear ], [ :in "1975"^^xsd:gYear ] |}.
While I agree that the property should be defined in the direction such that it can be used with the annotation syntax, I am highly confused about the example per se. My interpretation of this snippet of Turtle-star is that the triple (:lizTaylor, :marriedTo, :richardBurton) has two occurrences where one of them is in the year 1964 and the other one is in the year 1975. Now, what does it mean for a triple to occur in a year??
Probably my confusion has to do with the fact that it is not entirely clear what the notion of an "occurrence" of a triple actually is; at least, it is not totally clear to me.
@rat10 is this example how you were envisioning how the property rdf-star:hasOccurrence
would be used?
EDIT: your :hasOccurrence property lacks the :in aspect. One could however let it default to the local graph, like I proposed in my comment #170 w. r. t. an identifier syntax.
cf. my proposal above where x:inGraph and x:inSource play exactly that role.
I am getting more and more confused by the minute. Can someone define exactly what you mean by "occurrence"; i.e., by the types of things that are meant to be used in the object position of a triple with the predicate rdf-star:hasOccurrence? @rat10 since you are the main proponent of doing something about such "occurrences", can you give me such a definition?
Regarding the following example:
:lizTaylor :marriedTo :richardBurton {| rdf-star:hasOccurrence [ :in "1964"^^xsd:gYear ], [ :in "1975"^^xsd:gYear ] |}.
While I agree that the property should be defined in the direction such that it can be used with the annotation syntax, I am highly confused about the example per se. My interpretation of this snippet of Turtle-star is that the triple (:lizTaylor, :marriedTo, :richardBurton) has two occurrences where one of them is in the year 1964 and the other one is in the year 1975. Now, what does it mean for a triple to occur in a year??
Probably my confusion has to do with the fact that it is not entirely clear what the notion of an "occurrence" of a triple actually is; at least, it is not totally clear to me.
@rat10 is this example how you were envisioning how the property
rdf-star:hasOccurrence
would be used?
I'm AFK right now, so just trying to clear up confusion but taking the risk that I may well create more....
I think you are correct with your observation w.r.t. PA's use of the :in property, and I glossed over that as I thought it's an obvious glitch. If this use of :in was indeed meant to refer to the :in as defined alongside :occurrenceOf - if it is not meant to refer to ex:in but rdf-star:in, so to say - then it is indeed used wrongly and maybe PA can replace it by something like ex:during. Only under that assumption can my following comments on PA be understood: that the occurrence is not completely specified and cannot be specified by simple inverseOfs of :occurrenceOf and :in, but only by a combination of the two in which :in defaults to a predefined value, preferably the local graph.
But shouldn't such an ex:during
property be a property of the quoted triple rather than of the occurrence of the triple? (and I am assuming here that ex:during
can be a TEP)
"In" was a poor choice of term, because it seems to be part of the locution "occurrence in ..." , which was not my intention. It was not at all related to rdf-star:inGraph
or anything like that.
Another source of confusion is that I was using hasOccurrence
as a TEP here, as suggested by @rat10 above, and contrary to my original use of it.
Consider this new example, which hopefully is clearer:
:lizTaylor :marriedTo :richardBurton {| rdf-star:hasOccurrence
[ :since "1964"^^xsd:gYear; :until "1974"^^xsd:gYear ],
[ :since "1975"^^xsd:gYear ]
|}.
Now to try and answer @hartig's question above: what kind of thing is denoted by the two blank nodes in this example? My answer would be: "(the fact of) Liz Taylor being married to Richard Burton".
I am aware that one might interpret them subtly differently, e.g. as "(the claim of) Liz Taylor being married to Richard Burton", which would make the graph above nonsensical (or at least mean something totally different).
The way I see it, we may 1) define several variants of the "transparent occurrence" property to account for those subtle differences; 2) define only one such property, and document the fact that its semantics is purposefully broad; 3) refuse altogether the rathole of option 1 and the fuzziness of option 2, and leave it to the community to define their own properties for their own use-cases.
Now to try and answer @hartig's question above: what kind of thing is denoted by the two blank nodes in this example? My answer would be: "(the fact of) Liz Taylor being married to Richard Burton".
But isn't the (stated/claimed) fact of "Liz Taylor being married to Richard Burton" actually captured by the triple (:lizTaylor, :marriedTo, :richardBurton)? I have the suspicion that you, @pchampin, and @rat10 have a different understanding of what an "occurrence" is (or, maybe it's just me who doesn't get it ;) It seems to me that your understanding is about occurrences of relationships or facts that are stated/claimed by a triple whereas @rat10 wants to be able to talk about occurrences of the triples themselves (e.g., the triple (:lizTaylor, :marriedTo, :richardBurton) that is in a particular Turtle file). I see these as different things, but I may be wrong. @rat10 can you clarify what you mean by "occurrence"; i.e., by the types of things that are meant to be used in the object position of a triple with the predicate rdf-star:hasOccurrence?
I thought we had a common understanding of what an occurrence is and @hartig you are describing my understanding correctly. That is also the way in which the RDF specs use the term and the way it was used when we discussed this terminology with @pfps last year. I'm also puzzled by @pchampin's concerns. However it was a hectic exchange and maybe we got a little ahead of ourselves today. I'll have to think through the options laid out above again and then maybe make a rounded proposal. My working hypothesis is that it would be great and make perfect sense in more than one way if the annotation syntax would refer to the local transparent occurrence but I fear I'll be again alone with this (which of course somehow limits my willingness to put work into this - we'll see... ).
EDIT: NB: I wrote this before reading @rat10's answer above.
My initial understanding of "occurrence" was indeed: "the occurrence of a triple in an RDF source (turtle file, Triple store...)". As such, "hasOccurrence" and "occurrenceOf" would be non-TEPs, because an occurrence of ":superman :can :fly" must be distinguished from an occurrence of ":clark :can :fly".
In the comment above (near the end), @rat10 proposed to define
* "occurrenceOf as interpreted, referentially transparent" → I read this as "an occurrence of the fact described by the triple"
* "citationOf as a reference to the quoted, referentially opaque occurrence" → I read this as "an occurrence of the triple" as described above.
I replied that I could live with it. That's the agreement that I thought we had found, but I must say that @rat10's comment above makes me doubt: I don't understand how "occurrences of facts" should be specified as being in a graph... The thing denoted by the blank node in my example, which involves Liz Taylor and Richard Burton, and started in 1964, did not occur in a graph, it occurred in Montreal.
RDF allows for "Schema Last," but problems arise when considering example data without considering what its schema might be. At a minimum, some care must be taken about entity types, relationship types, and relationship values.
A wedding is an occurrence, an event, with a datestamp, if not a timestamp. Of course, it also has a duration, but this is typically measured in hours if not minutes. Liz and Richard had two of these with each other, in 1975 and 1964.
A marriage is less of an event, and more a state of being, with a start, a duration, and an end, either by divorce or death (both of which are events, as was the wedding and each party's birth). Liz and Richard had two of these with each other, running from 1964-1974, and 1975-1976.
Today, it would not be sensible to say "Liz is married to Richard," but it would be to say "Liz was married to Richard", though this state did not pertain at the time of either of their deaths.
All of which is to say -- sometimes, what seems a simple example is not -- and likewise what seems a complex example may not be so! Keeping these sorts of things straight will help a great deal in discussing examples which are meant to bring clarity to complex discussions.
EDIT: NB: I wrote this before reading @rat10's answer above.
My initial understanding of "occurrence" was indeed: "the occurrence of a triple in an RDF source (turtle file, Triple store...)". As such, "hasOccurrence" and "occurrenceOf" would be non-TEPs, because an occurrence of ":superman :can :fly" must be distinguished from an occurrence of ":clark :can :fly".
In the comment above (near the end), @rat10 proposed to define
* "`occurrenceOf` as interpreted, referentially transparent" → I read this as "an occurrence of the fact described by the triple"
* "`citationOf` as a reference to the quoted, referentially opaque occurrence" → I read this as "an occurrence of the triple" as described above.
I replied that I could live with it. That's the agreement that I thought we had found, but I must say that @rat10's last comment makes me doubt: I don't understand how "occurrences of facts" should be specified as being in a graph... The thing denoted by the blank node in my example, which involves Liz Taylor and Richard Burton, and started in 1964, did not occur in a graph, it occurred in Montreal.
It seems to me that you introduce an aspect that we have so far not been discussing at all: the long known problem of identification semantics in RDF i.e. disambiguating if a URI is used to indicate a (web) resource or denote what that (web) resource refers to, a.k.a. "social meaning", "the identity crisis of the semantic web", httpRange-14, Cool URIs etc. I would think that that problem is out of scope.
The distinction I make between :occurrenceOf
and :citationOf
is that the latter is referentially opaque while the former is referentially transparent, and nothing else. Vulgo, the latter doesn't support any entailments, the former does. In both cases the referent is a statement not as a type but as it occurs in a graph.
Practically the source of the confusion might be that your example about Liz Taylor and Richard Burton uses :in
in a totally different way than Example 8. I'd suggest that we change the name of the :in
property to :inGraph
or :inSource
to avoid such confusion in the future as :in
is just too broad and thereby invites misunderstandings. I'd prefer :inGraph
but :inSource
seems to be the more prudent approach and more capable of securing a majority.
In my understanding we already have a clear path from referentially opaque types to referentially transparent occurrences: << :s :p :o >>
is a referentially opaque type. Now paraphrasing Example 8:
_:b :occurrenceOf << :s :p :o >> ;
:inSource :someSnippetOfRDF .
_:b is a reference to the referentially transparent occurrence of << :s :p :o >> in :someSnippetOfRDF. In other words _:b refers to the meaning of << :s :p :o >> - if there is e.g. an owl:sameAs statement relating :s and :s2 then _:b refers to << :s2 :p :o >> as well. In contrast the proposed :citationOf would always refer only to << :s :p :o >>, not to any entailed co-denotations like << :s2 :p :o >>. That should all be rather boring and all too clear by now, I hope.
The question at hand now is how the annotation syntax fits into this picture. IMO it would be wise to define the annotation syntax as referring to the referentially transparent occurrence. I suggest that for example:
:s :p :o {| :v :w ;
:y :z |}
would be expanded to
:s :p :o .
_:b :occurrenceOf << :s :p :o >> ;
:inSource <> ;
:v :w ;
:y :z .
assuming that
:occurrenceOf rdf:type rdf-star:TransparencyEnablingProperty .
The relation between the two syntaxes and their different semantics is clearly defined by the syntax sketched in Example 8 and the TEP semantics of :occurrenceOf
. Defaulting to the local graph is IMO the only sensible design for the shortcut syntax.
Applying this to the example from above we can get rid of some blank nodes as those are contained in the expansion. Rather unsurprisingly
:lizTaylor :marriedTo :richardBurton {|
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear ;
:since "1975"^^xsd:gYear
|}.
wouldn't meet the intended meaning as it would expand to
:lizTaylor :marriedTo :richardBurton .
_:ltrb :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear ;
:since "1975"^^xsd:gYear .
However the following snippet of annotation syntax
:lizTaylor :marriedTo :richardBurton {|
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear
|}.
:lizTaylor :marriedTo :richardBurton {|
:since "1975"^^xsd:gYear
|}.
would expand to the intended result
:lizTaylor :marriedTo :richardBurton .
_:ltrb1 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear .
_:ltrb2 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1975"^^xsd:gYear .
and IMO this is exactly how it should be and the best way to implement multisets in RDF.
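To make the proposed expansion concrete, here is a small Python sketch of it. It only illustrates the suggestion made in this comment, under the assumption that each annotation block mints one fresh blank node; the function name `expand_annotations`, the `_:occN` labels and the tuple encoding of triples are all invented for the example.

```python
from itertools import count


def expand_annotations(annotated, source="<>"):
    """Expand (triple, annotations) pairs as proposed above.

    Each annotation block yields: the asserted triple itself, one fresh
    occurrence node pointing at the quoted triple, an :inSource link
    defaulting to the local document, and the annotations attached to
    that occurrence node.
    """
    fresh = count(1)
    graph = set()  # a set, so the asserted triple is stored only once
    for triple, annotations in annotated:
        graph.add(triple)
        node = f"_:occ{next(fresh)}"
        graph.add((node, ":occurrenceOf", triple))  # nested tuple = quoted triple
        graph.add((node, ":inSource", source))
        for predicate, obj in annotations:
            graph.add((node, predicate, obj))
    return graph


marriage = (":lizTaylor", ":marriedTo", ":richardBurton")

# Two separate annotation blocks, as in the second snippet above.
two_blocks = expand_annotations([
    (marriage, [(":since", '"1964"^^xsd:gYear'), (":until", '"1974"^^xsd:gYear')]),
    (marriage, [(":since", '"1975"^^xsd:gYear')]),
])

# One block carrying all three annotations, as in the first snippet.
one_block = expand_annotations([
    (marriage, [(":since", '"1964"^^xsd:gYear'),
                (":until", '"1974"^^xsd:gYear'),
                (":since", '"1975"^^xsd:gYear')]),
])
```

With two blocks the result has two distinct occurrence nodes, each with its own dates; with one block all three dates land on the same node, which is the unintended reading described above.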
Please note that this specific problem of the same statement with different annotations is the hardest problem of all that we have to solve as it has to dance around the set based semantics of RDF (and does it the same way as RDF standard reification).
This would realign RDF-star with the semantics inherent in the seminal example and a majority of the use cases. In other words:
This would greatly mitigate a risk that I see w.r.t. the current state of the proposed semantics: namely that the proposed semantics gets ignored by a lot of published RDF-star because it makes it so hard to express the mainstream use cases. I know we have different opinions on the severity of that problem but I hope we can agree that the above proposal would make it much easier to avoid. IMO this might be good enough to prevent users from ignoring the proposed semantics (still: fingers crossed...).
Addressing nothing else here --
[@rat10] I'd suggest that we change the name of the :in property to :inGraph or :inSource to avoid such confusion in the future as :in is just too broad and thereby invites misunderstandings. I'd prefer :inGraph but :inSource seems to be the more prudent approach and more capable of securing a majority.
I agree with your conclusion, :inSource. The reason is that :inGraph feels restrictive and implies a Named Graph, which may or may not be in play, while :inSource is clearly more flexible and allows for any RDF Source. Whether or not this predicate propagates beyond this discussion, :inSource will be applicable to more usage scenarios.
The distinction I make between :occurrenceOf and :citationOf is that the latter is referentially opaque while the former is referentially transparent, and nothing else.
Ok, fine by me. That corresponds to option 2 at the end of my comment above. So many different things can be considered a "transparent occurrence" of a triple (events, states of being, statings...).
But that contradicts your proposal to automatically include an :inSource property when expanding the annotation syntax, because not many different things can occur in an RDF source. Take your example above:
_:ltrb1 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear .
What does _:ltrb1 denote? Something that started in 1964, ended in 1974, and occurred in an RDF source??
That cannot be Liz and Richard's first marriage (it did not occur in any RDF source).
That cannot be the stating of this triple either (no RDF source existed in 1964).
From where I stand this graph is inconsistent.
More formally:
* :since and :until have a common domain (call it :StateOfBeing)
* :inSource has another domain (call it :Stating, or :Statement, or :Assertion)
@TallTed
I agree with your conclusion, :inSource. The reason is that :inGraph feels restrictive and implies a Named Graph
I agree that :inSource is better, but for a very different reason ;-) :inGraph does not refer, for me, to Named Graph, but to RDF graph, which is a very abstract thing that we do not interact with directly. We interact with, for example, Turtle files or RDF/XML HTTP resources, which are RDF sources.
@rat10 and others --
Note that the second marriage of Liz & Richard ended the year after it began, so the examples above should have an :until "1976"^^xsd:gYear, e.g. --
_:ltrb2 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1975"^^xsd:gYear ;
:until "1976"^^xsd:gYear .
The distinction I make between :occurrenceOf and :citationOf is that the latter is referentially opaque while the former is referentially transparent, and nothing else.
Ok, fine by me. That corresponds to option 2 at the end of my comment above. So many different things can be considered a "transparent occurrence" of a triple (events, states of being, statings...).
But that contradicts your proposal to automatically include an :inSource property when expanding the annotation syntax, because not many different things can occur in an RDF source. Take your example above:
_:ltrb1 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear .
What does _:ltrb1 denote? Something that started in 1964, ended in 1974, and occurred in an RDF source?? That cannot be Liz and Richard's first marriage (it did not occur in any RDF source). That cannot be the stating of this triple either (no RDF source existed in 1964). From where I stand this graph is inconsistent.
More formally:
* I would expect that `:since` and `:until` have a common domain (call it `:StateOfBeing`)
* I would expect that `:inSource` has another domain (call it `:Stating`, or `:Statement`, or `:Assertion`)
* I would expect that both domains are disjoint, which leads to a contradiction.
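The three expectations listed above can be checked mechanically. The Python sketch below is only an illustration of that argument: the domain table and the disjointness axiom encode the commenter's stated expectations, not anything defined by RDF-star, and the helper names `inferred_types` and `clashes` are made up for this example.

```python
# Assumed domains, taken from the expectations listed above.
DOMAINS = {
    ":since": ":StateOfBeing",
    ":until": ":StateOfBeing",
    ":inSource": ":Stating",
}
# Assumed disjointness between the two domains.
DISJOINT = [frozenset({":StateOfBeing", ":Stating"})]


def inferred_types(triples):
    """Apply rdfs:domain-style inference: the subject gets the property's domain."""
    types = {}
    for s, p, _ in triples:
        if p in DOMAINS:
            types.setdefault(s, set()).add(DOMAINS[p])
    return types


def clashes(triples):
    """List nodes that are inferred to belong to two disjoint classes."""
    found = []
    for node, classes in inferred_types(triples).items():
        for pair in DISJOINT:
            if pair <= classes:
                found.append((node, tuple(sorted(pair))))
    return found


graph = [
    ("_:ltrb1", ":occurrenceOf", "<< :lizTaylor :marriedTo :richardBurton >>"),
    ("_:ltrb1", ":inSource", "<>"),
    ("_:ltrb1", ":since", '"1964"^^xsd:gYear'),
    ("_:ltrb1", ":until", '"1974"^^xsd:gYear'),
]
print(clashes(graph))
```

Under these assumptions `_:ltrb1` is flagged. Without the disjointness axiom nothing is flagged, so the inconsistency depends on accepting that third expectation.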
You are just repeating your argument. Would you please comment on the first paragraph of my answer, i.e. that the problem you point out is well known under various monikers like "identity crisis", "httpRange-14" etc. and that I consider it out of scope. Solutions have been proposed, e.g. in Cool URIs, and such solutions would be applicable to references to occurrences like _:ltrb1 in the example above. To my knowledge they are seldom used in practice, as other means like vocabularies provide good enough disambiguation. If all such means are insufficient, more explicit modelling involving some specific vocabulary can be employed.
An example of how my proposal above can be extended with more precise identification:
_:ltrb1 :occurrenceOf << :lizTaylor :marriedTo :richardBurton >> ;
:inSource <> ;
:denotes [ :since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear ] ;
:indicates [ :source :wikipedia ] .
But this is beyond the current task to define a way to refer to some occurrence by means of the annotation syntax, and is indeed based on it. We may even decide to define subproperties of :occurrenceOf like :indicatedByOccurrenceOf and :denotedByOccurrenceOf, but nonetheless we will have to define the basic mechanism of how to refer to an occurrence in the annotation syntax first. My proposal for that is based on the standard syntax, to which the problem you point out applies just as well. I would like you to comment on my proposal above as what it is and tries to be. If you want to discuss the problem of identification semantics in RDF, please do so in a separate issue as it concerns some more areas of RDF-star.
@rat10 The aim to define a vocabulary for describing occurrences of triples/statements (?) based on RDF-star (i.e., the topic of this issue) and your idea to re-purpose the Turtle-star annotation syntax for capturing such descriptions more succinctly are separate things. In other words, the discussion of the vocabulary should not be intertwined with matters related to the annotation syntax.
So, to continue talking about the vocabulary, I am with @pchampin. There is still no clearly articulated understanding of what exactly the type of thing is that is meant to be used in the subject position of a triple with the predicate rdf-star:occurrence. @pchampin's recent comment makes clear that there is ambiguity in the examples. I don't think that this ambiguity has anything to do with the httpRange-14 issue or Cool URIs (after all, the discussion here is completely orthogonal to the fact that HTTP URIs can be used as Web addresses; in fact, the examples here do not even use URIs in the place that our discussion is concerned with). Even if it would, this does not mean we should throw our hands up and simply introduce the property rdf-star:occurrence without saying what its intended domain is. Unless we have a definition of what this domain is, I don't see any value in introducing this property. @rat10 I don't think you have provided such a definition in this thread, but rather referred to mailing list discussions, etc. For the purpose of making progress here, can you please provide a concrete proposal of what the definition of the domain of rdf-star:occurrence should look like.
@hartig
@rat10 The aim to define a vocabulary for describing occurrences of triples/statements (?) based on RDF-star (i.e., the topic of this issue) and your idea to re-purpose the Turtle-star annotation syntax for capturing such descriptions more succinctly are separate things. In other words, the discussion of the vocabulary
The vocabulary terms :occurrenceOf and :in have been part of the draft report for a few months now. Please explain what you think is still missing.
should not be intertwined with matters related to the annotation syntax.
However, the question of how occurrences are referred to in the annotation syntax came up recently, and that is what my recent proposal addresses.
So, to continue talking about the vocabulary, I am with @pchampin. There is still no clearly articulated understanding of what exactly the type of thing is that is meant to be used in the subject position of a triple with the predicate rdf-star:occurrence.
I assume that you mean rdf-star:occurrenceOf. If not I wouldn't know what you refer to.
@pchampin's recent comment makes clear that there is ambiguity in the examples. I don't think that this ambiguity has anything to do with the httpRange-14 issue or Cool URIs (after all, the discussion here is completely orthogonal to the fact that HTTP URIs can be used as Web addresses; in fact, the examples here do not even use URIs in the place that our discussion is concerned with). Even if it would,
Trust me, it does. And your concern w.r.t. blank nodes is unfounded.
this does not mean we should throw our hands up and simply introduce the property rdf-star:occurrence without saying what its intended domain is. Unless we have a definition of what this domain is, I don't see any value in introducing this property. @rat10 I don't think you have provided such a definition in this thread, but rather referred to mailing list discussions,
Did I? I rather thought I had presented a reasonably succinct and self-contained proposal w.r.t. the annotation syntax and the whole topic of annotating occurrences. OTOH I'm not aware of any other attempt to reconcile the original purpose of RDF* and what the proposed semantics transformed it into in an equally balanced fashion. So maybe you should take a second look at it.
etc. For the purpose of making progress here, can you please provide a concrete proposal of what the definition of the domain of rdf-star:occurrence should look like.
The domain of rdf-star:occurrenceOf is rdf:Statement. I hope that answers your question and allows us to make progress.
The domain of rdf-star:occurrenceOf is rdf:Statement. I hope that answers your question
I think it does. Now, referring to RDF11-MT (https://www.w3.org/TR/rdf11-mt/#reification):
The subject of a reification [i.e. the subject of rdf:type rdf:Statement] is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax.
Can we agree that the marriage of two persons is not "a concrete realization of an RDF triple, such as a document in a surface syntax"? In which case my example above would be wrong. I agree that this could be fixed by adding yet another intermediary node (as you proposed above), but I thought you found it already too cumbersome with one such intermediary node... And what would be the benefit of inserting automatically the first intermediary node with the annotation syntax, if the user still needed to add one?
In other words, if one has to write
:lizTaylor :marriedTo :richardBurton {|
:denotes [ :since "1964"^^xsd:gYear ;
:until "1974"^^xsd:gYear ] ;
:indicates [ :source :wikipedia ]
|}
why not let :denotes and :indicates apply directly on quoted triples, each with their own semantics and opacity/transparency?
The domain of rdf-star:occurrenceOf is rdf:Statement. I hope that answers your question
I think it does. Now, referring to RDF11-MT (https://www.w3.org/TR/rdf11-mt/#reification):
The subject of a reification [i.e. the subject of rdf:type rdf:Statement] is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax.
Your quote is not correct. Where you let it end with a full stop it does actually continue. The correct quote is:
The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object.
and from the last sub-sentence it is clear that the distinction the text wants to make is not between denotation and indication but between triple (as type) and occurrence. Therefore, according to the spec, rdf:Statement describes rather precisely an occurrence in the sense that I see useful: as occurring in some source but referring to the interpretation, not the literal representation (the latter is how the proposed semantics defines embedded triples). The spec says nothing about the orthogonal distinction between denotation and indication - certainly not in the reification section but also not in any other place IIRC. If @pfps has another take on this I hope he speaks up. Otherwise I consider your concern unfounded.
We could define a class :Occurrence as rdf:Statement and :occurrenceOf to make sure that occurrences derived from embedded triples and equipped with a notion of source are disambiguated from occurrences that are declared via the standard reification vocabulary and unfortunately lack the syntax to declare their source (and are therefore 'underspecified' as @pfps called it IIRC).
By symmetry we could define a class :Citation as rdf:Statement and :citationOf to declare referentially opaque occurrences. I'm a bit undecided about this one right now - maybe it makes things too complicated. OTOH I see no better way to cite what somebody said in a specific moment, which is indeed a not uncommon use case. Sometimes it's good to follow the principle of symmetry even if one can't envision any application. It might however not be trivial to define the formal model-theoretic semantics and I can't be of help there.
Can we agree that the marriage of two persons is not "a concrete realization of an RDF triple, such as a document in a surface syntax"? In which case my example above would be wrong. I agree that this could be fixed by adding yet another intermediary node (as you proposed above), but I thought you found it already too cumbersome with one such intermediary node... And what would be the benefit of inserting automatically the first intermediary node with the annotation syntax, if the user still needed to add one?
In other words, if one has to write
:lizTaylor :marriedTo :richardBurton {| :denotes [ :since "1964"^^xsd:gYear ; :until "1974"^^xsd:gYear ] ; :indicates [ :source :wikipedia ] |}
why not let :denotes and :indicates apply directly on quoted triples, each with their own semantics and opacity/transparency?
[I changed the quote marks to what I assume was your intention]
@pchampin, you have been introducing this issue, not me. I was just trying to point at a more appropriate approach to tackling it. If you think the RDF-star vocabulary should be extended to accommodate differentiations between indication and denotation, go ahead, feel free to take my stub or anything else, and make a proposal.
However I would advise against it as the problem is much bigger than references to occurrences alone. Please note that the above code snippet doesn't make any declarations as to whether the IRIs :lizTaylor and :richardBurton (let's assume they belong to the well-known ex namespace) are denoting or indicating their subject. So we might have stated that two webpages were married before the web existed. In that (non)sense all of RDF, including all our examples, is full of contradictions and it is a miracle that the semantic web achieved anything at all. Or rather it is the advantage of engineers over logicians that they are trained at disambiguating theoretical problems from practical ones.
To anyone still not convinced that identification on the semantic web - which is not disambiguating indication and denotation - is a very thorny problem at the very heart of the semantic web (and indeed one that you could throw at any RDF-related proposal to accuse it of creating contradictions, including the RDF specs themselves), may I suggest reading up on it in a lively treatment by Harry Halpin in his dissertation, "Social Semantics - The Search for Meaning on the Web" from 2012, Section 4.1.
@pchampin, you and I discussed this issue a few years ago on semantic-web@w3.org so I know for sure that you are very well informed about its nature and the deepness of the problem. Perhaps not surprisingly I was as unconvinced about the solution you proposed then (create two different identifiers for everything, one indicating, the other denoting), as I am about the idea of creating a second property for all properties that you support now as the easy way from referentially opaque types to referentially transparent occurrences.
I am surprised though that you bring this problem up at this very moment when I'm proposing a solution that reconciles the proposed semantics with the "expected" semantics (the latter being the one that is inherent in all examples predating this CG) by letting each have its own syntax, complemented by a clear path from one to the other via the occurrence vocabulary. It still needs some fleshing out but wouldn't you agree that this approach promises to benefit everybody?
Your quote is not correct. Where you let it end with a full stop it does actually continue. The correct quote is:
The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object.
and from the last sub-sentence it is clear that the distinction the text wants to make is not between denotation and indication but between triple (as type) and occurrence.
Indeed. (btw, I consider that the type-occurrence distinction is not a topic of disagreement, that's why I omitted that part).
Therefore according to the spec rdf:Statement describes rather precisely an occurrence
As opposed to a type, yes.
in the sense that I see useful:
... and this is where I am not following you. See below.
as occurring in some source but referring to the interpretation, not the literal representation (the latter is how the proposed semantics defines embedded triples).
This might be your reading of "concrete realization of an RDF triple", but it is not mine. That part of the text does not refer to any interpretation (which would be required to consider the denotation of a triple, because the same triple may have different denotations in different interpretations...). OTOH the term "syntax" is explicitly used (in the following), as well as the reference to "RDF triple", which is an element of the abstract syntax.
Let's look at the following sentence from the spec: "This supports use cases where properties such as dates of composition or provenance information are applied to the reified triple." Yes, those use cases!...
The spec says nothing about the orthogonal distinction between denotation and indication
Neither did I, by the way. As far as I can tell, you came up with the term "indication" without providing a definition.
(...) Otherwise I consider your concern unfounded.
Let's agree to disagree, then.
Your quote is not correct. Where you let it end with a full stop it does actually continue. The correct quote is:
The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object.
and from the last sub-sentence it is clear that the distinction the text wants to make is not between denotation and indication but between triple (as type) and occurrence.
Indeed. (btw, I consider that the type-occurrence distinction is not a topic of disagreement, that's why I omitted that part).
Yeah, one has to be careful with quotes. Pulling them out of context or shortening them without proper indication, even changing punctuation, can lead to very misleading mis-representations.
Therefore according to the spec rdf:Statement describes rather precisely an occurrence
As opposed to a type, yes.
in the sense that I see useful:
... and this is where I am not following you. See below.
as occurring in some source but referring to the interpretation, not the literal representation (the latter is how the proposed semantics defines embedded triples).
This might be your reading of "concrete realization of an RDF triple", but it is not mine. That part of the text does not refer to any interpretation (which would be required to consider the denotation of a triple, because the same triple may have different denotations in different interpretations...). OTOH the term "syntax" is explicitly used (in the following), as well as the reference to "RDF triple", which is an element of the abstract syntax.
RDF 1.1 Semantics, D.1 Reification confirms my reading, any other references to syntax that the spec makes notwithstanding:
Reification is not a form of quotation. Rather, the reification describes the relationship between a token of a triple and the resources that the triple refers to.
And it continues:
The value of the rdf:subject property is not the subject IRI itself but the thing it denotes, and similarly for rdf:predicate and rdf:object. For example, if the referent of ex:a is Mount Everest, then the subject of the reified triple is also the mountain, not the IRI which refers to it.
It couldn't be any clearer.
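For readers less familiar with the reification vocabulary quoted above, here is a minimal Python sketch of the four triples that make up a standard reification. The helper `reify`, the node label `_:r` and the sample terms `ex:a`, `ex:b`, `ex:c` are assumptions for illustration; only the `rdf:` terms come from the RDF specification.

```python
def reify(triple, node):
    """Return the standard reification triples for `triple` (hypothetical helper)."""
    s, p, o = triple
    return [
        (node, "rdf:type", "rdf:Statement"),
        # The objects are the terms themselves, used transparently,
        # not quoted strings that name those terms.
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]


for t in reify(("ex:a", "ex:b", "ex:c"), "_:r"):
    print(" ".join(t), ".")
```

Because `ex:a` appears as an ordinary term in the object position of `rdf:subject`, it is interpreted like any other IRI, which matches the spec's statement that the subject of the reified triple is the thing denoted.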
Let's look at the following sentence from the spec: "This supports use cases where properties such as dates of composition or provenance information are applied to the reified triple." Yes, those use cases!...
I have no idea what you want to express with this quote and comment. What I note is that it isn't specific about the referent of such provenance information: the triple itself or what it refers to - again the same ambiguity inherent to identification on the semantic web. What I would advise against reading into it is that reification can only be used for provenance. We had a discussion with Pat Hayes last year on either the CG mailing list or semantic-web@w3.org (I don't remember precisely but I can look that up if necessary) where he cautioned that what we can't do is annotate statements to the effect of revoking them as that would run against the monotonicity of RDF. Everything else is fair game. [EDIT: I found the mail, and I remembered it wrongly. Pat does describe the issue in more prudent terms, saying that provenance annotations are unproblematic but everything that might change the truth value of the statement being annotated has to be treated with great care.]
The spec says nothing about the orthogonal distinction between denotation and indication
Neither did I, by the way. As far as I can tell, you came up with the term "indication" without providing a definition.
You introduced the topic, I introduced the terms "indicate" and "denote" only to discuss it, and I made it very clear that I find the concern you voiced unfounded and the discussion out of place.
(...) Otherwise I consider your concern unfounded.
Let's agree to disagree, then.
That would presuppose some real discussion. So far you've not provided any useful arguments w.r.t. the topic at hand: my proposal for a semantics for the annotation syntax.
This was discussed during today's call: https://w3c.github.io/rdf-star/Minutes/2021-10-29.html#r01
After long debate it was decided that embedded triples refer to an ("abstract") triple, not to any specific occurrence. Use cases like the seminal provenance use case [0] however need to refer to a specific occurrence to document e.g. when a certain triple was inserted into a certain graph. The draft report so far contains an example [1] that shows how an occurrence can be derived from a triple, using a made up property in the notorious example.org namespace.
The question now is: do we properly define this property as part of the RDF-star vocabulary?
Pro: this is an important use case and we should provide more than just a non-committal example. Con: the property alone is not enough. A complete solution would also have to provide a means to describe the graph in which the triple occurs. Otherwise the provided solution is just as underspecified as RDF standard reification. This however is relatively uncharted territory and touches e.g. the mined area of named graphs.
[0] https://w3c.github.io/rdf-star/cg-spec/2021-04-13.html#the-seminal-example [1] https://w3c.github.io/rdf-star/cg-spec/2021-04-13.html#occurrences-example