w3c / rdf-star

RDF-star specification
https://w3c.github.io/rdf-star/

declaring referential opacity/transparency #170

Closed rat10 closed 2 years ago

rat10 commented 3 years ago

One of the glaring omissions of the proposed semantics is that it leaves it to the user to find out how to implement referentially transparent embedded triples, even though referential transparency is the default - or even the only? - modus operandi of the semantic web.

RDF-star should at least provide a vocabulary that makes it possible to express which referentiality semantics the embedded triples in some piece of RDF implement. Under the proposed semantics this is indispensable to publish and exchange data with referentially transparent embedded triples. If that is not possible then the central claim that referential transparency can easily be added on top of default opacity is void.

Experience with N3 might be useful here, e.g. the log:semantics property IIUC. AFAICT it would be necessary to be able to make explicit that some specific or all triple terms in some graph or document or dataset are referentially transparent (or opaque).

The same vocabulary could of course be used the other way round: to declare in an otherwise referentially transparent environment like the semantic web that some embedded triples are intended to be interpreted as referentially opaque.

hartig commented 3 years ago

The same vocabulary could of course be used the other way round: to declare in an otherwise referentially transparent environment like the semantic web that some embedded triples are intended to be interpreted as referentially opaque.

I don't see how this can be achieved. Can you please elaborate more on how you think this can work?

rat10 commented 3 years ago

@hartig I'm hoping that someone more familiar with N3 comments on this idea before I make a clumsy proposal. But to clarify: I think some vocabulary to declare which semantics are in use is the least we can do to help go from referentially opaque to transparent embedded triples. E.g. define the following property and classes:

rdfx:semantics
rdfx:TripleTermRefOpaque
rdfx:TripleTermRefTransparent

and use them to e.g. state that embedded triples in some graph are referentially transparent:

:graph_1 rdfx:semantics rdfx:TripleTermRefTransparent

or that some embedded triple is to be interpreted as referentially transparent:

<<:s :p :o>> rdfx:semantics rdfx:TripleTermRefTransparent

Again, this is clumsy. I didn't spend much time on thinking it through. I am however quite convinced that to not only implement referentially transparent embedded triples in local applications but also to publish and exchange such data we need a way to declare the chosen semantics.

Once such a vocabulary is created we could however also turn the table and define embedded triples as by default referentially transparent and then use the vocabulary to declare them referentially opaque for example when we work on an explainable AI use case.

[Edit: corrected TripleTermTransparent to TripleTermRefTransparent in the two examples above]

gkellogg commented 3 years ago

In N3, log:semantics is a built-in used as a predicate. According to the documentation of the term, the subject is an IRI referencing an N3 document, and the object is a Formula. Typically, the object will be a variable, and a new Formula is derived from the referenced document. As mentioned, it is basically a combination of the log:content of the IRI, which is then parsed as N3 using log:parsedAsN3. So, the result is a new Formula (possibly recursive) based on that document. It is then used for subsequent reasoning passes, effectively including it in the dataset.

To do what I think you want, it might be best to associate a class such as rdfx:TripleTermRefTransparent to the embedded triple which would allow reasoners that understood this to treat them transparently. Perhaps something like the following:

:superman :can :fly {| a rdfx:TripleTermRefTransparent; :stated :LoisLane |} .

But I could misunderstand your thinking here. In N3, this applies to Named Graphs/Formulae, RDF-star uses embedded triples, so you would need to describe the embedded triple as such. Of course, it could potentially be done using a predicate with a domain or range including rdfx:TripleTermRefTransparent.

hartig commented 3 years ago

@rat10 your response now is primarily about the case of defining the semantics of a vocabulary that, on top of the current opacity-by-default semantics, may allow users to say that some of their embedded triples can be interpreted with referential transparency instead. However, that's not the case that I was asking you to elaborate on. In contrast, I explicitly asked you to elaborate on the opposite case; that is, how to define the semantics of a vocabulary that, on top of a possible transparency-by-default semantics, allows users to say that some of their embedded triples are meant to be treated as referentially opaque. According to your text that I have quoted in my previous question, this should "of course" be possible. So, when you write "of course," I suppose you know how to do it and you seem to assume that everyone else knows as well. Unfortunately, I don't. Please let me know.

rat10 commented 3 years ago

@gkellogg Thanks! I think you understood me correctly. I can't be completely sure as long as I haven't a better grasp of N3 but your proposal to type a triple term as rdfx:TripleTermRefTransparent looks good. But maybe rdf:type is a bit too generic, and it wouldn't immediately work for graphs and datasets (as they are not of type triple term). However I do find that important as otherwise it's just too tedious to be useful for more than very special cases. So we would either have to define another class to cover them - something like rdfx:AllTripleTermsContainedHereinRefOpaque - or go with the idea of defining a special property instead of using RDF's type mechanism.

@hartig In the opposite case, where the semantics defaults to referentially transparent triple terms, a statement declaring some term as referentially opaque could look like this (in the syntax I sketched above):

<<:s :p :o>> rdfx:semantics rdfx:TripleTermRefOpaque

or, to declare all embedded triples in some graph or dataset as referentially opaque, like this:

:GraphOrDataset rdfx:semantics rdfx:TripleTermRefOpaque

In a syntax following @gkellogg's proposal:

:s :p :o .
<<:s :p :o>> a rdfx:TripleTermRefOpaque .

and

:s :p :o  {| a rdfx:TripleTermRefOpaque; :y :z |}

respectively.

pchampin commented 3 years ago

Regardless of the default semantics of embedded triples, I don't think that overriding that semantics on a per-triple basis is the way to go... What would be the use case for stating that :superman :can :fly is transparent (hence equivalent to :clark :can :fly) while :superman :can :seeThroughWalls is opaque (pun not intended)?

A much better way is, IMHO, to do that on a per-property basis. Some properties (e.g. "strict" provenance) are about the opaque triple, while other properties (e.g. relation qualifiers) are about transparent triples / statements.

This could be achieved by creating subclasses of rdf:Property, e.g.:

:superman owl:sameAs :clark.
:clark :said << :clark :likes :lois >>.
# does NOT entail :clark :said << :superman :likes :lois >>

:since a x:SubjectTransparentProperty.
:clark :likes :lois {| :since "1938-06" |}.
# DOES entail << :superman :likes :lois >> :since "1938-06"

(that is, assuming 'referential opacity' as the default)
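A minimal Python sketch of this per-property idea, assuming opacity as the default. Terms are plain strings and an embedded triple is a 3-tuple; the function names `aliases` and `entail_subject_transparent` are hypothetical, and only the subject and object of the embedded triple are substituted. It is an illustration of the intended entailments, not a definition of the semantics.

```python
SAME_AS = "owl:sameAs"

def aliases(term, triples):
    """Return `term` plus every term directly declared owl:sameAs it."""
    out = {term}
    for s, p, o in triples:
        if p == SAME_AS:
            if s == term:
                out.add(o)
            elif o == term:
                out.add(s)
    return out

def entail_subject_transparent(triples, transparent_props):
    """Infer new triples only where a declared transparent property
    has an embedded triple (a 3-tuple) as its subject."""
    inferred = set()
    for s, p, o in triples:
        if p in transparent_props and isinstance(s, tuple):
            es, ep, eo = s
            for es2 in aliases(es, triples):
                for eo2 in aliases(eo, triples):
                    new = ((es2, ep, eo2), p, o)
                    if new not in triples:
                        inferred.add(new)
    return inferred

graph = {
    (":superman", SAME_AS, ":clark"),
    (":clark", ":said", (":clark", ":likes", ":lois")),
    ((":clark", ":likes", ":lois"), ":since", "1938-06"),
}
# ":since" plays the role of an x:SubjectTransparentProperty; ":said" does not.
inferred = entail_subject_transparent(graph, {":since"})
```

Because `:said` is not declared transparent, the embedded triple in its object position is left untouched, which mirrors the "does NOT entail" line above.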

hartig commented 3 years ago

@rat10 What you have done now is to mint two URIs (rdfx:semantics and rdfx:TripleTermRefOpaque). That's fine but the more important aspect (and the one that I don't see how it can be done) is: How do you propose to define the semantics of these URIs such that they can be used on top of a reasoner that implements a possible transparency-by-default semantics for embedded triples?

pchampin commented 3 years ago

@rat10, I'll try to succinctly explain why the rdfx:TripleTermRefOpaque you propose above can not work.

If the default semantics is transparency, it means that, whenever we have :superman owl:sameAs :clark, we also have:

<< :superman :can :fly >> owl:sameAs << :clark :can :fly >>.

and so any triple involving << :superman :can :fly >> would automatically entail a similar triple where you replace this triple with << :clark :can :fly >>. In particular:

:lois :knowsThat << :superman :can :fly >>.
   # automatically entails
:lois :knowsThat << :clark :can :fly >>.

No additional triple can prevent that from happening, because RDF semantics is monotonic. Quoting https://www.w3.org/TR/rdf11-mt/#dfn-monotonic :

a semantic extension cannot "cancel" an entailment made by a weaker entailment regime

So there is no conformant way to define the semantics of rdfx:TripleTermRefOpaque to cancel or prevent the entailments mandated by the default semantics.
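The monotonicity argument can be sketched in Python with a toy transparency-by-default closure. The representation (strings for terms, 3-tuples for embedded triples) and the names `rewrite` and `transparent_closure` are assumptions made for illustration; this is not an implementation of any actual entailment regime. The point is only that the closure of a larger graph contains the closure of the smaller one, so the extra declaration cannot remove anything.

```python
SAME_AS = "owl:sameAs"

def rewrite(term, old, new):
    """Replace `old` by `new` inside a term, descending into embedded triples."""
    if isinstance(term, tuple):
        return tuple(rewrite(t, old, new) for t in term)
    return new if term == old else term

def transparent_closure(triples):
    """Toy regime: owl:sameAs substitution also reaches into embedded triples."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in closed if p == SAME_AS]
        for a, b in pairs:
            for old, new in ((a, b), (b, a)):
                for t in list(closed):
                    if t[1] == SAME_AS:
                        continue  # leave the sameAs statements themselves alone
                    derived = rewrite(t, old, new)
                    if derived not in closed:
                        closed.add(derived)
                        changed = True
    return closed

g = {
    (":superman", SAME_AS, ":clark"),
    (":lois", ":knowsThat", (":superman", ":can", ":fly")),
}
target = (":lois", ":knowsThat", (":clark", ":can", ":fly"))
declaration = ((":superman", ":can", ":fly"),
               "rdfx:semantics", "rdfx:TripleTermRefOpaque")

before = transparent_closure(g)
after = transparent_closure(g | {declaration})
```

Here `target` is in both `before` and `after`; the only effect of the declaration is that it, too, gets rewritten.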

hartig commented 3 years ago

@rat10 it seems you deleted your comment that you posted last night (11:46pm CEST). Why? Let me respond to it nonetheless:

The first thing is to be able to declare that some embedded triple is meant to be referentially opaque or transparent. Every mechanism that isn't purely implemented in out-of-bands means like application logic, verbal communication etc depends on such an explicit declaration.

Right. Such an "explicit declaration" is exactly what I am after with my previous question (How do you propose to define the semantics of these URIs [...]?)

I can then instruct my reasoner to treat IRIs in embedded triples as referentially transparent, i.e. apply all the entailments on them that it applies to all other IRIs. I don't understand why you don't see how this can be done. I haven't yet implemented a reasoner myself but after all this fuss about preventing certain entailments I fail to see how it can be difficult to allow them.

Now you seem to consider again the direction that I was not interested in getting an answer to (the direction where we have an opacity-by-default semantics and want to allow users to say that some of their embedded triples can be interpreted with referential transparency instead). I am interested in getting an answer from you about the direction where we have a transparency-by-default semantics and want to allow users to say that some of their embedded triples are meant to be treated as referentially opaque.

Let's try with a simple example: Consider a triple (:s, :p, :o) and assume your reasoner uses an entailment regime in which, based on that triple, the following triple can be inferred: (:s, :p, :o2). Now, assume you have a (nested) triple t1 in which the triple (:s, :p, :o) is an embedded triple; e.g., t1 could be the following triple:

((:s, :p, :o), :mp, :mo)

Given this nested triple, by applying a transparency-by-default semantics, your reasoner infers and emits the following triple t2:

((:s, :p, :o2), :mp, :mo)

Next, the following triple t3 arrives:

((:s, :p, :o), rdfx:semantics, rdfx:TripleTermRefOpaque)

What does your reasoner do now? Based on the special semantics that you seem to have in mind for the vocabulary used in triple t3 (and that I am asking you to define), the triple t2 is not a valid inference anymore. Hence, it was a mistake of your reasoner to emit t2 earlier.

So, here's my question again: assuming a transparency-by-default semantics for RDF-star, how do you propose to define the semantics of your proposed vocabulary (URIs rdfx:semantics and rdfx:TripleTermRefOpaque) such that your reasoner would not make such mistakes?
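The t1/t2/t3 scenario can be played through with a small append-only reasoner in Python. The class `StreamingReasoner` and its `entailed_variants` table are hypothetical stand-ins for "an entailment regime in which (:s, :p, :o2) can be inferred from (:s, :p, :o)"; the sketch only shows the order-of-arrival problem, under those assumptions.

```python
class StreamingReasoner:
    """Append-only toy reasoner: whatever it emits stays emitted."""

    def __init__(self, entailed_variants):
        # Hypothetical stand-in for an entailment regime: maps a triple
        # to the triples that the regime would infer from it.
        self.entailed_variants = entailed_variants
        self.emitted = []

    def receive(self, triple):
        s, p, o = triple
        # Transparency by default: if the subject is an embedded triple,
        # every variant entailed from it may take its place.
        if isinstance(s, tuple):
            for variant in self.entailed_variants.get(s, ()):
                self.emitted.append((variant, p, o))

inner = (":s", ":p", ":o")
inner2 = (":s", ":p", ":o2")
t1 = (inner, ":mp", ":mo")
t2 = (inner2, ":mp", ":mo")
t3 = (inner, "rdfx:semantics", "rdfx:TripleTermRefOpaque")

reasoner = StreamingReasoner({inner: [inner2]})
reasoner.receive(t1)   # t2 is emitted here
reasoner.receive(t3)   # arrives too late to take t2 back
```

In this sketch t3 does not retract anything; it is itself rewritten into a second inferred triple, since the reasoner has no way to treat the new vocabulary differently from any other property.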

rat10 commented 3 years ago

@rat10 it seems you deleted your comment that you posted last night (11:46pm CEST). Why?

Because while subsequently working on an answer to @pchampin's post I realized my mistake and as this WYSIWYG editor here doesn't provide a button to strikethrough text I just deleted the whole post. No need to bring it back and answer nonetheless. And now I'm thinking...

rat10 commented 3 years ago

@pchampin Thanks for the explanation! It took me a while to realize my mistake... Now to your other comment (https://github.com/w3c/rdf-star/issues/170#issuecomment-842112559) above, before that explanation: I guess you are right that a triple-based approach doesn't make much sense. I had thought that the statement level would be sort of a common ground where both default referentiality semantics meet in the middle, but it does indeed seem to not work out. However a properties-based approach doesn't make much sense either. I don't want to come into a situation where I have to assert per property that embedded triples are considered transparent. Again, consider how property graphs can be understood as a way to structure complex information objects in primary relations and secondary attributes: any property can become part of a secondary attribution, not just the usual provenance related suspects. The same for n-ary relations. Therefore I want to be able to put a safe distance between my data and the proposed semantics at least per graph, if not even per dataset/application/(for life ;-)/etc.

I was for a brief period of time hoping that turning the table and defining referential opacity per property would be a viable solution as most if not all of your use cases are related to very specific and provenance related tasks. Only while drafting a comment with such a proposal I eventually realized how that can't work (and hence deleted my response to @hartig).

So I agree that to implement the proposed semantics the default per specification has to be referential opacity. But if we provide a way to declare ("switch on") referential transparency per graph or dataset or application we would at least put both semantics on relatively equal footing, thereby allowing people to better experience and compare in practice how useful, mindblowing or catastrophic referential opacity on embedded triples is. We could still advise to turn that switch on by default ;-)

However there's a few things to consider:

Therefore, in the end I fear I'm still not in favor of this semantics and would rather implement it via literals, but that's not the topic of this issue. To advance this issue let me modify the proposed vocabulary like this:

rdfx:semantics
rdfx:tripleTermsTransparent
rdfx:tripleTermsOpaque

The latter is not strictly necessary but could be useful in practice to discriminate sloppy authoring from a conscious choice of default opacity. To be used on some graph like this (assuming that <> refers to the graph in which it is used, not to the dataset that contains that graph, or otherwise to the local document or otherwise to some enclosing self-contained snippet of RDF):

<> rdfx:semantics rdfx:tripleTermsTransparent .

And what if this declaration is made in the default graph of a dataset? Is it then also valid for all named graphs in the dataset?

TallTed commented 3 years ago

(@rat10 -- For future reference, while the WYSIWYG editor doesn't include strikethrough, you can make it happen, with <strike> tags, a la, <strike>struck through</strike> which produces struck through.)

rat10 commented 3 years ago

Regardless of the default semantics of embedded triples, I don't think that overriding that semantics on a per-triple basis is the way to go... What would be the use case for stating that :superman :can :fly is transparent (hence equivalent to :clark :can :fly) while :superman :can :seeThroughWalls is opaque (pun not intended)?

Why not make the nodes :superman and :clark referentially opaque? Nothing else would suffer from co-denotation etc.

A much better way is, IMHO, to do that on a per-property basis. Some properties (e.g. "strict" provenance) are about the opaque triple, while other properties (e.g. relation qualifiers) are about transparent triples / statements.

This could be achieved by creating subclasses of rdf:Property, e.g.:

:superman owl:sameAs :clark.
:clark :said << :clark :likes :lois >>.
# does NOT entail :clark :said << :superman :likes :lois >>

:since a x:SubjectTransparentProperty.
:clark :likes :lois {| :since "1938-06" |}.
# DOES entail << :superman :likes :lois >> :since "1938-06"

(that is, assuming 'referential opacity' as the default)

What if that property :since is used in another annotation where referential opacity is desired? You'd have to define a new, referentially transparent sub-property of :since - like :since-transparently - to avoid that problem. So you might have to define a lot of explicitly referentially transparent variants of properties that in all other circumstances "are" referentially transparent anyways. This is madness.

What about the following example:

:alice :bought :car .
<< :alice :bought :car>>
    :color :red ; 
    :type :laundolet ;
    :said :bob ; 
    :on :monday .

I see no reason why this embedded triple should be referentially opaque. But which property would you use to declare the embedded triple as referentially transparent? One of them would be enough, right? If that one property would make the embedded triple referentially transparent, why not define one explicitly and leave those poor established properties alone, like:

:alice :bought :car .
<< :alice :bought :car>>
    rdfx:semantics rdfx:tripleTermsTransparent ;
    :color :red ; 
    :type :laundolet ;
    :said :bob ; 
    :on :monday .

hartig commented 3 years ago

@rat10

What about the following example:

:alice :bought :car .
<< :alice :bought :car>>
    :color :red ; 
    :type :laundolet ;
    :said :bob ; 
    :on :monday .

I see no reason why this embedded triple should be referentially opaque. But which property would you use to declare the embedded triple as referentially transparent? One of them would be enough, right?

No. I think the intention of treating an embedded triple as referentially transparent is part of the meaning of such a property, independent of other triples in which the embedded triple is mentioned with other properties. For instance, in your example, if you consider :on to be such a property and you have that the IRI :alice denotes the same thing as :aliceSmith, then you may infer the triple

<< :aliceSmith :bought :car >> :on :monday

from the triple

<< :alice :bought :car >> :on :monday

However, this interpretation of the property :on should not automatically lead to inferences from other triples in which the embedded triple << :alice :bought :car >> is mentioned with another property. For instance, consider the following triple.

<< :alice :bought :car >> myvocab:foundIn <http://bob.name/news.ttl>

In this triple, I am using the property myvocab:foundIn which I intend to have the meaning that the embedded triples it is used for are referentially opaque. In other words, I intend to use it to say which exact triple I have found in which file. At the same time, I may also want to use the :on property with its aforementioned meaning, which can give me the aforementioned inference, but I do not want the same kind of inference to happen for the myvocab:foundIn triple.

Coming to your other point:

What if that property :since is used in another annotation where referential opacity is desired? You'd have to define a new, referentially transparent sub-property of :since - like :since-transparently - to avoid that problem.

Your observation is right. However, I don't see an issue here. If there are two properties that have two different meanings, then they better be denoted by two different IRIs. For instance, in your data, you might also want to use a "foundIn" property, similar to my example above, but you may want this property to allow for referentially transparent inferences for the embedded triples for which you use this property. So, you actually want your "foundIn" property to have a different meaning than my myvocab:foundIn property. Therefore, it would make sense, I think, to mint another IRI for the property with the meaning that you want. In my opinion, having two such properties in this case brings semantic clarity (rather than madness).

So you might have to define a lot of explicitly referentially transparent variants of properties that in all other circumstances "are" referentially transparent anyways.

You make it sound as if almost every property defined in existing RDF vocabularies can meaningfully be used in triples that contain embedded triples. I don't think that is the case. Certainly, there are some vocabularies with some of their properties for which this is possible but I don't think this holds for the majority of existing properties in existing vocabularies.

One more example to illustrate this last point further: Your example data contains the following triple.

<< :alice :bought :car>> :color :red

What is the meaning of the :color property? Let me consider two cases.

Case 1. The property has been introduced to be used in triples in which we want to say what the color of something is. In this case, using the property in your triple then means either (under a referential opacity interpretation) that the embedded triple << :alice :bought :car>> is red or (under a referential transparency interpretation) that the statement represented by the embedded triple is red. I assume that neither of these two interpretations is what you intended. So, the property should not have been used, but a different property would have been needed, say :colorOfObjectInEmbeddedTriple. Note that this need for such a different property is independent of whether that property is meant to treat the embedded triple referentially opaque or referentially transparent.

Case 2. The property has indeed been defined explicitly for embedded triples and is meant to be used when we want to say what the color of the thing in the object of an embedded triple is (i.e., the object :car in your example triple). In this case, it is inherent in the meaning of this property that it considers the embedded triple for which it is used as referentially transparent. So, it can be defined and used in this way, but then it is a different property than a property that is meant to be used to directly say what the color of something is (i.e., as per case 1).

rat10 commented 3 years ago

@hartig An even longer reply to your already long reply I fear but you are making some interesting points.

@rat10

What about the following example:

:alice :bought :car .
<< :alice :bought :car>>
    :color :red ; 
    :type :laundolet ;
    :said :bob ; 
    :on :monday .

I see no reason why this embedded triple should be referentially opaque. But which property would you use to declare the embedded triple as referentially transparent? One of them would be enough, right?

No. I think the intention of treating an embedded triple as referentially transparent is part of the meaning of such a property, independent of other triples in which the embedded triple is mentioned with other properties.

Jumping in right here as I think we misunderstood each other. I’m not trying to declare the embedded triple (as a type) but its occurrence as subject of this annotation block as referentially transparent. I understand the annotation rdfx:semantics rdfx:tripleTermsTransparent as referring to this occurrence of the embedded triple << :alice :bought :car>>. Well, okay, I see now that this is problematic. OTOH - and that was of course my point all along - how can one not understand :on :monday or even :color :red as referring to a particular occurrence? After all it is common sense that no one buys a car every Monday, and cars are not all red. I’m also inclined to excuse myself and generalize from this mistake to a general rule: one only gets into such troubles if the semantics tries to formalize something that is not intuitive and doesn’t reflect common usage. The initial concept of RDF-star, the seminal example, was intuitive and is what everybody expects RDF-star to provide a solution to. The problems come with the re-definition of RDF-star as a crippled version of N3. Maybe a way out of this problem would be to allow declaring referential transparency not on embedded triples but only on occurrences of embedded triples (see https://github.com/w3c/rdf-star/issues/169).

For instance, in your example, if you consider :on to be such a property and you have that the IRI :alice denotes the same thing as :aliceSmith, then you may infer the triple

<< :aliceSmith :bought :car >> :on :monday

from the triple

<< :alice :bought :car >> :on :monday

However, this interpretation of the property :on should not automatically lead to inferences from other triples in which the embedded triple << :alice :bought :car >> is mentioned with another property.

Okay, so you save other occurrences of the same embedded triple from co-reference but instead you allow referential transparency to creep into all other embedded triples that the property is used to annotate. That's a bad deal, if you ask me. Your proposal to define referential transparency per annotation property is much more prone to the problem that you try to avoid: transparent semantics creeping into other annotated triples that you have no intention to become transparent. Properties get re-used all the time and once you define one as making annotated embedded triples referentially transparent, that re-defines the semantics of all embedded triples so annotated in the past, present and future of your graph or dataset. So IMO you’re bitten much harder than if only the embedded triple - here understood as a type, not an occurrence - was defined as transparent. But I understand how you came to this proposal: it is probably (I haven't thought it through properly, therefore "probably") the only option you have if embedded triples are not occurrences. The main problem is again the fixation of the proposed semantics on triple types while annotations most naturally are concerned with occurrences.

As a thought experiment we could model annotated triples with blank nodes instead of embedded triples, as it would be natural in basic RDF, like so:

:alice :bought [
    rdf:value :car ;
    :on :monday ;
    … etc ]

The blank node ensures that we are talking about a specific incident, an occurrence - although the blank node as an existential quantifier actually doesn’t say how many such incidents did or will occur. This elegantly hides the problem of triples and occurrences. The intuitive reading however IMHO will be that the triples describe one occurrence of a car-buying event. Okay, I could come up with another example, that evokes a more generalistic attitude:

:war a [ 
    rdf:value :badThing ;
    :for :everybody ;
    :on :anytime ]

So there is indeed some ambiguity hidden in the basic-RDF-ish, blank node based construct that I wasn't aware of... Still, or even more so, it is dangerous to fix this ambiguity in one way or another!

Now that might be water on the mill of referential opacity as a "prudent approach", but OTOH it also strengthens the position that RDF-star is a fork of RDF, changing it in very fundamental ways. An important argument in this context is how easy it is to go from referential opacity to transparency. So far I see it as rather difficult if it's not declared in a rather sweeping way, per graph or per dataset.

I do uphold my intuition that annotations, as they add detail and specificity to a more general 'anchor' statement, lend themselves much more easily to an understanding as occurrences than as triples that mean the same anytime, everywhere. The triple to which the annotations are attached is itself not an occurrence. The whole construct of triple, embedded triple and annotations is strictly speaking still not an occurrence, but in a much more specific way than the original triple. The annotated triple in the context of that annotation construct however IMO is clearly an occurrence, as it is only fully interpretable, its meaning can only be fully appreciated as part of that more complex statement block.

It is of course also a question of which perspective you take: do you see any triple as bare data or do you differentiate between data and construction data (not meta data, but data that builds more complex constructs from and within the simplistic triple-ish graph). Bare triples just describe, constructs as a whole describe, but triples that make up constructs occur (in those constructs). That is probably the price one has to pay for extending the basic triple formalism. Well, we are at the boundaries of that "painfully simplistic" formalism - we have to be careful not to break anything nor to fall off the cliff.

For instance, consider the following triple.

<< :alice :bought :car >> myvocab:foundIn <http://bob.name/news.ttl>

In this triple, I am using the property myvocab:foundIn which I intend to have the meaning that the embedded triples it is used for are referentially opaque. In other words, I intend to use it to say which exact triple I have found in which file. At the same time, I may also want to use the :on property with its aforementioned meaning, which can give me the aforementioned inference, but I do not want the same kind of inference to happen for the myvocab:foundIn triple.

So IIUC given the following snippet

:alice :bought :car  ;
    owl:sameAs :aliceSmith .
<< :alice :bought :car >> 
    myvocab:foundIn <http://bob.name/news.ttl> ;
    :on :monday .
:on rdfx:semantics rdfx:tripleTermsTransparent .

you would like to be able to infer the following (and nothing more!):

:aliceSmith :bought :car .
<< :aliceSmith :bought :car >> 
    :on :monday .

However now every embedded triple annotated with :on is fair game to further entailments. I fail to see how this approach provides much advantage over an occurrence based approach. It might indeed be the only way to define referential transparency of embedded triple types on a more granular level than whole graphs or datasets.

Coming to your other point:

What if that property :since is used in another annotation where referential opacity is desired? You'd have to define a new, referentially transparent sub-property of :since - like :since-transparently - to avoid that problem.

Your observation is right. However, I don't see an issue here. If there are two properties that have two different meanings, then they better be denoted by two different IRIs. For instance, in your data, you might also want to use a "foundIn" property, similar to my example above, but you may want this property to allow for referentially transparent inferences for the embedded triples for which you use this property. So, you actually want your "foundIn" property to have a different meaning than my myvocab:foundIn property. Therefore, it would make sense, I think, to mint another IRI for the property with the meaning that you want. In my opinion, having two such properties in this case brings semantic clarity (rather than madness).

So you might have to define a lot of explicitly referentially transparent variants of properties that in all other circumstances "are" referentially transparent anyways.

You make it sound as if almost every property defined in existing RDF vocabularies can meaningfully be used in triples that contain embedded triples. I don't think that is the case.

Indeed, and I’ve been trying for months now to get it into the heads of the proponents of the proposed semantics that this is indeed the case.

Certainly, there are some vocabularies with some of their properties for which this is possible but I don't think this holds for the majority of existing properties in existing vocabularies.

Rather the opposite: not many properties (and use cases) suggest referential opacity. Explainable AI: certainly yes. But apart from that not even provenance related vocabularies do per se require or even suggest referential opacity. In my example above the :on and :said annotations would work perfectly fine when :alice is being replaced by :aliceSmith. And my application would - given how everything else on the semantic web works - expect it, assume it, welcome it and even take it for granted!

Another example that I've provided repeatedly: property graph style modelling has become an important use case for RDF-star, but property graphs represented in RDF are often just a kind of n-ary relation, a primary relation with secondary attributes. This has nothing to do with provenance anymore, any property can be the secondary attribute to some primary relation. That is just a matter of perspective, of what seems the most important aspect of some complex subject matter, of modelling decisions. It’s the difference between Alice taking the train to Berlin on Monday or Alice travelling to Berlin on Monday by train or A train departing to Berlin on Monday with Alice on board etc. So, yes, indeed: almost every property defined in existing RDF vocabularies can meaningfully be used in triples that contain embedded triples.

One more example to illustrate this last point further: Your example data contains the following triple.

<< :alice :bought :car>> :color :red

What is the meaning of the :color property? Let me consider two cases.

Case 1. The property has been introduced to be used in triples in which we want to say what the color of something is. In this case, using the property in your triple then means either (under a referential opacity interpretation) that the embedded triple << :alice :bought :car>> is red or (under a referential transparency interpretation) that the statement represented by the embedded triple is red. I assume that none of these two interpretations is what you intended. So, the property should not have been used, but a different property would have been needed, say :colorOfObjectInEmbeddedTriple. Note that this need for such a different property is independent of whether that property is meant to treat the embedded triple referentially opaque or referentially transparent.

Take care, your rear wheel is overtaking your car ;-) Per the spec that we are working on here the embedded triple doesn’t refer to itself but to the syntactic representation of the triple that it describes, bit by bit, inside the pairs of double pointy brackets. Of course, we might explore the question if those bits are red or if RDF-star can only be used with certain font-colors and sizes.

Case 2. The property has indeed been defined explicitly for embedded triples and is meant to be used when we want to say what the color of the thing in the object of an embedded triple is (i.e., the object :car in your example triple). In this case, it is inherent in the meaning of this property that it considers the embedded triple for which it is used as referentially transparent. So, it can be defined and used in this way, but then it is a different property than a property that is meant to be used to directly say what the color of something is (i.e., as per case 1).

To help you out with further nitpicking, you could argue that it’s unclear if the :color refers to Alice or the car or the property. This does indeed showcase a shortcoming of the RDF-star approach compared to property graphs. This was recently brought up in another issue (but I can’t remember which) as the question if RDF-star can differentiate relations between objects from attributes of objects. A variant of the primary/secondary distinction. But I digress... Would you also propose to introduce :Alice-color or :Subject-color properties? Or have you just made the point that embedded triples can’t be used with properties that haven’t been explicitly defined for use with embedded triples? How’s that for an elegant alternative to the syntactic verbosity of RDF standard reification?

TallTed commented 3 years ago

To help you out with further nitpicking, you could argue that it’s unclear if the :color refers to Alice or the car or the property. This does indeed showcase a shortcoming of the RDF-star approach compared to property graphs.

On the contrary, this showcases the, well, badly constructed example you put forth, quite apart from any possible shortcoming in RDF-star vis a vis property graphs.

RDF-star doesn't cause the problem here; you put a :color (and :type) attribute where it simply doesn't belong, where it is nonsensical, and this results, as might be expected, in nonsensical inferences. GIGO, after all!

Serious consideration of any arguments here demands rigorous construction of the examples upon which the arguments are based. It is trivial to say, "This flawed data leads to flawed conclusions!" Well constructed data must be used for our discussions, else all conclusions are inherently flawed, and we might as well throw in the towel now, wasting no further time on serious consideration of nonsense.

rat10 commented 3 years ago

@TallTed answered here

rat10 commented 3 years ago

@TallTed And don't forget that it's the proposed semantics that makes RDF-star so brittle. It is still advertized as the go-to solution for all meta modelling needs: reification, property graphs, n-ary relations, you name it. Yet, as Olaf suggests above - and you seem to endorse that - it needs very careful handling of properties, even minting properties specifically for use with embedded triples. That's a contradiction, wouldn't you agree? It would be much easier to name the few properties for which referential opacity makes sense but, as we discussed already, that approach can't work because entailments once made can't be taken back later. So it's a problem with the proposed semantics and I don't know any easy solution. Don't blame the messenger...

TallTed commented 3 years ago

Well, yes, anyone can say anything about anything. Truth, lies, exaggerations, minutely exact descriptions... All are possible.

So, there needs to be a way to say "these statements were asserted by asserter, on date, at time" without inherently asserting those statements -- which you may know to be false. Named graphs are one way to do this, but people don't like single-triple-graphs nor "standard" reification, so there is desire for a "sugar" or "shortcut" by which to identify such a single triple, and say things about it ("<< :a a :b >> :truthiness false").

"Entailments once made can't be taken back later" is an argument against materialization of inferred data, or at least, against materialization in any way that doesn't allow for ignoring or deleting erroneous or otherwise problematic entailments/inferences later. I think it applies roughly equally to both RDF and RDF-star.

So -- you've said that << :alice :bought :car>> :color :red. You've not said :car :color :red. You've not said :alice :color :red. You've said that << :alice :bought :car>> has a :color which is :red.

Now I need to ignore your statement, or at least describe it as nonsense. I guess I have to do something like << << :alice :bought :car>> :color :red >> :veracity "0"^^example:percent ... which is OK, I can do that.

I don't buy into your assertion that we must mint properties specifically for use with embedded triples. I think we need to be able to discern between I assert that "Jane said 'Moon madeOf greenCheese'." and I assert that "Jane said 'Moon madeOf greenCheese', which I also say." --- which can be addressed by use of << >> and {| |}, among other possibilities.

pchampin commented 3 years ago

@rat10 just one little thing:

Per the spec that we are working on here the embedded triple doesn’t refer to itself but to the syntactic representation of the triple that it describes, bit by bit,

No, not at all. That's simply not what the spec is saying.

hartig commented 3 years ago

@rat10

[...] but, as we discussed already, that approach can't work because entailments once made can't be taken back later. So it's a problem with the proposed semantics and I don't know any easy solution. Don't blame the messenger...

If "the proposed semantics" is meant to refer to an opacity-by-default semantics for RDF-star, then you are wrong. It is the idea of a transparency-by-default semantics that would cause entailments that cannot be taken back when trying to selectively achieve referential opacity for selected embedded triples. I thought you had understood that already ("So I agree that [...] the default per specification has to be referential opacity").

rat10 commented 3 years ago

@rat10 just one little thing:

Per the spec that we are working on here the embedded triple doesn’t refer to itself but to the syntactic representation of the triple that it describes, bit by bit,

No, not at all. That's simply not what the spec is saying.

@pchampin Please, enlighten me!

rat10 commented 3 years ago

@hartig

@rat10

[...] but, as we discussed already, that approach can't work because entailments once made can't be taken back later. So it's a problem with the proposed semantics and I don't know any easy solution. Don't blame the messenger...

If "the proposed semantics" is meant to refer to an opacity-by-default semantics for RDF-star, then you are wrong. It is the idea of a transparency-by-default semantics that would cause entailments that cannot be taken back when trying to selectively achieve referential opacity for selected embedded triples. I thought you had understood that already ("So I agree that [...] the default per specification has to be referential opacity").

Yes, I understood that already. The sentence you cite starts with "It would be much easier to name the few properties for which referential opacity makes sense ...". What I was trying to express is that I tried to make referential transparency the default and only define a few properties as referentially opaque, following your proposal but under reversed pre-conditions. However, as you explained, that can't work. It is a pity, though, as that would have been a rather clean solution, given that there are not so many properties used in provenance and explainable AI use cases.

rat10 commented 3 years ago

Well, yes, anyone can say anything about anything. Truth, lies, exaggerations, minutely exact descriptions... All are possible.

So, there needs to be a way to say "these statements were asserted by asserter, on date, at time" without inherently asserting those statements -- which you may know to be false. Named graphs are one way to do this, but people don't like single-triple-graphs nor "standard" reification, so there is desire for a "sugar" or "shortcut" by which to identify such a single triple, and say things about it ("<< :a a :b >> :truthiness false").

Unasserted assertions are an orthogonal issue. For example in post-WW2-Germany there was an infamous saying that

<< :Hitler :wasn't :allBad >> :because :heBuiltTheAutobahn

Now I sure as hell wouldn't want to have the statement that :Hitler :wasn't :allBad in my triple store but I wouldn't mind at all if :Hitler was replaced by his WikiData identifier. The ability to make unasserted statements is valuable but there's no definitive reason why they have to be referentially opaque: in some cases it will be welcome, in some cases it won't hurt, in some cases it might hurt.

"Entailments once made can't be taken back later" is an argument against materialization of inferred data, or at least, against materialization in any way that doesn't allow for ignoring or deleting erroneous or otherwise problematic entailments/inferences later. I think it applies roughly equally to both RDF and RDF-star.

I was referring to the argument that in the proposed semantics referential opacity has to be the default, the starting point, for the envisioned use cases - Superman problem, explainable AI - to work. If you start from referential transparency and start your reasoner, all possible entailments will be derived. You can't then at some point say: "Oh, no, not that one, that property is opaque. Please retract!" because RDF is strictly monotonic. So you have to start from an opaque semantics to implement the use cases that the proposed semantics envisions. (Pierre-Antoine or Olaf would surely be better able to explain this, and my attempt at explaining this doesn't mean that I endorse this approach).
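The monotonicity argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the RDF-star semantics: triples are plain Python tuples, an embedded triple is a tuple nested in subject position, and co-reference is given as hypothetical `(old, new)` pairs applied in a single pass. The helper names `substitute` and `closure`, and the terms `:bob` and `:aliceSmith`, are invented for illustration.

```python
# Toy model (illustrative only): a triple is a 3-tuple of strings;
# an embedded triple is a 3-tuple nested inside another triple.

def substitute(term, old, new, into_embedded):
    """Replace `old` by `new`; descend into embedded triples only if asked."""
    if isinstance(term, tuple):
        if not into_embedded:
            return term  # opaque: leave the embedded triple untouched
        return tuple(substitute(t, old, new, into_embedded) for t in term)
    return new if term == old else term


def closure(graph, same_as, transparent):
    """One substitution pass that only ever ADDS triples (monotonic)."""
    result = set(graph)
    for old, new in same_as:
        for s, p, o in graph:
            result.add((
                substitute(s, old, new, transparent),
                substitute(p, old, new, transparent),
                substitute(o, old, new, transparent),
            ))
    return result


graph = {
    (":alice", ":bought", ":car"),
    ((":alice", ":bought", ":car"), ":said", ":bob"),
}
same_as = [(":alice", ":aliceSmith")]  # hypothetical co-reference

opaque = closure(graph, same_as, transparent=False)
transparent = closure(graph, same_as, transparent=True)

# Everything derived under opacity is also derived under transparency,
# so going from opaque to transparent only adds triples.
extra = transparent - opaque
```

In this sketch `extra` holds exactly the triple whose embedded triple had `:alice` replaced. Moving from the opaque result to the transparent one adds triples, whereas moving in the other direction would require removing one, which a monotonic entailment regime cannot do.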

So -- you've said that << :alice :bought :car>> :color :red. You've not said :car :color :red. You've not said :alice :color :red. You've said that << :alice :bought :car>> has a :color which is :red.

Now I need to ignore your statement, or at least describe it as nonsense. I guess I have to do something like << << :alice :bought :car>> :color :red >> :veracity "0"^^example:percent ... which is OK, I can do that.

I'm in general a bit reluctant to craft my examples too thoroughly as that can hide assumptions that I'm not aware of myself. But given your irritation with the << :alice :bought :car>> :color :red statement, let's replace that with

<< :alice :bought :car>> :paymentMethod :CreditCard .

Is that an acceptable example to you? It is clearly annotating the property, not a node. Maybe you want something even more related to the buying activity as a whole:

<< :alice :bought :car>> :reason :FearOfMissingOut .

I hope that helps to make you understand my point. Both annotations are decidedly not concerned with explainability of inferences, provenance etc. They wouldn't suffer from referential transparency in any way but they might suffer from referential opacity because the connection between the assertion and the annotation might get lost when entailments are applied to the asserted triple, e.g. replacing :alice with :AliceSmith, and the embedded triple doesn't follow suit. Another example is one that I gave a few weeks ago in an email to the list. It was:

<< :proposedSemantics :are :nonsense >> :said :thomas

It makes a difference if :nonsense here is referentially transparent or opaque. If I want to scandalize Thomas' wording I need referential opacity. If I want to focus on the frustration he expresses - possible reasons, ways forward etc - I might want to use :nonsense in a referentially transparent way. Both are legitimate use cases.

I don't buy into your assertion that we must mint properties specifically for use with embedded triples.

It's what Olaf says. His way of solving this problem is to mint new properties. He just believes that won't be needed too often. Check his comment above. I got convinced that it's not possible to declare referential opacity per property against the background of default referential transparency. Referential transparency can be declared for the whole graph or dataset but that's not the sort of fine grained control that one would aim for (and that you called for, understandably). So far I haven't become aware of a pretty solution. If I overlooked something please feel free...

I think we need to be able to discern between I assert that "Jane said 'Moon madeOf greenCheese'." and I assert that "Jane said 'Moon madeOf greenCheese', which I also say." --- which can be addressed by use of << >> and {| |}, among other possibilities.

As I tried to explain above this is an orthogonal issue. And the shorthand syntax {| |} doesn't give you that ability, only the << >> syntax does.

lisp commented 3 years ago

If one posits "referential opacity", is working with a dataset which includes

:proposedSemantics :are :nonsense .
<< :proposedSemantics :are :nonsense >> :said :thomas .

arranges to bind respective variables to the subject term of the first triple and the subject term of the embedded triple, and then applies sameTerm to the two variables, does it return true?

TallTed commented 3 years ago

@lisp -- Could you rephrase? Not least, could you include the ASK query about which you're asking whether it returns true? Proper capitalization and punctuation of your sentence(s?) would help a lot in parsing, too.

TallTed commented 3 years ago

I think we need to be able to discern between I assert that "Jane said 'Moon madeOf greenCheese'." and I assert that "Jane said 'Moon madeOf greenCheese', which I also say." --- which can be addressed by use of << >> and {| |}, among other possibilities.

As I tried to explain above this is an orthogonal issue. And the shorthand syntax {| |} doesn't give you that ability, only the << >> syntax does.

How orthogonal?

With the << >> syntax, I can assert --

<< :Moon :madeOf :cheeseGreen >> :assertedBy :Jane

-- in which I am not asserting :Moon :madeOf :cheeseGreen, only that Jane asserted such. The << >> notation gives me referential opacity.

With the {| |} syntax, I can say --

:Moon :madeOf :cheeseGreen {| :assertedBy :Jane |}

-- in which I simultaneously assert that :Moon :madeOf :cheeseGreen and that Jane asserted the same. The {| |} notation gives me referential invisibility.

(Edited to fix brain scrambled example after long day.)
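The difference between the two notations in the comment above can be sketched as follows. This is a minimal illustration, assuming the usual reading that the annotation syntax `{| |}` is shorthand for asserting the triple and additionally annotating its embedded form; the function names `quoted_only` and `annotated` are hypothetical, and triples are modelled as plain tuples.

```python
def quoted_only(triple, annotations):
    """<< s p o >> a b : states the annotations without asserting s p o."""
    return {(triple, a, b) for a, b in annotations}


def annotated(triple, annotations):
    """s p o {| a b |} : asserts s p o and also states the annotations."""
    return {triple} | quoted_only(triple, annotations)


moon = (":Moon", ":madeOf", ":cheeseGreen")
notes = [(":assertedBy", ":Jane")]

g_quoted = quoted_only(moon, notes)
g_annotated = annotated(moon, notes)
```

Here `moon` is a member of `g_annotated` but not of `g_quoted`. The sketch only covers whether the triple is asserted; it says nothing about whether the embedded triple is interpreted opaquely or transparently, which is the separate question debated in this thread.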

hartig commented 2 years ago

PR #209 provides a proposal how selective referential transparency can be supported on top of the RDF-star semantics (by which referential opacity is the default).

pchampin commented 2 years ago

I propose the following course of action:

pchampin commented 2 years ago

This was discussed during today's meeting: https://w3c.github.io/rdf-star/Minutes/2021-10-15.html#r01