w3c / N3

W3C's Notation 3 (N3) Community Group

Design Patterns in N3 #42

Open william-vw opened 4 years ago

william-vw commented 4 years ago

I added a section on "design patterns" (tentative name) to the spec - the goal here is to present "patterns" to solve common problems in N3. I think it could be a neat way to help developers leverage N3 to the fullest for solving their problems.

As I was reading the original RDF primer I found some material on representing n-ary relations, so I added a part on that topic (also based on the venerable WG's work). The part on using collections for this purpose is currently underdeveloped (mostly because I ran out of time, but also because I haven't really represented them in that way). Also note a prior post on the topic of representing n-ary relations.

Anyone, please feel free to expand this section whenever you have a useful idea. @josd feel free to explain the use of collections to represent n-ary relations :-)

domel commented 4 years ago

Before adding, please give your opinion. Design pattern no 2 that I would like to add is an RDF reification in cited graphs. Standard reification:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x a rdf:Statement ;
    rdf:subject :a ;
    rdf:predicate :b ;
    rdf:object :c ;
    :certainty 0.5 .

Cited graphs:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x :graph { :a :b :c } .
_:x :certainty 0.5 .

josd commented 4 years ago

An example of n-ary relations using n-1-tuples is the :moves predicate, which is used to solve the "Towers of Hanoi" puzzle:

@prefix math: <http://www.w3.org/2000/10/swap/math#>.
@prefix list: <http://www.w3.org/2000/10/swap/list#>.
@prefix e: <http://eulersharp.sourceforge.net/2003/03swap/log-rules#>.
@prefix : <http://josd.github.io/eye/reasoning#>.

# ?M is the sequence of moves to move ?N disks from ?X to ?Y using ?Z as intermediary
{(?N ?X ?Y ?Z) :moves ?M} <=
{   ?N math:greaterThan 1.
    (?N 1) math:difference ?N1.
    (?N1 ?X ?Z ?Y) :moves ?M1.
    (?N1 ?Z ?Y ?X) :moves ?M2.
    (?M1 ((?X ?Y)) ?M2) list:append ?M.
}.
{(1 ?X ?Y ?Z) :moves ((?X ?Y))} <= true.

The complete example is available at https://github.com/josd/eye/tree/master/reasoning/hanoi
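For readers less used to backward rules, the recursion encoded by the :moves rule above can be sketched in Python. This is only an illustration of the pattern, not part of the EYE example: the function name `moves` and the use of Python tuples in place of the (?N ?X ?Y ?Z) collection are assumptions made for this sketch.

```python
def moves(n, x, y, z):
    """Return the moves that transfer n disks from x to y, using z as intermediary."""
    if n == 1:
        # Base case, mirrors: {(1 ?X ?Y ?Z) :moves ((?X ?Y))} <= true.
        return [(x, y)]
    # Recursive case, mirrors the body of the first rule.
    m1 = moves(n - 1, x, z, y)  # (?N1 ?X ?Z ?Y) :moves ?M1
    m2 = moves(n - 1, z, y, x)  # (?N1 ?Z ?Y ?X) :moves ?M2
    return m1 + [(x, y)] + m2   # (?M1 ((?X ?Y)) ?M2) list:append ?M


print(moves(3, "left", "right", "center"))
```

The four positional arguments play the role of the collection in subject position: their meaning depends purely on their order, which is what makes a list a natural fit for this relation.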

It would be very interesting @william-vw to see the same example expressed in your way :-)

domel commented 4 years ago

@josd, it seems to me that the specification should include a more concise example. Using builtins is probably also not a good idea, because we still don't know which ones will be chosen.

josd commented 4 years ago

The simplest example I can imagine is a calculated feature :age using a rule like {(?PERSON ?BIRTHDAY) :age ?AGE} <= ...

william-vw commented 4 years ago

@josd I gave the example a quick try - i.e., using sets of binary relations - but I am getting an error in EYE:

GET file:///d:/git/eye/reasoning/hanoi/hanoi4.n3 ** ERROR ** gre ** error(permission_error(modify,static_procedure,(,)/2),context(system:assertz/1,_2248))

It executes when using forward rules, but does not yield any results, obviously. It works like a charm when using cited formulas with binary relations. I put the files here. Disclaimer: these were just quick attempts and not a robust effort.

But note that this is exactly the point: there will be cases where sets of binary relations are more suitable, and cases where collections are better. This is also outlined by the W3C WG on the topic, and it is why I reached out to add such cases to the spec :-)

josd commented 4 years ago

Thanks for moving this forward and your hanoi2.n3 is a very nice approach!

@prefix math: <http://www.w3.org/2000/10/swap/math#>.
@prefix list: <http://www.w3.org/2000/10/swap/list#>.
@prefix e: <http://eulersharp.sourceforge.net/2003/03swap/log-rules#>.
@prefix : <http://josd.github.io/eye/reasoning#>.

# ?M is the sequence of moves to move ?N disks from ?X to ?Y using ?Z as intermediary

{
    { ?i :num_disks ?N ; :from ?X ; :to ?Y ; :interm ?Z } :moves ?M
} 
<=
{   ?N math:greaterThan 1.
    (?N 1) math:difference ?N1.
    { ?i :num_disks ?N1 ; :from ?X ; :to ?Z ; :interm ?Y } :moves ?M1 .
    { ?i :num_disks ?N1 ; :from ?Z ; :to ?Y ; :interm ?X } :moves ?M2 .
    (?M1 ((?X ?Y)) ?M2) list:append ?M.
}.

{
    { ?i :num_disks 1 ; :from ?X ; :to ?Y ; :interm ?Z } :moves ((?X ?Y))
} 
<=
true.

{
    { ?i :num_disks 3 ; :from :left ; :to :right ; :interm :center }
    :moves ?M
} 
=>
{3 :answer ?M}.

and

eye --nope hanoi2.n3 --pass

indeed replies with the correct answer

3 :answer ((:left :right) (:left :center) (:right :center) (:left :right) (:center :left) (:center :right) (:left :right)).

You clearly made your point :-) So after all, I am now convinced that cited graphs are indeed a viable way to go i.e. they are self explanatory.

The exception that you get with hanoi4.n3 is because you can't have a conjunction of triples in the conclusion part of a backward implication <=

william-vw commented 4 years ago

indeed replies with the correct answer

3 :answer ((:left :right) (:left :center) (:right :center) (:left :right) (:center :left) (:center :right) (:left :right)).

You clearly made your point :-) So after all, I am now convinced that cited graphs are indeed a viable way to go i.e. they are self explanatory.

Thanks @josd - but I think that representing n-ary relations with collections can also be suitable in certain scenarios. Just see the hanoi example, where the "input" to the built-ins (e.g., list:append, math:difference) is given as a collection. IMO it would be awkward to do that with a blank node or cited formula (especially if there are arbitrary numbers of "operands").

From the WG article:

Some n-ary relations do not naturally fall into either of the use cases above, but are more similar to a list or sequence of arguments. ... In cases where all but one participant in a relation do not have a specific role and essentially form an ordered list, it is natural to connect these arguments into a sequence according to some relation, and to relate the one participant to this sequence (or the first element of the sequence).

In that case, the relation can indicate the role of all elements within the collection.

Re:

The exception that you get with hanoi4.n3 is because you can't have a conjunction of triples in the conclusion part of a backward implication <=

Aha, yes, that makes a lot of sense. It's so easy to start thinking about that conclusion as a single triple, due to the shorthand notation.

william-vw commented 4 years ago

Before adding, please give your opinion. Design pattern no 2 that I would like to add is an RDF reification in cited graphs. Standard reification:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x a rdf:Statement ;
    rdf:subject :a ;
    rdf:predicate :b ;
    rdf:object :c ;
    :certainty 0.5 .

Cited graphs:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x :graph { :a :b :c } .
_:x :certainty 0.5 .

Thanks. I think this is certainly useful.

I think some notes could accompany this pattern. Reification is explicitly defined as not being a form of quotation, but rather as asserting the existence of a triple token (or instance), or as explicitly referring to a triple instance (in a way that is external to RDF; e.g., an IRI that relies on internal indices). In either case, it is intended to refer to a concrete realization of a triple, with the given subject, property and object, in some surface syntax. One can then attach metadata (e.g., provenance) to this reference to the triple instance. But note that this concrete triple is not entailed by the reification (nor is the reification entailed by it).

In contrast, cited formulas are defined as an explicit way of quoting, i.e., directly expressing which document or message asserted what, and giving the ability to attach metadata to it. Similarly to reification however, the contents of the formula are not entailed by the quoting - i.e., one merely states that X or Y asserted this, and we're not saying that it is true.

Anyone, please feel free to comment. I've always found reification rather confusing - thinking it's doing things that it in actuality is not.

doerthe commented 4 years ago

I would be careful with the reification example. At least according to our common formalisation,

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x a rdf:Statement ;
    rdf:subject :a ;
    rdf:predicate :b ;
    rdf:object :c ;
    :certainty 0.5 .

is not the same as


@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x :graph { :a :b :c } .
_:x :certainty 0.5 .

Apart from the fact that reification has no semantics at all, we can already spot differences. Let's slightly modify your example and add a blank node. If we say

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x a rdf:Statement ;
    rdf:subject _:y ;
    rdf:predicate :b ;
    rdf:object :c ;
    :certainty 0.5 .

The _:y that is the subject of our reified triple is quantified at the top level. If we have another triple

_:y :p :o.

we know that both occurrences of _:y in the graph refer to the same resource.

If we replace the :a by _:y in your quoted graph, we have another situation. In


@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://example.org/> .
_:x :graph {_:y :b :c } .
_:x :certainty 0.5 .

The blank node is implicitly quantified within the citation; it thus means

∃ _:x.
_:x :graph {∃ _:y. _:y :b :c}.
_:x :certainty 0.5.

This example already shows that the concepts are different. We can either leave out reification or carefully explain the differences.

domel commented 4 years ago

@doerthe Well, I am not saying that the examples are the same, but that they may be the same. As you wrote before, we do not know the semantics of reification. Moreover, we do not know what :graph is. This is one side of the coin. The other side is that there are different ways of annotating RDF triples, e.g., with trust, uncertainty, and temporal metrics. You can use standard RDF reification, RDF*, Singleton Property, Named Graphs, etc. Annotating an RDF triple works differently in each of these approaches, and they may be interpreted differently, but in practice, people use them for exactly the same purposes. In conclusion, I agree to clarify the points you mentioned. Please feel free to edit that section in the spec. Perhaps it is also worth considering introducing something like log:graph to the N3 vocabulary, to enable graph description and annotation.

pchampin commented 4 years ago

but in practice, people use them for exactly the same purposes.

... which is a problem (because, precisely, they don't have the same meaning).

For example, using RDF* or Singleton Properties for asserting the certainty of a triple is plain wrong, because in these proposals, any "quoted" triple is also automatically asserted [1]. So writing

   <<:earth :is :flat >> :certainty 0.0 .

does not make much sense, because it also entails :earth :is :flat.

I think that, for each design pattern, a list of associated anti-patterns should be provided, showing how they fail to capture the intended semantics, and therefore justifying the pattern.

[1] for RDF*, things have become a little more nuanced, but at least what I wrote is true for the original proposal
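The anti-pattern can be made concrete with a small Python toy model. It is only a sketch under strong assumptions: graphs are plain sets of tuples, and the two helpers `annotate_asserting` and `annotate_quoting` (both hypothetical names) stand in for the reading where embedded triples are also asserted and the reading where a quoted triple is merely mentioned. It does not implement RDF* or N3 semantics.

```python
FLAT = ("earth", "is", "flat")


def annotate_asserting(graph, triple, certainty):
    """Toy 'asserting' reading: annotating a triple also adds it to the graph."""
    quoted = ("quoted", triple)
    return graph | {triple, (quoted, "certainty", certainty)}


def annotate_quoting(graph, triple, certainty):
    """Toy 'quoting' reading: the triple is only mentioned, never asserted."""
    quoted = ("quoted", triple)
    return graph | {(quoted, "certainty", certainty)}


asserting = annotate_asserting(frozenset(), FLAT, 0.0)
quoting = annotate_quoting(frozenset(), FLAT, 0.0)

print(FLAT in asserting)  # True: the graph now claims the earth is flat
print(FLAT in quoting)    # False: only the annotation is stated
```

Under the asserting reading, a certainty of 0.0 sits next to an outright assertion of the very triple it is meant to cast doubt on, which is the mismatch described above.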

william-vw commented 4 years ago

@pchampin

For example, using RDF* or Singleton Properties for asserting the certainty of a triple is plain wrong, because in these proposals, any "quoted" triple is also automatically asserted [1]. So writing

   <<:earth :is :flat >> :certainty 0.0 .

does not make much sense, because it also entails :earth :is :flat.

This is certainly not the case for the paper where RDF* was introduced, since it defined the RDF* semantics in the form of a transformation to RDF using its standard reification vocabulary. And, in general, RDF reification does not entail the original triple (or vice versa). RDF reification is also explicitly defined as not being a quoting of a triple.

pchampin commented 4 years ago

@william-vw My statement was based on one of Olaf's follow-up papers, which has a dedicated section on redundancy that clearly says that embedded triples can be considered to be also asserted. To be fair, nowadays he considers two "modes" for RDF*:

I was under the wrong impression that PG had come first. I stand corrected.

william-vw commented 4 years ago

@pchampin Fair enough and this is why I mentioned the original paper.

But personally, I think there is a bit of ambiguity in the follow-up paper vis-a-vis the assertion of embedded triples. (Disclaimer: I did not have a very detailed look.)

Definition 11 with its mapping from RDF* to RDF does not seem to assert the embedded triples, but rather converts them directly to standard RDF reification, which does not entail the original statements.

At the same time, you are right in that Definition 4 mentions that explicitly asserting an embedded triple t' is a form of redundancy. Before that, the authors state that they consider both options to be equivalent in terms of the "information content" carried by G*. But, the section on redundancy seems more geared towards performance gains than concrete semantics.

william-vw commented 4 years ago

perhaps @doerthe could weigh in as well - since she knows much more about RDF* than me :-)

domel commented 4 years ago

First of all, I think we slightly deviated from the topic. :-) Secondly, we don't have an RDF* specification yet. I try to keep up to date with RDF* and I think it is worth analyzing this approach chronologically. The fact that one "mode" is described in one of the first papers does not mean that there are no modifications in subsequent iterations. Thirdly, in one of the previous issues, we decided to defer RDF*.