w3c / rdf-star

RDF-star specification
https://w3c.github.io/rdf-star/

What is the graph of a quoted triple? #275

Open tpluscode opened 1 year ago

tpluscode commented 1 year ago

Related to #49, there has been a discussion in rdfjs/N3.js#311 and rdfjs/types#34 about representing quoted triples.

The winning alternative is to assume that a quoted triple << :a :b :c >> gets parsed as a Quad in the default graph. I have been trying to express my concern that this would cause subtle errors when code attempts to assert a quoted quad. Without additional transformation, that quad would indeed be added to the default graph, which may not be obvious if the "parent" triple is asserted in a named graph.

Bottom line, I have been trying to pass a proposal that a quoted triple is not treated as a quad and that asserting it would require explicitly selecting the graph, either default or named.
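The concern can be illustrated with a small sketch. It uses plain Python tuples rather than the RDF/JS API, and the names `Quad`, `DEFAULT_GRAPH`, `assert_naively` and `assert_in` are hypothetical, chosen only for this illustration. It assumes, as in the winning alternative, that the parser hands back the quoted triple as a quad-like object whose graph is the default graph.

```python
from typing import NamedTuple, Union

DEFAULT_GRAPH = ""  # hypothetical marker for the default graph


class Quad(NamedTuple):
    """Quad-like object in the style of RDF/JS: the graph travels with it."""
    subject: Union[str, "Quad"]
    predicate: str
    object: Union[str, "Quad"]
    graph: str = DEFAULT_GRAPH


def assert_naively(dataset: set, quad: Quad) -> None:
    """Add the quad as-is, trusting whatever graph it carries."""
    dataset.add(quad)


def assert_in(dataset: set, quad: Quad, graph: str) -> None:
    """Add the statement to an explicitly chosen graph."""
    dataset.add(quad._replace(graph=graph))


# << :a :b :c >> as returned under the "winning alternative"
quoted = Quad(":a", ":b", ":c")
# The "parent" triple lives in the named graph :g
parent = Quad(quoted, ":saidBy", ":alice", graph=":g")

naive = {parent}
assert_naively(naive, quoted)  # silently lands in the default graph

explicit = {parent}
assert_in(explicit, quoted, ":g")  # the caller states where it goes

print(sorted({q.graph for q in naive}))     # ['', ':g']
print(sorted({q.graph for q in explicit}))  # [':g']
```

In the naive case the dataset ends up with a statement in the default graph that nobody asked for explicitly; requiring the graph as an argument makes that choice visible at the call site.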

Also related, PR rdfjs/types#35

What are your thoughts?

tpluscode commented 1 year ago

cc @blake-regalia

afs commented 1 year ago

A quoted triple does not have a "graph", just like a literal 123 does not have a graph.

One of the outcomes of the CG, RDF* to RDF-star, was recognising that apps may wish to refer to triples in other places without asserting them locally.

RDF does not allow a triple to be "unasserted", so the starting point is a quoted triple (which is not asserted, unlike in RDF*) that an app can also assert if it wishes to commit to the statement.

Additional triples can provide more information, such as "seen in document at some URL on timestamp", explain why a triple is "true", or assert the triple in addition to having a quoted triple (a reference to the triple).
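A minimal sketch of this building-block idea, using plain Python data structures instead of any RDF library. The class names and the identifiers (`:seenIn`, `:seenOn`, the example URL and date) are made up for the illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Triple:
    """An RDF triple; it is asserted only when it is an element of a graph."""
    subject: "Term"
    predicate: str
    object: "Term"


@dataclass(frozen=True)
class QuotedTriple:
    """An RDF-star term that refers to a triple without asserting it."""
    triple: Triple


Term = Union[str, QuotedTriple]

claim = Triple(":bob", ":age", "42")
ref = QuotedTriple(claim)  # << :bob :age 42 >>

# Triples *about* the claim; the claim itself is not in the graph.
graph = {
    Triple(ref, ":seenIn", "<http://example.org/doc>"),
    Triple(ref, ":seenOn", '"2022-12-01"'),
}
print(claim in graph)  # False: referenced, not asserted

# The app can additionally commit to the statement.
graph.add(claim)
print(claim in graph)  # True: now also asserted
```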

There are many use cases and no consensus how to address each of them (if indeed there is one solution and not several depending on usage).

Instead, RDF-star provides a building block which is incremental to RDF.

woutermont commented 1 year ago

We have to make a clear distinction between specification and implementation.

According to the RDF-Star specification, there exists no such thing as a quoted quad. As such, @tpluscode is quite right.

It is important, however, to see that this is not a conceptual limitation: RDF-Star merely restricts its quoted statements to triples because it does the same with its asserted statements. It is inherently built on a conceptual system of triples (~ RDF 1.0), rather than quads (~ RDF 1.1). The concrete syntaxes TriG-Star and N-Quad-Star have been added afterwards, and have been treated as little more than a cosmetic facade (defined as Turtle-Star/N-Triples-Star plus quads).

Meanwhile, with implementations of RDF 1.1 the groundwork has fundamentally shifted towards the quad as basic unit. A triple, in that context, is just a quad without its graph term (which can thus be assumed to be the DEFAULT graph). Merging this with a triple-based specification extension is not trivial, but in this case quite obvious.

A conceptually sound merge of RDF-Star with the quad-based framework would have both the asserted and the quoted statements be quads.

(Edit: corrected serialization names on hint of @TallTed)

tpluscode commented 1 year ago

Speaking of implementations, dotNetRDF in fact does not have quads but instead a Graph object is a container of triples.

It was quite a shift for me to switch to RDF/JS, where the Quad object always carries the graph information along.

woutermont commented 1 year ago

To react to @afs :

> A quoted triple does not have a "graph", just like a literal 123 does not have a graph.

This is a bad analogy. A triple is a statement, a literal isn't.

> RDF does not allow a triple to be "unasserted", so the starting point is a quoted triple [...] that an app can also assert if it wishes to commit to the statement.

"Asserting" a triple is somewhat vague in this sense. If I have a triple in my mind, have I asserted it? If I put it in code, do I assert it? If I print it out only for myself, do I assert it?

To me, an "asserted" triple/quad is one that has been added to a dataset. This is a clear boundary between one state of the quad and another. So I can speak about unasserted quads, e.g. when formulating them, and asserted quads, which are then the elements of the graphs of a dataset. Note that I do not hold a mere graph with quads to be asserted, since a graph is part of the formulation.

afs commented 1 year ago

@woutermont

> A triple is a statement,

I said "quoted triple". Not "triple".

A quoted triple is an RDF term. A literal is an RDF term. They also happen to both self-denote - that will be a discussion in the WG I am sure.

Building on quoting is driven by the fact that RDF cannot "turn off" assertion (a claim that a fact is true). So to cover both asserted and unasserted usages, the thing being referenced in the syntax <<>> has to be unasserted until the app makes a positive decision to commit to it.

> If I have a triple in my mind, have I asserted it?

No. Well, not unless your mind is a (mathematical) set of triples.

Assertion is terminology in RDF.

Asserting a triple is indeed adding to a graph. By doing so, the person creating the graph is making a commitment to the positive statement - a claim that something is true.

But the three elements inside <<>> do not also add a triple to the graph. A graph is a set of triples.

<<:s :p :o>> :q :r is one triple.

The original RDF* both asserted a triple based on the inside of <<>> and used it as an RDF term - RDF-star does not.
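The difference can be sketched as two ways of loading `<<:s :p :o>> :q :r`. This is an illustration with plain Python tuples, not the API of any parser; the function names `load_rdf_star` and `load_rdf_asterisk` are hypothetical.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    subject: Union[str, "Quoted"]
    predicate: str
    object: Union[str, "Quoted"]


class Quoted(NamedTuple):
    """A quoted triple used as an RDF term."""
    triple: Triple


def load_rdf_star(t: Triple) -> set:
    """RDF-star: only the outer triple enters the graph."""
    return {t}


def load_rdf_asterisk(t: Triple) -> set:
    """Original RDF*: the triple inside <<>> is asserted as well."""
    graph = {t}
    for term in (t.subject, t.object):
        if isinstance(term, Quoted):
            graph |= load_rdf_asterisk(term.triple)
    return graph


inner = Triple(":s", ":p", ":o")
outer = Triple(Quoted(inner), ":q", ":r")

print(len(load_rdf_star(outer)))      # 1
print(len(load_rdf_asterisk(outer)))  # 2
print(inner in load_rdf_star(outer))  # False
```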

> one state of the quad and another.

I don't understand how a quad - a 4 tuple of RDF terms - can have a state. Datasets do not add context. You can use them that way but that's a decision, not required by the definition of RDF database.

woutermont commented 1 year ago

@afs, you are right though about there being a more precise definition of "assertion" than I thought. Thanks! 👍 I also agree with your characterization of assertion vs unassertion, and nothing that I said contradicts what you say about quoted triples not being added to a graph.

However, if a triple is a statement, then a quoted triple is as well; the whole point of RDF-Star is to enable statements about statements. And given that we think about statements as having a context, it is only logical to conceptualize the quotation of a statement with context. That's all I'm saying. I suggest reading the discussion in https://github.com/rdfjs/types/issues/34 for many useful examples and analogies.

The relevance of a dataset in this case is that it makes statements switch from being mere possibilities (without relevance or connection to other statements), to a definite set of assertions in one or more graphs. It is the dataset which gives meaning to the graph name (or absence thereof). This is the reason why I find it better to interpret "assertion" as the moment of adding a quad to a dataset. Up till then, graph "names" are merely meaningless strings.

afs commented 1 year ago

Be careful about the word "statement". A triple is part of the data model, statements are more about the interpretation of the data model. As RDF has gone through versions, the language has moved from the conceptual "model" and "statement" to a data model of "graph" and "triple". https://www.w3.org/TR/rdf11-concepts/#resources-and-statements

> if a triple is a statement, then a quoted triple is as well;

That does not follow. A triple is not a statement.

"Quoted statements" would be (are nearer to) RDF reification. Note a reification has an rdf:type of rdf:Statement - in the domain of discourse.

RDF Datasets originated as a compromise to support several different common practices/use cases from triple stores. The graph name is only for the graph access mechanism in the dataset. The (SPARQL and now RDF 1.1) spec does not mandate an interpretation other than as a container of graphs. No relevance here.

By saying "we think about statements as having a context" you are appealing to a higher level than the data model to have a concept of context.

https://github.com/rdfjs/types/issues/34 starts with a claim (not a fact):

> there is no convention on how to handle the graph term in quoted triples when implementing the RDF-star CG spec.

There is no graph term so it does not have to be handled (if there is - point to the spec text that justifies it - we're basing our work on specs).

This is noted at https://github.com/rdfjs/types/issues/34#issuecomment-1330186049 and other comments.

(None of this is new - the claim has been made before, and responded to before.)

Definitions:

A "quoted triple" is a new kind of RDF term.
https://w3c.github.io/rdf-star/cg-spec/editors_draft.html#dfn-quoted

A triple in RDF is a 3-tuple: subject-predicate-object.
https://www.w3.org/TR/rdf11-concepts/#dfn-rdf-triple

woutermont commented 1 year ago

> Be careful about the word "statement". A triple is part of the data model, statements are more about the interpretation of the data model. As RDF has gone through versions, the language has moved from the conceptual "model" and "statement" to a data model of "graph" and "triple". https://www.w3.org/TR/rdf11-concepts/#resources-and-statements [...] A triple is not a statement.

It might have been less ambiguous if I picked the mathematical synonym 'formula', but 'statement' has a rather strict and well-known meaning in logic and formal linguistics. They are the subset of expressions of a language we reason about. In this sense, a triple is the fundamental kind of statement of RDF.

> RDF Datasets originated as a compromise to support several different common practices/use cases from triple stores. The graph name is only for the graph access mechanism in the dataset. The (SPARQL and now RDF 1.1) spec does not mandate an interpretation other than as a container of graphs. No relevance here.

This is simply not true. The discussions resulting in the Datasets note clearly concluded not to decide on the semantics of them. Contexts are therefore still one of the possible semantics.

> There is no graph term so it does not have to be handled (if there is - point to the spec text that justifies it - we're basing our work on specs). This is noted at rdfjs/types#34 (comment) and other comments. (None of this is new - the claim has been made before, and responded to before.)

Since you obviously do not get what I claim, I find it hard to see how you would know it had been made before (though I would find it interesting to read any pointers you can give, of course). Contrary to what you seem to think that I say, I do NOT claim that an RDF-Star triple has a graph term; I never did. In fact, if you scroll back to my very first comment here, you will (re?)notice that I said so almost literally: "We have to make a clear distinction between specification and implementation. According to the RDF-Star specification, there exists no such thing as a quoted quad."

What I DO claim is that a (unique) extension of RDF-Star can be imagined in which quads CAN be quoted. Moreover, I claim that for all practical purposes, it won't hurt to already interpret a quoted triple as if it were an imaginary quoted quad linked to the default graph.

afs commented 1 year ago

> Contexts are therefore still one of the possible semantics.

Yes - "one of". Can be built on top. The specs do NOT mandate them.

woutermont commented 1 year ago

It doesn't have to for my use of the term to be correct.

TallTed commented 1 year ago

@woutermont -- In your https://github.com/w3c/rdf-star/issues/275#issuecomment-1333635411, you referred to TriG*, N-Quad*, Turtle*, and N-Triples*. Please note that just like RDF*, these names are not used in current work on RDF-star, and will cause confusion.

Please always use the current names (optimally, including an edit of your comment, above), i.e., TriG-star, N-Quad-star, Turtle-star, and N-Triples-star, to be sure everyone is discussing the same thing (among other problems to be avoided).

hartig commented 1 year ago

What I always find weird in these discussions is that the word "quad" is used as if there was a well-defined concept in RDF that is named "quad". There is not; neither the RDF specs nor the SPARQL specs introduce or even use that term. The RDF data model is about triples, sets of triples (called RDF graphs), and collections of such RDF graphs (called RDF datasets), where each of the graphs within such a dataset--except for one of them, called the default graph--is associated with an IRI (which is called the name of the graph).

So, aiming to discuss an "extension of RDF-Star [...] in which quads CAN be quoted" is somewhat meaningless because there is no such concept of "quads" in RDF or in RDF-star.

I understand that some implementations for storing RDF datasets use a 4-element data structure to record the fact that a particular (named) graph of the dataset contains a particular triple. However, such a 4-element data structure is an implementation-specific concept that should not be confused with the concepts defined by the RDF data model or by the RDF-star data model (which is an extension of the RDF data model).
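The distinction can be sketched in Python. The `Dataset` class below follows the data model as described above (a default graph plus named graphs keyed by IRI, each graph a set of triples), while `as_rows` derives the kind of 4-element records a store might keep. All names are hypothetical and terms are simplified to strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # subject, predicate, object
Graph = Set[Triple]            # an RDF graph is a set of triples


@dataclass
class Dataset:
    """An RDF dataset: one default graph and zero or more named graphs."""
    default: Graph = field(default_factory=set)
    named: Dict[str, Graph] = field(default_factory=dict)  # IRI -> graph


def as_rows(ds: Dataset) -> Set[Tuple[str, str, str, Optional[str]]]:
    """Storage view: one 4-element row per (triple, containing graph)."""
    rows = {(s, p, o, None) for (s, p, o) in ds.default}
    for name, graph in ds.named.items():
        rows |= {(s, p, o, name) for (s, p, o) in graph}
    return rows


ds = Dataset()
ds.default.add((":a", ":b", ":c"))
ds.named[":g"] = {(":a", ":b", ":c"), (":x", ":y", ":z")}

print(len(as_rows(ds)))  # 3
```

Here the 4-element row is computed from the dataset; it is a record that a given graph contains a given triple, not a term or statement of its own in the model.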

woutermont commented 1 year ago

I appreciate the to-the-letter-ness of your explanation, @hartig, in which you're obviously right that the RDF specification does not speak of a "quad". It is, however, more than merely an implementation detail. You could write an RDFq specification that mentions only "quads", not "triples", yet is an isomorphic structure to the triple-based RDF. With "quad", we thus simply mean a "triple and its graph".

Edit: I do see the merit, though; if we are mulling over some finer details of the spec, it would probably help to use common language upon which everyone agrees. I'll try to build in that reflex next time I write "quad".
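The reading of "quad" as "a triple and its graph" can be sketched as a round trip between a collection of graphs and a set of 4-tuples. This is only an illustration under simplifying assumptions: terms are strings, `None` stands for the default graph, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, Optional[str]]  # graph None = default graph


def to_quads(graphs: Dict[Optional[str], Set[Triple]]) -> Set[Quad]:
    """Pair every triple with the name of the graph that contains it."""
    return {(s, p, o, g) for g, triples in graphs.items() for (s, p, o) in triples}


def to_graphs(quads: Set[Quad]) -> Dict[Optional[str], Set[Triple]]:
    """Group quads back into graphs keyed by graph name."""
    graphs: Dict[Optional[str], Set[Triple]] = defaultdict(set)
    for s, p, o, g in quads:
        graphs[g].add((s, p, o))
    return dict(graphs)


dataset = {None: {(":a", ":b", ":c")}, ":g": {(":x", ":y", ":z")}}
print(to_graphs(to_quads(dataset)) == dataset)  # True

# Limitation of this sketch: an empty graph produces no quads,
# so it does not survive the round trip.
with_empty = {None: set(), ":g": set()}
print(to_graphs(to_quads(with_empty)))  # {}
```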