fix: problem with assumption that "quoted triples" are quads

tpluscode commented 1 year ago

Based on the discussion in #34 I came to the realisation that when RDF* calls "quoted triple" is in fact not a quad. Consider a simple graph:

PREFIX : <http://www.example.org/> 

<< :a :name "Alice" >> :statedBy :bob .

Currently, this cannot be exactly represented by RDF/JS, because we assume that :a :name "Alice" is a quad (spog), which it is not. It is only a triple. Thus, the code below should fail. At least by type checking

const dataset: DatasetCode = parseAboveExample()

const [ quad ] = dataset // there is only one quad

// Type Error: subject is a quoted triple, which should not be assignable to Quad
dataset.add(quad.subject)

This PR reintroduces Triple type which is simply Quad with graph removed (and its counterpart BaseTriple).

Note that while this is seemingly a breaking change, in fact it will not affect existing implementation (however they choose to interpret the graph property of a quoted triple). Typescript users will find this a breaking change but IMO its is closer to a bug fix of incorrectly interpreting RDF* spec when we first defined it in our type definitions

changeset-bot[bot] commented 1 year ago

⚠️ No Changeset found

Latest commit: 6c74395e8739787f4130b2ae426c321ad7d39e79

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

woutermont commented 1 year ago

As repeatedly argumented in the same issue #34, quoted triples are quads, just like all other triples. The fact that their graph information is irrelevant for some operations does not mean it should be left out.

jeswr commented 1 year ago

I agree with @woutermont here. I also want to echo https://github.com/rdfjs/types/issues/34#issuecomment-1331408661 in that it is ok for these interfaces to be more general than what is allowed by the spec. It is something that is done already in RDFJS with one example being BaseQuads allowed to have any type of Term as subject/predicate.

The introduction of a new termType is not necessary here. We know it is quoted based on the fact that it is used as a term in the Quad. If we were to make any interface updates, my suggestion would be

export type Quad_Subject = NamedNode | BlankNode | Triple | Variable;
export type Quad_Predicate = NamedNode | Variable;
export type Quad_Object = NamedNode | Literal | BlankNode | Triple | Variable;
export type Quad_Graph = DefaultGraph | NamedNode | BlankNode | Variable;

export interface Triple extends Quad {
    graph: DefaultGraph; // or null if we did go with option 1
}

But as this is breaking I suggest that we do not make such a change at present; especially since the RDF-star WG has only just started.

tpluscode commented 1 year ago

We know it is quoted based on the fact that it is used as a term in the Quad

Well, that's the thing. We don't:

const fact = $rdf.quad(Ruben, says, $rdf.quad(elephants, color, "yellow"))

console.log(isQuoted(fact))

function isQuoted(quad: Quad): boolean {
  return ??
}

Inside the function isQuoted you have no way of telling that it is a quoted triple. Options are

graph: null
new termType
remove graph altogether

Here, I opt for 3, because it is closes to the terminology of "quoted triple" which is not a quad.

woutermont commented 1 year ago

We can keep on repeating this, but a triple/quad is neither quoted nor asserted unless in a specific moment of usage. So it is perfectly normal not to "know" of a quad which "kind" it is: there are no kinds, only usages.

jeswr commented 1 year ago

Well, that's the thing. We don't:

That code snippet I would equate to trying to do

const term = $rdf.namedNode('http://example.org#Jesse')

console.log(isSubject(fact))

function isSubject(term: Term): boolean {
  return ??
}

whether or not it occurs in the subject of any Triple/Quad is completely down to the context of its usage. It may occur as the subject in one Triple/Quad and the object in another; it might not be used anywhere in your dataset.

In the same way I could have

const fact1 = $rdf.quad(Jesse, a, Person)
const fact2 = $rdf.quad(Ruben, says, fact1)

store.addQuads([
   fact1,
   fact2
])

Here fact1 is an asserted Triple/Quad in the case of the first fact added to the store, and a quoted triple in the case of the second fact in the store. As @woutermont says above fact1 is not inherently quoted or asserted. It is just a matter of where it is being used.

EDIT Fixed to use fact1 as a term in fact2

tpluscode commented 1 year ago

You subject is a good analogy. In the default Quad interface you will get an error when you try this:

const quad1 = $rdf.quad(knows, a, Property)

const quad2 = $rdf.quad(Jesse, quad1.subject, Wouter)

This may be correct, but the types Quad_Subject and Quad_Predicate are incompatible. This definition in types prevents triple with blank node predicate

My proposal addresses a similar problem, which may cause unexpected results when you take a quoted triple and try to assert it in your dataset. If you take it at face value, it may end up not in the graph you expect. This will inevitably lead to bugs.

jeswr commented 1 year ago

@tpluscode - I just realised I made a mistake in https://github.com/rdfjs/types/pull/35#issuecomment-1333372960 and have now fixed it. The entire point I was trying to make was that fact1 can be both quoted and asserted based on the context it is used. Note that what I have above is now equivalent to

Jesse a Person
Ruben says <<Jesse a Person>>

tpluscode commented 1 year ago

Indeed, I think I noticed that but my brain adjusted to interpret as you intended.

Note again that this is not exactly equivalent to my snippets. You create 2 quads and assert them. No issues there. The problem when you would write instead:

const fact1 = $rdf.quad(Jesse, a, Person)
-const fact2 = $rdf.quad(Ruben, says, fact1)
-const fact2 = $rdf.quad(Ruben, says, fact1, graph)

store.addQuads([
-  fact1,
+  fact2.object,
   fact2
])

Here's a practical example, if that might help. Code which will apply a set of changes to a graph (triples to add and remove).

const $rdf = require('rdf-ext@1.3.5')
const N3 = require('n3')
const ex = require('@rdfjs/namespace@1.0.0')('http://example.com/')
const getStream = require('get-stream@5.0.0')

const dataBefore = `
@prefix : <http://example.com/>.
graph :G {  
    :elephant :color "Yellow" 
}`
const changes = `
@prefix : <http://example.com/>.
graph :G {
    [
      :add << :elephant :color "Grey" >> ;
      :remove << :elephant :color "Yellow" >> ;
    ] .
}`

const parser = new N3.Parser()
const dataset = $rdf.dataset(parser.parse(dataBefore))

for(const mod of parser.parse(changes)) {
  if (mod.predicate.equals(ex.add)) {
    dataset.add(mod.object)
  } else {
    dataset.remove(mod.object)
  }
}

await getStream(dataset.toStream().pipe(new N3.StreamWriter({
  prefixes: { '': 'http://example.com/' }
})))

Run it here: https://runkit.com/embed/v50w7w5v253b and you will get this output

@prefix : <http://example.com/>.

:G {
:elephant :color "Yellow"
}
:elephant :color "Grey".

The result is not what you want, and likely not what you'd expect, with the new triple added to default graph and the removed triple not actually removed. This PR adds types to prevent such mistake and I would strongly consider actually failing in runtime when trying to assert a quoted triple directly like that.

Here's how the code would be modified to match expectation

for(const mod of parser.parse(changes)) {
+ const { subject, predicate, object } = mod.object
+ const modifiedQuad = $rdf.quad(subject, predicate, object, ex.G)
  if (mod.predicate.equals(ex.add)) {
-   dataset.add(mod.object)
+   dataset.add(modifiedQuad)
  } else {
-   dataset.remove(mod.object)
+   dataset.remove(modifiedQuad)
  }
}

jeswr commented 1 year ago

You create 2 quads and assert them. No issues there.

Except for the fact that with the proposed changes https://github.com/rdfjs/types/pull/35#issuecomment-1333372960 would have a type error because fact1 has a termType of Quad but it is expected to be of termType quotedTriple when used in fact2.

The problem when you would write instead:

I now see your point; I'm just not sure whether this is likely to really be a source of error in practice; and I fear it would just result in more time spent dealing with typing headaches + adressing breaking changes across existing libs, than time solved debugging. That said I will think on this over the weekend.

tpluscode commented 1 year ago

Except for the fact that with the proposed changes would have a type error

It shouldn't, because Triple = Quad - Graph. In other words, a quad should be assignable to triple

woutermont commented 1 year ago

The result is not what you want, and likely not what you'd expect, with the new triple added to default graph and the removed triple not actually removed.

@tpluscode I think that elaborate example pictures very clearly what you have been trying to say, but I still do not agree. The result is exactly what I would expect, and what I would want when writing those lines.

You start from this dataset:

:elephant :color "Yellow" :G .

You loop through the changes:

_:blank :add << :elephant :color "Grey" >> :G .
_:blank :remove << :elephant :color "Yellow" >> :G .

Notice how the graph name :G is only part of the assertion, not of the quotation. The quoted triples are thus logically placed within the :default graph, resulting in this dataset:

:elephant :color "Yellow" :G .
+ :elephant :color "Grey" :default .
- :elephant :color "Yellow" :default .

If we wanted to remove the initial triple, we would need a more expressive language, allowing us to write:

_:blank :add << :elephant :color "Grey" :G >> :G .
_:blank :remove << :elephant :color "Yellow" :G >> :G .

Exactly because of this need for more expressiveness, I expect such a language to rise (probably as an RDF-Star extension), which is why I argue that we should untill then just use the default graph for quoted triples, as this would be in line with the expected behavior of such a language. Adding a more complex type system (e.g. with Triples) now would only have to be reverted when we reach that point in time.

tpluscode commented 1 year ago

The quoted triples are thus logically placed within the :default graph

This is exactly where we differ, and IMO you deviate from the RDF-star spec. The quoted triple is placed in graph :G of the quad which quotes it, if anywhere. Which necessitates to repeat, that a quoted triple has no context.

Consider a hypothetical REST API scenario, in which these changes are sent as a turtle document, such as the difference triples would not necessarily be in the same graph :G (here assumed to be the resource identifier /resource/G)

PATCH /resource/G
Content-Type: text/turtle

_:blank :add << :elephant :color "Grey" >> .
_:blank :remove << :elephant :color "Yellow" >> .

It is unspecified how the body is parsed, likely as a set of triple in default graph

[...] we would need a more expressive language

Maybe, but that is not the language we have.

Exactly because of this need for more expressiveness, I expect such a language to rise

I would not be so sure. It is designed that way for good reason and if you read how the triple vs quad is interpreted at the core of RDF, you should expect the only triples will every be "quoted".

Adding a more complex type system (e.g. with Triples) now would only have to be reverted when we reach that point in time

Not necessarily. I expect a quoted triple would still have that ambiguous semantics due to the syntactic impossibility of distinguishing a triple without context from a triple in default graph. Probably another reason why "quoted quads" will never come to pass.

bergos commented 1 year ago

@tpluscode Like in any other RDF/JS triple vs. quad discussion, I would like to know the problem you can't solve? You build the named graph from the HTTP path and headers, but mapping the quoted triple from the default graph to the target graph is a problem? You can do it with the current interfaces. Not having an explicit mapping step in the patch code could lead to readability problems.

tpluscode commented 1 year ago

Like in any other RDF/JS triple vs. quad discussion, I would like to know the problem you can't solve?

I think you got it backwards. As outlined in my earlier comments, I do not want to solve problems but avoid them

woutermont commented 1 year ago

[@woutermont:] The quoted triples are thus logically placed within the :default graph

[@tpluscode:] This is exactly where we differ, and IMO you deviate from the RDF-star spec. The quoted triple is placed in graph :G of the quad which quotes it, if anywhere. Which necessitates to repeat, that a quoted triple has no context.

Of course I do! But so do you 🙃 The difference is that I explicitly say so in practically every comment.

Let's retrace our steps, with regard to @bergos' question about which problem we are trying to solve.

Initially, @jeswr wondered how to handle the graph term in quoted triples (#34). While an RDF-Star quoted triple indeed has no graph term, the initial reasoning was purely pragmatic, since RDFJS has up till now done everything with quads. This is precisely the question that the HTTP example of @tpluscode exemplifies.
The initial consensus seemed to go to assign quoted triples the default graph: interpret :s :p << :a :b :c >> :g as :s :p << :a :b :c :default >> :g
@tpluscode, amongst others, pointed out the lack of connection with the specification, and highlighted a pragmatic issue with the proposal: inattentive programmers might make the mistake of adding a triple that was quoted to a dataset (objectOf(:s :p << :a :b :c >> :g)), where it would then appear in the default graph, which would be unintuitive to @tpluscode.
Stepping from the pragmatical level to the theoretical one, I then attempted to show how the choice for the default graph is not just a random one, but the semantically intuitive one. To answer to @bergos: it allows for a clean extension of the framework to one where object-sentences are just as expressive as meta-sentences. In more concrete terms: envisioning a semantically probable future (because expressiveness is lacking) where quoted statements can have a graph term, it only makes sense to treat current quoted triples as already possessing the default graph (i.e. the same as we treated triples when switching to RDF 1.1 syntaxes).

To respond to @tpluscode's criticisms:

[...] we would need a more expressive language

Maybe, but that is not the language we have.

That is not the issue. I am not claiming we should write to another language. I am simply making some theoretical, semantic arguments that agree with the practical intuitions of @jeswr and @rubensworks, rather than with yours.

Exactly because of this need for more expressiveness, I expect such a language to rise

I would not be so sure.

While one can never be 100% sure, it is a fairly safe bet to say that [A] if expressiveness is lacking a language will rise that enables it, or more general [B] if something is useful, someone will build it, and people will use it.

It is designed that way for good reason

Everything RDF-Star adds is technically possible in RDF itself (heck, even RDF 1.1 is entirely reducable to RDF 1.0 I believe). It's all syntactic sugar, just as my fictional extension of them, so they all share that same "good reason".

if you read how the triple vs quad is interpreted at the core of RDF, you should expect the only triples will every be "quoted".

If we expect everything to stay the same based on what is explicitly written, we would live in a very poor world. Nowhere in RDF 1.0 did it say "dataset", yet then we got RDF 1.1; but nowhere in that one could I read "quoted triple", and still we got RDF-Star. Now, since RDF-Star does not seem to prohibit extension, I am fairly certain of my expectation for "quoted quads" (even if that will not be their name).

rdfjs / types

fix: problem with assumption that "quoted triples" are quads #35

⚠️ No Changeset found