w3c / rdf-star

RDF-star specification
https://w3c.github.io/rdf-star/

add section on media-types #161

Closed pchampin closed 3 years ago

pchampin commented 3 years ago

I tried to summarize our discussions. Feel free to comment or suggest changes.



rat10 commented 3 years ago

The following sentence seems to mix issues:

On the other hand, if the same server applies the pessimistic approach, it may simply reject any query requiring text/turtle, just in case the result contains embedded triples. Note that it is not always feasible for the server to decide beforehand whether the result is plain Turtle or Turtle-star, because the result is often produced in a streamed manner, after the headers containing the media-type have been sent to the client.

The "same server" refers to a "SPARQL-star endpoint" mentioned before. Nobody should expect Turtle (as opposed to Turtle-star) from a SPARQL-star endpoint, and the SPARQL-star endpoint can be expected to refuse such a request. This is not really a problem. But a SPARQL-star endpoint could encode embedded triples with RDF standard reification syntax, right? (I brought that up during the last call already but wasn't able to fully understand Pierre-Antoine's answer.) So there would be a fall-back position for a SPARQL-star server that tries to answer with Turtle and, while streaming a response, encounters an embedded triple in its data where it didn't expect to find any. Which makes me wonder why a SPARQL-star server should operate under the impression that it doesn't contain embedded triples.

Consequently I find the above cited issue quite contrived: a SPARQL-star endpoint should assume that its answer may very well contain embedded triples and therefore refuse to answer requests for Turtle right away if it isn't prepared to convert embedded triples into some form of reification syntax.

If the above cited sentences are deleted (as I'd suggest) then the following paragraph should begin with "A" instead of "Another" and the whole section IMO becomes much clearer.

pchampin commented 3 years ago

@rat10

Nobody should expect Turtle (as opposed to Turtle-star) from a SPARQL-star endpoint

I have to disagree with that. I expect SPARQL-star endpoints to be, as much as possible, backward compatible with SPARQL clients, so that they are not left behind when servers move from RDF to RDF-star. Do you expect all existing clients to stop working with dbpedia.org/sparql once they migrate?

a SPARQL-star endpoint could encode embedded triples with RDF standard reification syntax, right?

It could, but this is a mitigation rather than a full-blown solution:

1) the encoding is not semantically "neutral" (the resulting graph is not semantically equivalent to the original RDF-star graph)
2) the encoding is lossy, because RDF-star graphs may also use the reification vocabulary (see below)
3) this could lead to seemingly inconsistent results (see below)


To illustrate the last two points, consider a SPARQL-star endpoint containing the following data:

<< :superman a :Hero >> rdf:subject :clark.

and the following query:

CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }

The encoded result would be:

_:t01 rdf:subject :superman, :clark;
      rdf:predicate rdf:type;
      rdf:object :Hero.

which conflates "encoded" reification properties with "original" reification properties, illustrating point 2 above.

Now consider the query:

SELECT (COUNT(?o) AS ?count) { ?s ?p ?o }

I would expect the server to respond "1", as any client (including old ones) is able to consume that result. But that would be inconsistent with the result of the previous query. This illustrates point 3 above.
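Points 2 and 3 can be sketched in plain Python, without any RDF library. This is only an illustration under stated assumptions: triples are tuples, an embedded triple is a nested tuple, and `encode_reified` is a hypothetical helper that mirrors the encoding shown above (like that example, it omits `rdf:type rdf:Statement`). Only the blank node label `_:t01` comes from the example.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# An embedded triple is a nested tuple used in the subject position.

def encode_reified(graph):
    """Hypothetical encoder: replace each embedded-triple subject by a blank
    node described with rdf:subject, rdf:predicate and rdf:object."""
    out = set()
    labels = {}
    for s, p, o in graph:
        if isinstance(s, tuple):
            if s not in labels:
                labels[s] = "_:t%02d" % (len(labels) + 1)
                inner_s, inner_p, inner_o = s
                out.add((labels[s], "rdf:subject", inner_s))
                out.add((labels[s], "rdf:predicate", inner_p))
                out.add((labels[s], "rdf:object", inner_o))
            s = labels[s]
        out.add((s, p, o))
    return out

# << :superman a :Hero >> rdf:subject :clark.
original = {((":superman", "rdf:type", ":Hero"), "rdf:subject", ":clark")}
encoded = encode_reified(original)

# Point 2: the blank node now carries two rdf:subject values, and nothing
# in the encoded graph says which one came from the embedded triple.
subjects = sorted(o for s, p, o in encoded
                  if s == "_:t01" and p == "rdf:subject")

# Point 3: counting triples gives different answers before and after encoding.
count_original = len(original)
count_encoded = len(encoded)

print(subjects, count_original, count_encoded)
```

A client decoding this result has two equally plausible readings of `_:t01`, which is what makes the encoding lossy.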

rat10 commented 3 years ago

This discussion probably belongs to the main thread (issue #43), so I'll try to be brief:

And to add some not so brief arguments:

pchampin commented 3 years ago

@rat10 Arguably, since I lean towards the optimistic approach, I see more cons for the other approach. I will happily include additional cons for it (or pros for the pessimistic approach) if anyone suggests them. But I am not inclined to remove a valid argument just for the sake of keeping the size of pros and cons balanced.

I do not understand the purpose of your example << :superman a :Hero >> rdf:subject :clark.

I grant you that it is a strange graph. I favored conciseness above likeliness. Still...

As far as I understand the proposed semantics that is the same as saying yes means no.

No it is not. This graph is absolutely satisfiable under the proposed semantics.

What useful inferences would you expect from that?

None, really. But this is a valid (and satisfiable) RDF-star graph, so the proposed solution should work with it as well as with any other RDF-star graph.

I do understand that at the core of the problem you describe is referential opacity.

Not at all... The core of the problem is that converting such a graph using the reification vocabulary, as you suggested, ends up losing information: reconstructing the original RDF-star graph is ambiguous. The following RDF-star graph, which I hope you agree is a different graph from my original example, would produce the exact same "encoding":

<< :clark a :Hero >> rdf:subject :superman.

lisp commented 3 years ago

the phrase

using the standard media-types as if they had been already updated

is suspect.
would you consider the phrase, "using existing media type identifiers as if their definitions had ..."?

rat10 commented 3 years ago

@pchampin IMO enough arguments have been brought for both approaches, and the resolution really depends on whether one is "optimistic" or "pessimistic" about the success of RDF-star. That would be better reflected in a balanced list of pros and cons.

W.r.t. the examples you've given, I think what they really show is that it is at best hard to make proper sense of the proposed semantics. But let's not go further into this discussion here, as I've started to work on another long-ish take on this issue.

pchampin commented 3 years ago

@pchampin IMO enough arguments have been brought for both approaches

I honestly tried to reflect all the arguments raised, to the best of my understanding. Again, any PR or suggested addition/clarification is welcome.

and the resolution really depends on if one is "optimistic" or "pessimistic" about the success of RDF-star.

+1 on that

W.r.t. the examples you've given I think what they really show is that it is at best hard to make proper sense of the proposed semantics.

The examples I gave were merely to illustrate a syntactical problem, namely that the reification-encoding would be lossy. With all due respect, this is totally orthogonal to the semantics.

afs commented 3 years ago

I think the proposed text reflects the community discussions.

afs commented 3 years ago

Nobody should expect Turtle (as opposed to Turtle-star) from a SPARQL-star endpoint

There are "endpoints" - URLs. They get upgraded. So one day it would be SPARQL 1.1, and next day SPARQL-star.

I would expect a strong desire for an existing endpoint to continue to support its existing users with the same MIME type for the same queries, i.e. not cut off old clients, if it is upgraded.

There is an argument that the minimal MIME type should be used.

IF it is turtle, and the request is */* or text/turtle, text/turtle-star THEN the least MIME type should be used. But that has a significant practical impact as discussed in the proposed text. I think the point is already covered.
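A minimal sketch of that "least MIME type" rule, in Python. It rests on several assumptions: `text/turtle-star` is a hypothetical identifier used only in this discussion (it is not a registered media type), `choose_media_type` is an illustrative name, and the Accept parsing ignores q-values and wildcards other than `*/*`.

```python
TURTLE = "text/turtle"
TURTLE_STAR = "text/turtle-star"  # hypothetical, not a registered media type

def accepted_types(accept_header):
    """List the media types in an Accept header, dropping parameters
    such as q-values (a deliberate simplification)."""
    return [part.split(";")[0].strip()
            for part in accept_header.split(",") if part.strip()]

def choose_media_type(accept_header, has_embedded_triples):
    """Return the least media type able to carry the result, or None when
    nothing acceptable exists (the server would answer 406 Not Acceptable)."""
    accepted = accepted_types(accept_header)
    anything = "*/*" in accepted
    # Plain Turtle is enough whenever the result has no embedded triples.
    if not has_embedded_triples and (anything or TURTLE in accepted):
        return TURTLE
    # Otherwise only the star variant can carry the result.
    if anything or TURTLE_STAR in accepted:
        return TURTLE_STAR
    return None

print(choose_media_type("*/*", False))
print(choose_media_type("text/turtle, text/turtle-star", True))
print(choose_media_type("text/turtle", True))
```

The practical impact is visible in the signature: `has_embedded_triples` must be known before the response headers are sent, which a server streaming its result usually cannot guarantee.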

But a SPARQL-star endpoint could encode embedded triples with RDF standard reification syntax, right?

  1. SPARQL result sets
  2. There needs to be an isomorphism because it needs to work for POSTing arbitrary reification and it appearing in RDF-star requests. But without an "RDF-star entailment regime" and bnodes for reification subjects, or syntactic restrictions to give one resource and one stating (reification-speak) for one embedded triple, (or some other device) it is not an isomorphism.

rat10 commented 3 years ago

@afs I'm out of my depth here and I currently can't fill the gaps in my knowledge fast enough. What seems quite clear to me is that there is no ideal solution, as the RDF space lacks proper versioning mechanisms. So the pros and cons weigh differently depending on whether one thinks the RDF-star proposal will eventually become part of RDF or not, and whether one feels entitled to push RDF-star into RDF. The decision with respect to namespaces for new attributes was against using the RDF namespace because we didn't want to be pushy, acknowledging Dan Brickley's polite hint that there exist other proposals in this space as well. I don't see what's so different w.r.t. MIME types. Also, as is well known, I think the proposed semantics will not stand the test of time and won't make it into RDF 2. Not versioning embedded triples with referentially opaque semantics by proper MIME types will not make life any easier later on. Third, this CG refused to go into issues like named graphs with the argument that defining RDF 2 is out of scope. That is fine, but then I think you have to live with this decision that you are a very focused effort and define separate namespaces, MIME types etc. for RDF-star and think of the next years, until work on RDF 2 matures, as a test phase. You can't have it both ways. And that, I'm afraid, is probably all I can really contribute to this matter.

pchampin commented 3 years ago

I don't see what's so different w.r.t. MIME types.

Simple: there was a general consensus on the IRI issue (let's mint one outside the rdf: namespace), but there was no consensus on the best way forward with MIME types.

Following @gkellogg's suggestion, I added an explicit mention in the report of the absence of consensus on that point, making it clear (hopefully) that the report does recommend one approach over the other.

pchampin commented 3 years ago

This was discussed during today's call: https://w3c.github.io/rdf-star/Minutes/2021-04-30.html#t03

rat10 commented 3 years ago

@pchampin

I don't see what's so different w.r.t. MIME types.

Simple: there was a general consensus on the IRI issue (let's mint one outside the rdf: namespace), but there was no consensus on the best way forward with MIME types.

Haha, so funny. Of course I meant: why take a different stance in this case than in that one.

Following @gkellogg's suggestion, I added an explicit mention in the report of the absence of consensus on that point, making it clear (hopefully) that the report does recommend one approach over the other.

I assume you mean "does not recommend".

Not making any recommendations and making the examples all refer to well established file extensions seems like an unbalanced way to handle this topic.

The service descriptions that we discussed in today's (erm, yesterday's) call aren't visible enough to really make a difference and don't help with files.

How about listing "possible" (not "recommended") x- MIME types that people MAY use even if there is no consensus that the CG recommends their use (as in SHOULD)? This would at least allow people that care for that sort of responsible and prudent behaviour to use a MIME type readily available and not having to invent one (and probably just a little different from the next person with the same problem etc).

This is not a topic that I had expected to get particularly passionate about, but I can't help but find it quite irresponsible and even rude to start throwing embedded triples at people (and software) without prior warning when they expect standard RDF. When all this has gone through the years-long cycle of RDF standardization (and ironing out eventual issues etc.): sure. But just like that, in a euphoric act of self-empowerment? That seems greedy to me, and pushy. I could much better live with a few namespaces that some day are no longer supported, because that certainly has much less potential to inflict unexpected harm on anybody.

lisp commented 3 years ago

the phrase

using the standard media-types as if they had been already updated is suspect. would you consider the phrase, "using existing media type identifiers as if their definitions had ..."?

"their definitions" is better than "they".

lisp commented 3 years ago

the discussion on the twenty-third also included the suggestion to introduce a header which indicates that the client is prepared to accept rdf- and sparql-star content in addition to that which adheres to the standard definitions for the respective media type.

were a request to include the header "Accept-Star" with a value analogous to that of an "Accept" header, the server would be licensed to treat the indicated media types as if their definitions had been revised to reflect the approach described in the report. absent that header, the server would be constrained to either fall back to reification or fail requests as not acceptable where reification is not possible and the content can be predicted to be -star media.
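a rough sketch of that decision logic, in python. "Accept-Star" is only the header proposed in this comment, not an existing HTTP header; `plan_response`, its parameters and the returned labels are hypothetical names chosen for the illustration, and the header lookup assumes the exact key spelling.

```python
def listed_types(header_value):
    """Split an Accept-like header value into bare media types
    (parameters such as q-values are ignored in this sketch)."""
    return [part.split(";")[0].strip()
            for part in header_value.split(",") if part.strip()]

def plan_response(headers, media_type, has_embedded_triples, can_reify):
    """Return (status, body_kind) for an already negotiated media type.

    body_kind is "standard", "star" or "reified"; it is None with 406.
    """
    if not has_embedded_triples:
        # Nothing star-specific to send: the standard definition suffices.
        return (200, "standard")
    if media_type in listed_types(headers.get("Accept-Star", "")):
        # The client licensed -star content under this media type.
        return (200, "star")
    if can_reify:
        # Absent the header, fall back to the reification encoding.
        return (200, "reified")
    # Neither licensed nor reifiable: fail as not acceptable.
    return (406, None)

request_headers = {"Accept": "text/turtle", "Accept-Star": "text/turtle"}
print(plan_response(request_headers, "text/turtle", True, False))
print(plan_response({"Accept": "text/turtle"}, "text/turtle", True, False))
```

old clients never send the header, so they only ever see standard content, a reified fallback, or a 406.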

pchampin commented 3 years ago

@lisp I think the commit above addresses your last two comments

pchampin commented 3 years ago

@rat10

Haha, so funny. Of course I meant: why take a different stance in this case than in that one.

As it is a sum of individual positions, of which mine is just one, I cannot answer this question. That's just what I meant.

The service descriptions that we discussed in today's (erm, yesterday's) call aren't visible enough

That's the point of another action #164 (but any PR is welcome...).

How about listing "possible" (not "recommended") x- MIME types that people MAY use even if there is no consensus?

I could be ironic and point out that, then, we should get consensus on what x. media-type we put in the document... But more seriously: no matter what language we put around it, minting alternative media-type identifiers will lead to people using them (all the more if we use them ourselves in our examples and test suite). It is a form of endorsement that, in my opinion, goes beyond the current lack of consensus.