hartig commented 3 years ago

In addition to defining the extended formats for serializing the result of a SPARQL* SELECT query (#12 and #13), we have to decide whether we need/want new mime types for these extended formats. Similarly, do we need/want to introduce another namespace for the extended XML result format?

pchampin commented 3 years ago

I think we do need a new media type for these formats. We don't want an RDF*-unaware client to query a SPARQL endpoint and choke on the results it gets from the server.

I would keep the same XML namespace, however, as we essentially only extend the original one.

gkellogg commented 3 years ago

It might be abusing the system, but given the number of different mime types potentially affected, perhaps a profile parameter would be easier to handle. We’ve used these in JSON-LD, for example.

abrokenjester commented 3 years ago

@gkellogg A difference between JSON-LD profiles and what we're doing here is that with JSON-LD, regardless of the profile used, the document produced is always syntactically valid JSON-LD. Whereas here we are extending the syntax in a way that makes it actually syntactically invalid when considered by a non-extended parser.

gkellogg commented 3 years ago

@gkellogg A difference between JSON-LD profiles and what we're doing here is that with JSON-LD, regardless of the profile used, the document produced is always syntactically valid JSON-LD. Whereas here we are extending the syntax in a way that makes it actually syntactically invalid when considered by a non-extended parser.

Not entirely, the profile http://www.w3.org/ns/json-ld#framed (see IANA Considerations) can be applied to application/ld+json for a frame document, which is an extension of JSON-LD allowing things like @embed, which are not otherwise allowed, so there is precedent for doing this.

But, otherwise, consider if we wanted to make a sub-type. This might involve creating the following sub-types (not necessarily suggesting anno, but need something):

text/anno+turtle
application/anno+n-triples
application/anno+sparql-query

But, then you get to things like application/ld+json and application/sparql-results+json. There are proposals for going deeper, but currently, I don't believe this is allowed, so application/anno+sparql-results+json or application/anno+ld+json don't work.

But, if we used a profile, it could be applied uniformly across a variety of mime-types:

text/turtle;profile=https://www.w3.org/ns/rdf-star#anno
application/n-triples;profile=https://www.w3.org/ns/rdf-star#anno
application/sparql-query;profile=https://www.w3.org/ns/rdf-star#anno
application/sparql-query-results+json;profile=https://www.w3.org/ns/rdf-star#anno
application/ld+json;profile=https://www.w3.org/ns/rdf-star#anno

(And, yes, you can specify multiple profiles, IIRC).
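To make the profile option concrete, here is a minimal sketch of how a client or server might read the profile parameter off any of the media types listed above. The helper names `parse_media_type` and `is_star` are hypothetical, the profile URI is simply the one used in the list, and the naive split on `;` assumes no quoted parameter value contains a semicolon. Treating the parameter as a space-separated list of URIs follows the JSON-LD convention and is an assumption for the other types.

```python
STAR_PROFILE = "https://www.w3.org/ns/rdf-star#anno"  # URI taken from the list above


def parse_media_type(value):
    """Split a media type string into (type, params). Hypothetical helper."""
    base, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        name, _, val = raw.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return base.lower(), params


def is_star(value):
    """True if the media type carries the (proposed) star profile."""
    _, params = parse_media_type(value)
    # Assumption: several profiles are given as space-separated URIs.
    return STAR_PROFILE in params.get("profile", "").split()
```

A client that does not know about profiles would only look at the first element of the returned pair, which is exactly why the same base type can be reused.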

abrokenjester commented 3 years ago

Not entirely, the profile http://www.w3.org/ns/json-ld#framed (see IANA Considerations) can be applied to application/ld+json for a frame document, which is an extension of JSON-LD allowing things like @embed, which are not otherwise allowed, so there is precedent for doing this.

Fair enough, wasn't aware of that.

In any case I'm not necessarily against the use of profiles, I just thought I'd point out what I thought was a distinction. I am tempted by the notion of not having to introduce separate MIME-types for everything yet still having a meaningful way to distinguish.

VladimirAlexiev commented 3 years ago

@gkellogg, GraphDB and RDF4J have defined new MIME types: https://graphdb.ontotext.com/documentation/free/devhub/rdf-sparql-star.html#mime-types-and-file-extensions-for-rdf-in-rdf4j.

I'm against using profiles for this, for the reasons expressed by @jeenbroekstra plus:

the formats are quite different, so using different file extensions is warranted
older clients that are not profiles-capable might get an unpleasant surprise, if the query invokes rdf-star features.

afs commented 3 years ago

@VladimirAlexiev - what do you propose for the issue I mentioned on #55? The same situation happens.

The issue is that one format is an extension/superset of the other, "quite different" does not have that nuance.

What is the MIME type of the result of SELECT * { ?s ?p ?o } without inspecting all the results (i.e. no streaming) in the server? Example: send Accept: */* (such as when using GET with no client- or application-set headers) expecting to be driven off the response Content-Type.

Older clients will get a surprise if it is new MIME type ("just in case")
Forced buffering in the server is a new demand on the server (c.f. timeouts).
"profile" is unlikely to be handled by existing clients unfortunately.

Note that it is not only the client library that is the problem - in some libraries, they let the application set the Accept.

I don't think there is a perfect answer; we are making a "least bad" choice.
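The buffering cost can be illustrated with a small sketch. It assumes result rows arrive as an iterator of dicts and that an embedded triple is represented as a 3-tuple; both are conventions invented for this example. The star media type is the GraphDB/RDF4J one, used only as a stand-in for whatever type might be chosen.

```python
PLAIN = "application/sparql-results+json"
STAR = "application/x-sparqlstar-results+json"  # GraphDB/RDF4J type, a stand-in here


def is_embedded_triple(term):
    # Assumption for this sketch: an embedded triple is a 3-tuple,
    # every other term is a plain string.
    return isinstance(term, tuple) and len(term) == 3


def content_type_for(rows):
    """Choose the response type by inspecting every row, which forces buffering."""
    buffered = list(rows)  # no header can be sent until the last row has been seen
    uses_star = any(is_embedded_triple(t) for row in buffered for t in row.values())
    return (STAR if uses_star else PLAIN), buffered
```

Because the Content-Type header precedes the body, the server in this sketch cannot stream: a single embedded triple in the last row changes the answer.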

afs commented 3 years ago

Another factor is what we want the end state to be like when all (or a substantial majority) of clients and servers have adopted RDF-star.

MIME types do not go away. Introducing a new MIME type is a permanent commitment.

I can't think of a practical transition phase if the long term outcome should be one MIME type (existing or new) because transition introduces two points of change - start and finish.

So we have to ask are new MIME types for results or content and file extensions the desired, long term outcome?

c.f. application/x-www-form-urlencoded -- the x- is not going away.

gkellogg commented 3 years ago

In the case of SPARQL, there is some argument that .rq could be different for SPARQL*, but not really the result formats. In general, if you make a query using SPARQL* features then you would be expecting SPARQL* results. In my experience, the client is the one initiating the request and controlling the query, so content-negotiation for Star features doesn't seem too useful.

That's somewhat different for Turtle, but per Andy's reasoning, I'd be wary of introducing a new content type specifically to enable such features. If RDF 1.2 were to introduce new syntax, not necessarily Turtle, I wouldn't expect that working group to use a new content-type/file extension. And, as I've said before, we didn't do this for JSON-LD 1.1.

In general, it's sufficient for future proofing that clients raise an error if they see features they can't handle, which any non-star Turtle client would do when faced with the new syntax. Worse would be (as was the case for JSON-LD 1.0) that clients would silently ignore new features and generate different results.

TallTed commented 3 years ago

Yet another reason that RDF(etc)* appears to me to be ill-considered.

text/star+turtle? That would follow standard MIME type fallback format ... except that a Turtle* file with << :a :b :c >> :d :e or :a :b :c {| :d :e |} can't be parsed as Turtle (no matter the version), so Turtle* requires a distinct MIME type, as do all the other RDF* serializations.

I think it's useful to have all considerations on one page, so here's a reproduction of the table in the GraphDB/RDF4J docs --

| RDF* format | MIME type | File extension |
| --- | --- | --- |
| Binary RDF | `application/x-binary-rdf` | `brf` |
| Turtle* | `text/x-turtlestar`, `application/x-turtlestar` | `ttls` |
| TriG* | `application/x-trigstar` | `trigs` |
| JSON query result | `application/x-sparqlstar-results+json` | `srjs` |
| TSV query result | `text/x-tab-separated-values-star`, `application/x-sparqlstar-results+tsv` | `tsvs` |

Perhaps it's time to recognize that RDF(etc)* is taking shape as a fork of RDF 1.1 (and all its serializations), not an extension of (un-versioned) RDF (nor its serializations), and thus REALLY TRULY HONESTLY needs an entirely distinct name, or a drastic rethink, and pursuit through the sparql-12 project (which isn't really just about SPARQL). (I'm more in favor of the latter.)

afs commented 3 years ago

I asked:

So we have to ask are new MIME types for results or content and file extensions the desired, long term outcome?

@TallTed - are you proposing that the MIME type of content that may contain RDF-star syntax is text/turtlestar?

(note: x- is not for registration nowadays after the experience of x-www-form-urlencoded)

TallTed commented 3 years ago

I'm not proposing anything. I'm making observations.

The table of MIME types above did not come from me; it was relayed by me from GraphDB/RDF4J.

"The desired, long term outcome" from my seat is that the RDF(etc)* effort be (re-)connected to RDF(etc).

afs commented 3 years ago

The proposal I was referring to is:

Turtle* requires a distinct MIME type

A new MIME type is not required; it is an option to consider by working through the advantages and disadvantages. There is no perfect answer here.

RDF-star is one additional feature and current Turtle data remains valid.

Of the 4 cases, "old client - new server" is the hardest. On the web, client and server do not move together, whereas they just might in an enterprise setting if necessary.

Who moves first? What are the consequences?

Consider how to roll out a server which supports RDF-star. What is the MIME type of content that may contain RDF-star syntax? What if the request is Accept: */* (the default if not set)? Buffer until known? That is a significant cost.

Or SPARQL results? When SELECT * { ?s ?p 123 } is truncated by timeout? Or a result that one day is text/turtle and tomorrow is the new MIME type breaking an existing client? Or multiple different endpoints?

TallTed commented 3 years ago

It's generally not helpful to pluck tiny phrases from their larger context. That larger context:

text/star+turtle[...] would follow standard MIME type fallback format ... except that a Turtle* file with << :a :b :c >> :d :e or :a :b :c {| :d :e |} can't be parsed as Turtle (no matter the version), so Turtle* requires a distinct MIME type, as do all the other RDF* serializations [unless we] recognize that RDF(etc)* is taking shape as a fork of RDF 1.1 (and all its serializations), not an extension of (un-versioned) RDF (nor its serializations), and thus REALLY TRULY HONESTLY needs an entirely distinct name [and MIME types], or a drastic rethink, and pursuit through the sparql-12 project (which isn't really just about SPARQL [as the issues there also touch more of RDF, and I think should lead to 1.2 or 2.0 across the board of RDF&serializations]). (I'm more in favor of the latter.)

Existing clients will (hopefully) be indicating Accept: text/turtle (or other RDF 1.0 or 1.1 serialization) when they submit their SPARQL query, and since all SPARQL 1.0/1.1 queries are SPARQL* queries, well-behaved servers will simply generate the result in (or convert the result from whatever RDF* serialization to) the requested serialization, or refuse to deliver if they're incapable of that generation or conversion.

Your other questions are worth further exploration in a broader sphere than MIME types and XML namespaces -- and likely should be treated as distinct issues in the overall consideration of whether RDF(etc)star should continue as an RDF fork or reunite with the evolutionary path of "conventional" (i.e., unstarred) RDF(etc).

ericprud commented 3 years ago

The list of RDF-related languages is getting long-ish (more in the W3C wiki):

graphs: RDF/XML¹, (Turtle, Trig, NTriples, NQuads), JSON-LD³, RDFa², Microdata², HDT, Binary RDF, CSVW⁴
query: SPARQL
query results: SPARQL Results {XML¹, JSON³, CSV⁴, TSV⁴}
schema: ShExC, ShExJ³, ShExR, SHACLC (compact syntax)
OWL: OWL/XML¹, Manchester Syntax, Functional Syntax, OWL/RDF

¹ constrained by XML ² constrained by XML attributes ³ constrained by JSON ⁴ constrained by {C,T}SV

I believe that every one of these formats and implementations thereof would have to be updated to accept some encoding of embTriple (a triple in <<>>s). For format-constrained languages, we have to encode embTriple in some way that fits in the format.

~~It may never be worth extending some; NTriples and NQuads really connote conventional graphs (related issue) and~~ use cases may never motivate extending e.g. RDFa or Microdata. But however we prioritize these extensions, it would be preferable to attack this systematically rather than piecemeal(-y?), noting that there could be different formulas for the format-constrained languages.

Apart from @embed in JSON-LD frames, I have the impression that Profiles can be ignored without breaking the left-most '+'-separated unit of a media type, or at least, that's what reviewers on ietf-types assume when they read applications.

Proposals

Using Turtle, JSON-LD and RDF/XML to stand in for the unconstrained, JSON-constrained and XML-constrained formats:

| proposal | desc | Turtle | JSON-LD | RDF/XML |
| --- | --- | --- | --- | --- |
| null | don't touch media types | text/turtle | application/ld+json | application/rdf+xml |
| profiles | use profile parameter | text/turtle; profile=http...#star | application/ld+json; profile=http...#star | application/rdf+xml; profile=http...#star |
| prefixed | add e.g. "star+" before media type | text/star+turtle | application/star+ld+json | application/star+rdf+xml |
| embedded | add e.g. "star" in media type | text/turtlestar | application/ldstar+json | application/rdfstar+xml |

Meta: I (or anyone with edit privs) can edit this to keep it representative of the proposals.

(long live application/x-www-form-urlencoded)

afs commented 3 years ago

@ericprud - nice work, links and all!

We need NTriples and NQuads to write test cases!

ericprud commented 3 years ago

Yeah, I kinda confronted that after I edited this. Given the use cases, I wonder if it's a change to NTriples or another language a lot like NTriples.

TallTed commented 3 years ago

@ericprud --

query: SPARQL, SPARQL Results {XML¹, JSON³, CSV⁴, TSV⁴}

SPARQL Results are a serialization/materialization of results, i.e., of data, not of the query that produced those results.

Therefore, I think this bullet should be reduced to SPARQL, and the SPARQL Results variants should be moved to the preceding bullet.

I don't believe the proposed "prefixed" MIME types fit with the standard MIME rules for fallback interpretation, which use the + to separate newer (subset) left-side formats from older (superset) right-side formats (with the ultimate fallback being the left-side of the / solidus).

E.g., application/ld+json can be interpreted by a newer JSON-LD processor or an older JSON processor can "fall back" to interpretation as plain-JSON, albeit without LD/RDF features.

E.g., application/rdf+xml can be interpreted by a newer RDF/XML processor or an older XML processor can "fall back" to interpretation as plain-XML, again, without LD/RDF features.

To the contrary, Turtle* cannot be interpreted as plain-Turtle by an older Turtle processor. I'm expecting that JSON-LD* will not be interpretable by a plain JSON-LD processor, and I think it likely that the same will be true for RDF/XML*, i.e., that RDF/XML* will not be interpretable by a plain-RDF/XML processor, though these are reasonably likely to be (incompletely and imperfectly; i.e., without LD or RDF features) interpretable by plain-JSON or plain-XML processors.

I think this makes the "prefixed" MIME types non-starters for (what I understand to be) their intended purpose.
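The fallback reading described here can be sketched as follows. This is only an illustration of the single-`+` interpretation; `fallback_chain` is a hypothetical helper, and nothing in it is taken from a media type RFC.

```python
def fallback_chain(media_type):
    """List the types a processor could try, most specific first.

    Only the single-'+' case is modelled; for a subtype with several
    '+' separators this sketch just takes the right-most unit.
    """
    top, _, sub = media_type.partition("/")
    chain = [media_type]
    if "+" in sub:
        suffix = sub.rsplit("+", 1)[1]
        chain.append(f"{top}/{suffix}")
    return chain
```

Under this reading, `fallback_chain("application/ld+json")` ends in `application/json`, which a plain JSON processor can handle. `fallback_chain("text/star+turtle")` ends in `text/turtle`, which is the fallback that does not hold, because a plain Turtle parser cannot read content that uses `<< >>`.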

The proposed profile path might be functionally viable for new tools that know these profiles exist, but fails for older tools that don't know they exist (and may choke on the MIME type, before getting to the data).

Indeed, the RFC you cite says explicitly that what is contemplated here is forbidden --

... A profile MUST NOT change the semantics of the resource representation when processed without profile knowledge, so that clients both with and without knowledge of a profiled resource can safely use the same representation. ...

I don't see what benefit there is to having text/turtle; profile=http...#star associated with data that isn't Turtle, when the Turtle parser is going to choke on the Turtle* data (if it even gets that far, i.e., if it doesn't choke on the MIME type itself) just as badly as it would if the associated MIME type were just text/turtle.

I think this makes the proposed "profile" MIME types a small improvement, if any, over the proposed "prefixed".

The proposed "embedded" MIME types seem generally workable as MIME types, but I don't see them doing anything for usability of RDF-classic tools on RDF-star data.

Which is fine, IFF we embrace that RDF(etc)star is a fork in the road of RDF and related tech.

afs commented 3 years ago

The Turtle tests use NT: N-Triples with added <<>>.

ericprud commented 3 years ago

@TallTed,

query: SPARQL, SPARQL Results {XML¹, JSON³, CSV⁴, TSV⁴}

SPARQL Results are a serialization/materialization of results, i.e., of data, not of the query that produced those results.

Therefore, I think this bullet should be reduced to SPARQL, and the SPARQL Results variants should be moved to the preceding bullet.

I moved them into their own bullet, 'cause they're primarily tabular structures with terms in the cells.

pchampin commented 3 years ago

FTR, a mediatype of the form application/star+ld+json will most probably be rejected by IETF. The Verifiable Credentials WG inquired about it recently.

IIRC the grammar of media-types allows for several +s, but the normative text describing the "inheritance" relationship between a/b and a/c+b is written under the assumption that there will be only one +. So although the IETF people agreed that this should be clarified, they didn't want to make that leap at the time.

TallTed commented 3 years ago

@pchampin -- I think your IETF media type info is outdated.

The DID WG is pursuing registration of application/did+ld+json, in parallel with development of a new RFC clarifying the general case of MIME types with multiple + (see the latest RFC submission to IETF and current draft) which would aid in multiple pending media type registrations (including both application/star+ld+json and application/did+dag+cbor).

pchampin commented 3 years ago

This was discussed during today's call https://w3c.github.io/rdf-star/Minutes/2021-01-15.html#item03 (end of the discussion)

gkellogg commented 3 years ago

For some perspective on the long-term association of MIME types and changing formats, consider text/html and text/css, and others. Both specs have evolved significantly since introduced, and a 2000's era HTML client would not be able to properly parse HTML5, much less interpret, without knowledge of the tag change over time. Even the announcement DOCTYPE has been deprecated.

I see Turtle* etc. as a logical evolution of the RDF formats, and the principle of follow-your-nose (once part of an RDF REC) would hold. Of course, this is a CG publication, and can't have formal weight, but early versions of HTML5 would have still used text/html prior to standardization.

I repeat my suggestion that a profile parameter (if anything) is most appropriate.

ericprud commented 3 years ago

@gkellogg , I'm not sure that the HTML precedent applies because, while HTML has evolved enormously, any step change which went from something an SGML parser could consume to something it couldn't (e.g. dropping DOCTYPE) didn't occur until long after SGML/HTML parsers were obsolete. The high cost of changing media types (c.f. x-www-form-urlencoded) forced HTML evolution to favor breaking backward compatibility and the consequence was an enormous amount of resources poured into release engineering because it was so chaotic. I can't speak to the CSS precedent 'cause I'm not sure whether e.g. @media selectors broke existing CSS parsers.

With Turtle et al, I don't think this community can afford the same resource investment that dot-coms made in HTML. I think tactically, it's better to invent an explicitly incompatible media type (i.e. embedded) so that

Existing toolchains continue to work.
People can confidently publish e.g. TurtleStar data.

Unlike the cost of supplementing HTML media types, I think the cost of duplicating the RDF media types is almost purely aesthetic.

afs commented 3 years ago

Existing toolchains continue to work.

+1

Is there an example of a MIME type which evolved over time so there is a MIME type for "v1" and a MIME type for "v2"? How different were the versions?

(Turtle itself changed between initial registration 2007 and W3C REC 2014)

pchampin commented 3 years ago

The way I see it:

if a server publishes RDF* with a new mime-type that an old (plain RDF) client does not understand, the client will simply refuse to load the content, and inform the user that something went wrong;
if that server publishes RDF* with an old mime-type, the old client will try to parse it, fail at some point, and will inform the user that something went wrong. What do we lose?

Granted, in the second case, the client may also crash miserably (but would that be advisable anyway, regardless of RDF*?), or consume a lot of resources before it realises that it can not parse the whole content.

However, and I think that is @afs' point, the second option buys us a smooth transition for all RDF* content that happens to also be plain RDF (i.e. not containing embedded triples).
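The two cases can be sketched from the old client's side. Everything here is hypothetical: `old_client_load` stands for a pre-RDF* client that only knows `text/turtle`, and the check for `<<` is a crude stand-in for a real Turtle parser, not an actual grammar test.

```python
class UnsupportedMediaType(Exception):
    """Raised before any content is read."""


class ParseError(Exception):
    """Raised part-way through the content."""


def old_client_load(content_type, body):
    # Hypothetical pre-RDF* client that only knows text/turtle.
    if content_type != "text/turtle":
        raise UnsupportedMediaType(content_type)  # first case: refuse up front
    for number, line in enumerate(body.splitlines(), start=1):
        # Crude stand-in for a real parser: treat "<<" as a syntax error.
        if "<<" in line:
            raise ParseError(f"unexpected '<<' on line {number}")  # second case
    return body
```

In both failing cases the user is told that something went wrong. The difference is that content without embedded triples, served as `text/turtle`, still loads, which is the smooth transition mentioned above.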

VladimirAlexiev commented 3 years ago

The idea is that any "star" format will be a superset of the respective traditional format. So any traditional content is also valid "star" content. But not vice versa: if "star" features are used, then the content is not backward compatible.

@afs: we are making a "least bad" choice

My oh my, I didn't realize all these complications exist :-( After reading the above, I completely agree with Andy's sentiment.

Andy makes strong points that a server may not know what content it holds or what result it returns, or it may be too expensive to determine the type of result precisely. So that's an argument FOR keeping only the traditional formats.

Now let's examine the CONS, i.e. arguments for introducing "star" formats. (Out of @ericprud's classification, I think the "profile" or "embed" styles of expressing the formats can fill the bill.)

@pchampin: "What do we lose?.. the client may also crash miserably, or consume a lot of resources before it realises that it can not parse the whole content". I think that's a considerable loss.
@HolgerKnublauch in #55 "An approach with using long URIs has a natural serialization in vanilla TTL". I.e. turtle-star can be expressed in plain turtle (although a lot more verbosely). To let the server return turtle (to old clients) vs turtle-star (to new clients), we need to distinguish between the two formats.
I assume that the majority of RDF data will remain in the traditional formats. When I edit a file that uses "star" features, I want to tell my tools that it uses a new format, so it's properly validated.

I think here's a good compromise:

Introduce "star" MIME types and explicitly state that traditional formats are "forward-compatible" with them but the new formats are not backward-compatible with the old.
Serve the new formats if the server has reason to believe the new formats are needed: a static file the server knows is a "star" file due to its extension, or the result of a query that uses new "star" features, and the client has requested the new format, after conneg.
If the server can convert "star" to traditional formats by "using long URIs": convert and serve the old formats if the client has requested the old format
Serve the old format if the server has no info whether the data uses "star" features (e.g. the result of a wildcard ?s ?p ?o query), and the client has not requested the new format. This may lead to the first loss described above (the client crashing or wasting resources on content it cannot parse).
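As a rough sketch, the compromise above could be written as a single decision function. The function name, its arguments and the returned labels are all made up for illustration. Two branches are not covered by the list and are assumptions of this sketch: what to do when star data cannot be converted for an old client, and what to serve when the server is unsure but the client accepts the new format.

```python
def choose_response(data_is_star, client_accepts_star, can_convert):
    """Pick what to serve.

    data_is_star: True, False, or None when the server cannot tell cheaply.
    """
    if data_is_star is True:
        if client_accepts_star:
            return "star"             # second item: new format, requested via conneg
        if can_convert:
            return "plain-converted"  # third item: convert "using long URIs"
        return "not-acceptable"       # assumption: not covered by the list
    if data_is_star is None and client_accepts_star:
        # Assumption: traditional content is also valid "star" content
        # (first item), so the new type is safe for a client that asked for it.
        return "star"
    return "plain"                    # fourth item, or data known to be traditional
```

The `None` case combined with a client that has not asked for the new format is where the loss mentioned in the last item can occur, since the body may turn out to contain star syntax after the header has been sent.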

Sorry, this is very imprecise, and a bit TL;DR. In brief, I believe that:

There are enough reasons to introduce new formats
The server should serve them on a best-effort-to-determine basis, and we should document that's not always possible (Andy's points)

pchampin commented 3 years ago

@VladimirAlexiev

(Out of @ericprud's classification, I think the "profile" or "embed" styles of expressing the formats can fill the bill.)

We had a straw-poll during our last call and there seemed to be general agreement to keep the old mime-types, but augment them with a profile or another parameter.

ericprud commented 3 years ago

Two use cases that profile does not address:

someone wants to publish turtle-star for star-enabled clients and conventional turtle for the rest. A conventional client would not know how to supply or interpret the profile parameter. We could say that turtle-star media types MUST have a profile, but we'd be stuck with that forever.
a conventional client (e.g. Web Protege) has appropriate error handling and pilot feedback if it receives a 406 Not Acceptable. This code path would not be invoked if the parser simply chokes while parsing an embedded triple.

dbooth-boston commented 3 years ago

Not entirely, the profile http://www.w3.org/ns/json-ld#framed (see IANA Considerations) can be applied to application/ld+json for a frame document, which is an extension of JSON-LD allowing things like @embed, which are not otherwise allowed, so there is precedent for doing this.

@gkellogg, that "precedent" sounds to me like a violation of the whole idea of a profile, which is that the profile language is a subset of the original language: every legal sentence in the profile language should be a legal sentence in the original language.

In fact, the IANA Considerations section of JSON-LD 1.1 https://www.w3.org/TR/json-ld11-framing/#iana-considerations specifically says: "A profile does not change the semantics of the resource representation when processed without profile knowledge, so that clients both with and without knowledge of a profiled resource can safely use the same representation." That clearly seems to be violated if the profile is not parsable by a client. Even though JSON-LD 1.1 chose this approach, I don't think that is a good reason to do it again.

I am not seeing a good justification for serving an RDF-star document with a MIME type for which it cannot be parsed. The value of having a MIME type is diminished if it does not accurately describe the actual content.

gkellogg commented 3 years ago

@gkellogg, that "precedent" sounds to me like a violation of the whole idea of a profile, which is that the profile language is a subset of the original language: every legal sentence in the profile language should be a legal sentence in the original language.

In fact, the Iana Considerations section of JSON-LD 1.1 https://www.w3.org/TR/json-ld11-framing/#iana-considerations specifically says: "A profile does not change the semantics of the resource representation when processed without profile knowledge, so that clients both with and without knowledge of a profiled resource can safely use the same representation." That clearly seems to be violated if the profile is not parsable by a client. Even though JSON-LD 1.1 chose this approach, I don't think that is a good reason to do it again.

It says that the frame document is processed the same with, or without the profile parameter. The frame document is processed by a JSON-LD processor when provided as the "frame" for transforming an input document. The framing algorithm will operate the same with or without any profile parameter. If a frame document were processed as the input for an algorithm such as Expansion or Compaction, or even Framing, a processor could complain about the presence of keywords such as @embed, but that's not the intended use of a framing document.

In any case, that was the decision of the WG and goes back to the 1.0 version of Framing (which wasn't a REC). Perhaps it was made in error, but certainly at the time, something like application/framing+ld+json was not available and a non-JSON-LD mime-type would also not seem appropriate. In retrospect, the framing keywords (such as @embed) could be considered as part of JSON-LD proper, and it's only in the context of use that they become appropriate or inappropriate.

TallTed commented 3 years ago

@gkellogg

It says that the frame document is processed the same with, or without the profile parameter. [etc.]

I think @dbooth-boston's point was less about JSON-LD Framing, and more about Turtle vs Turtle-star, and this has come up before. To wit:

The Turtle-star media type doesn't make sense as a profile of the Turtle media type, because Turtle-star is not a subset of Turtle.

I agree strongly with this, and similar applies to the rest of the RDF-star serializations and their media types.

Put simply: Profiles on (standard) RDF media types won't work for media types of RDF-star serializations.

Parameters (which were the other option in the air for the non-binding straw poll) somehow being attached to (standard) RDF media types might work for future tools which understand RDF-star, but parameters are not part of the IANA media type registration at all, and there's no standard way to communicate them.

One major expressed concern about having new types has been that an RDF-star server might not know whether it was going to deliver RDF-star data until late in a query response where it was delivering (standard) RDF data in response to a request for such (standard) RDF media type.

I think the only answer we can have to this is, if a request specifies Turtle or other non-RDF-star media type, the server must then either commit to (and follow through on) delivering that, or reject the request. RDF-star data must only be delivered in RDF-star serializations, with appropriate media types. If an RDF-star server cannot commit to such a delivery -- such is life.

pchampin commented 3 years ago

It seems to me that there are two related but distinct questions here: 1) assuming we were a new RDF WG, would we like to upgrade text/turtle to support new features (as opposed to introducing a new media-type)? 2) considering we are not a WG, is it OK for us to propose using text/turtle until a future WG makes a decision?

I am not clear whether people in favor of a new media-type are of this opinion because they answer "no" to the 1st question (making the 2nd moot), or because they answer "yes" to the 1st, but still "no" to the 2nd.

Note that text/turtle has already been extended in the past (SPARQL-like prefixes) in a way that would break old clients. application/rdf+xml has also been extended between the 1999 and the 2004 recommendations. So I believe that we can legitimately answer "yes" to the 1st question (although of course, other elements could lead us to answer "no").

ericprud commented 3 years ago

assuming we were a new RDF WG, would we like to upgrade text/turtle to support new features (as opposed to introducing a new media-type)?

considering we are not a WG, is it OK for us to propose using text/turtle until a future WG makes a decision?

Good point, separating those will help us distinguish between where we want to end up and what we are comfortable doing now.

Note that text/turtle has already been extended in the past (SPARQL-like prefixes) in a way that would break old clients. application/rdf+xml has also been extended between the 1999 and the 2004 recommendations. So I believe that we can legitimately answer "yes" to the 1st question (although of course, other elements could lead us to answer "no").

True: the 2011-03-28 text/turtle registration did not include SPARQL PrEfIx and bAsE; those were added by the RDF 1.1 WG in 2014. Also, Dave Beckett's spec didn't allow leading numbers in local names. I don't recall how RDF/XML was liberalized (can't have been the same 'cause XML namespaces owns localNames) but I totally believe it was.

The RDF 1.1 changes are arguably different in kind from the change proposed here. They liberalized syntax, which gave content producers latitude to produce more readable documents, at the expense of breaking deployed parsers. If the consumer upgraded their parser, they'd be able to consume the data without further changes to their infrastructure.

The current change extends the model, which has implications throughout the consuming toolchain. Replacing a parser won't allow you to stick <Lois> <believes> << <Superman> <can> <fly> >> . into an existing triple store, HDT file, translation to SQL, etc.

afs commented 3 years ago

I believe that the long term ideal is one RDF, not RDF with a separate extension. We should aim for the ideal outcome, then document the migration issues.

To introduce MIME types is, in effect, to make a permanent distinction of RDF-star as a separate extension. A MIME type is permanent/very-long-time (c.f. application/x-www-form-urlencoded). Documentation to say "valid for" can't really cause software to adapt to a schedule, so it can't be switched off.

It is more than a pair-wise migration. There are three parties: application, client library, server. We are beginning to see more federation, so there is more complexity on the web.

Undoing a MIME type is getting harder!

We could propose a file extension to indicate the use of RDF-star in a downloadable data file. This would be a useful indication when retrieving dumps, where a human is more likely to be in the loop. Also, a mapping to/from reification, so that adapting to un-upgraded software (HDT example) can be delegated rather than built into the proposed changes.

My concern is for "old application, new server" situations where new MIME types may interfere with existing applications even when the applications are asking for data they previously successfully accessed. I believe breaking what works is more damaging and harder on support.
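
As a rough illustration of what such a delegated mapping could look like, here is a sketch in Python that flattens embedded triples into standard RDF reification (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). This is an assumption about one possible mapping, not the mapping this group has defined: the function name `unstar`, the blank node labels, and the tuple representation of triples are all hypothetical.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def unstar(triples):
    """Flatten nested (s, p, o) tuples into plain triples via reification.

    Terms are strings; an embedded triple is a nested 3-tuple in subject
    or object position. Each distinct embedded triple gets one blank node.
    """
    fresh = count()
    nodes = {}  # embedded triple -> blank node label
    out = []

    def flatten(term):
        if not isinstance(term, tuple):
            return term
        if term in nodes:
            return nodes[term]
        s, p, o = term
        s, o = flatten(s), flatten(o)
        node = f"_:stmt{next(fresh)}"
        nodes[term] = node
        out.extend([
            (node, f"<{RDF}type>", f"<{RDF}Statement>"),
            (node, f"<{RDF}subject>", s),
            (node, f"<{RDF}predicate>", p),
            (node, f"<{RDF}object>", o),
        ])
        return node

    for s, p, o in triples:
        out.append((flatten(s), p, flatten(o)))
    return out
```

For example, `unstar([("<Lois>", "<believes>", ("<Superman>", "<can>", "<fly>"))])` yields four reification triples about a blank node plus one plain triple pointing at it, which un-upgraded software can store as-is.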

dbooth-boston commented 3 years ago

I believe that the long term ideal is one RDF, not RDF with a separate extension.

Agreed, but I think RDF-star falls short for inclusion as a permanent part of RDF, because (IMO) it does not add enough functionality to justify the added complexity. Specifically:

It does not support the more general need for n-ary relations, even though (in essence) it offers a restricted form of n-ary relation.
It can only annotate one statement, even though it would seem conceptually simple to allow multiple statements to be annotated at once.

In short, it feels like RDF-star goes part way toward addressing these needs, but not all the way. That means that if RDF survives and these issues are addressed more fully in the future by some other syntax or mechanism, then we would be stuck with the remnants of RDF-star in addition to the more general solution.

But maybe I'm just wishing RDF were more like N3, for its elegance and power.

gkellogg commented 3 years ago

In my view, the purpose of this CG is to explore the space for Property-Graph related technologies, and RDF-star has fair adoption among providers, at least conceptually, while Notation-3, for all of its great attributes, does not. But it's up to a future WG to consider the various alternatives. You may or may not like the approach, but it's worth making sure that the work is complete for it to be of value in the future.

I've encouraged others to describe their issues with the solution (mostly semantic), and I think that the RDF-star final report should include both majority and minority positions, so that we can reduce the time that a future Working Group would need to spend on re-hashing the arguments. So, please consider such a constructive contribution.

Regarding a mime type, I think it would be premature to establish what would be about 10 different mime types to deal with the different serialization formats, result formats, SPARQL syntax. I would argue that if they were adopted by future WG(s), as @afs and @pchampin have suggested, they could arguably fall within the range of the existing mime types, much as other formats have changed over time without requiring new mime types.

Consider that what this group is ultimately producing is a final CG report that is intended to be considered for future standardization. Specifying too much (such as mime types and permanent namespaced URIs) doesn't help the effort of future adoption, and if there is no future adoption, then it will be irrelevant anyway (or the standards efforts become irrelevant to implementors, which would be even worse).

afs commented 3 years ago

@dbooth-boston In which case, the key question is whether the syntax would block. It may become legacy/redundant/archaic without causing harm. (c.f. containers.)

It does not block N3.

And, of course, <<>> isn't in the predicate position.

dbooth-boston commented 3 years ago

the key question is whether the syntax would block

Good point. Maybe I am being overly concerned.

pchampin commented 3 years ago

this was discussed during today's call : https://w3c.github.io/rdf-star/Minutes/2021-04-23.html#t03

TallTed commented 3 years ago

@ericprud -- See relevant minutes section... Any comment about text/turtle registration update, to at least point to RDF 1.1 Turtle instead of the long-deprecated Team Submission, and/or thoughts about Turtle-star needing (or not) a new media type?

gkellogg commented 3 years ago

@ericprud -- See relevant minutes section... Any comment about text/turtle registration update, to at least point to RDF 1.1 Turtle instead of the long-deprecated Team Submission, and/or thoughts about Turtle-star needing (or not) a new media type?

Of course, the Turtle 1.1 spec does have an IANA Section "Internet Media Type, File Extension and Macintosh File Type". But, perhaps it was never actually sent to IANA? Probably not too late to send that in.

ericprud commented 3 years ago

@ericprud -- See relevant minutes section... Any comment about text/turtle registration update, to at least point to RDF 1.1 Turtle instead of the long-deprecated Team Submission, and/or thoughts about Turtle-star needing (or not) a new media type?

Of course, the Turtle 1.1 spec does have an IANA Section "Internet Media Type, File Extension and Macintosh File Type". But, perhaps it was never actually sent to IANA? Probably not too late to send that in.

Given that the last media type I registered took just shy of six months, this might take a while and run afoul of a great idea or collective resignation (aka "consensus") we arrive at in the next few months. I'd be tempted to hold off on that update unless we think that the risk that we stumble across a better plan is low. (It's also possible that such an update would take a day. The bottleneck is the IANA expert reviewers.)

ericprud commented 3 years ago

Here's an LTS (let's update with new scenarios) list of relevant axes for a common HTTP GET scenario:

Emitter Use Case Axes:

[ ] (hasemb) requested content has embTriples
[ ] (seremb) can serialize embTriples
[ ] (recreqmt) recognizes requested embTriples media type
[ ] (recreqprof) recognizes requested embTriples profile
[ ] (altold) has alternative without embTriples
[ ] (altstar) has alternative with embTriples

link to e.g. #user-content-server-hasemb

Requester Use Case Axes:

[ ] (parseemb) can parse embTriples
[ ] (storeemb) can store embTriples (proxy for "Can the toolchain work with embTriples?")
- yes
  - [ ] (reqmt) knows to request embTriples media type
  - [ ] (reqprof) knows to request embTriples profile
  - [ ] (recrespmt) recognizes response embTriples media type
  - [ ] (recrespprof) recognizes response embTriples profile
- no
  - [ ] (reqmtold) knows to request media type for no embTriples
  - [ ] (reqprofold) knows to request profile for no embTriples
[ ] (altold) prefers alternative without embTriples
[ ] (altstar) prefers alternative with embTriples

link to e.g. #user-content-client-parseemb

Proxy behavior is unlikely to be an issue; we can always add what we need to a Vary header. PUT and PATCH are basically a forced GET with the roles of client and server reversed. In short, I think that GET axes on client and server should enable most of the necessary analysis.

exploration

This is a large search space; will iterate with updates to axes.

requester	emitter	client receives	how good is that?
reqmt	hasemb, seremb, recreqmt	requested embTriples payload	1.0
reqprof	hasemb, seremb, recreqprof	requested embTriples payload	1.0
reqmt	!recreqmt	406 Not Acceptable or requested old payload (depends on Accept)	1.0
reqprof	!recreqprof	requested old payload	.5 (less client control)
!reqmt (naive client)	recreqmt, altold	406 Not Acceptable or requested old payload (depends on Accept)	1.0
!reqprof (naive client)	recreqprof	requested old payload	.5 (less client control)
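
The rows above can be sketched as a small decision function in Python. This is only a sketch under stated assumptions: `STAR_TYPE` and `STAR_PROFILE` are placeholder names invented here (no such media type or profile URI has been agreed), the `Accept` header is assumed to be already parsed into `(media type, profile)` pairs in preference order, and the `Emitter` flags reuse the axis names from the lists above.

```python
from dataclasses import dataclass

# Hypothetical identifiers, for illustration only.
STAR_TYPE = "application/x-turtle-star"
STAR_PROFILE = "http://example.org/ns/rdf-star"
FLAT_TYPE = "text/turtle"


@dataclass
class Emitter:
    recreqmt: bool         # recognizes the embTriples media type
    recreqprof: bool       # recognizes the embTriples profile
    altstar: bool = True   # has an alternative with embTriples
    altold: bool = True    # has an alternative without embTriples


def negotiate(accept, emitter):
    """Return "star", "flat", or "406" for a parsed Accept list."""
    for media_type, profile in accept:
        if media_type == STAR_TYPE:
            if emitter.recreqmt and emitter.altstar:
                return "star"
            continue  # unknown or unservable type: try the next Accept entry
        if media_type == FLAT_TYPE:
            if profile == STAR_PROFILE and emitter.recreqprof and emitter.altstar:
                return "star"
            # An unrecognized profile parameter is ignored, so the client
            # silently gets the old payload (the ".5" rows).
            if emitter.altold:
                return "flat"
    return "406"
```

With a separate media type, a client that lists only `STAR_TYPE` gets an explicit 406 from an emitter that does not know it, and can opt into the fallback by also listing `text/turtle`. With a profile, the same client has no way to refuse the old payload, which is the "less client control" noted in the table.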

TODO (many, but here are a couple):

naive client issues traditional GET on doc with embTriples and old alternatives.

Should "old" be called "flat"? Anyways, here's where I've gotten so far.

pchampin commented 3 years ago

this was discussed during today's call https://w3c.github.io/rdf-star/Minutes/2021-05-21.html#t06

w3c / rdf-star

New mime types for RDF-star serializations (inc. SPARQL results) #43
