w3c / rdf-concepts

https://w3c.github.io/rdf-concepts/

Add the rdf:JSON datatype #14

Closed domel closed 2 months ago

domel commented 1 year ago

The rdf:JSON datatype is originally defined in Section 10.2 The rdf:JSON Datatype in JSON-LD 1.1. This issue is to determine whether to promote that definition to RDF Concepts (and RDF Schema), which would bring it in line with the definition of other datatypes in the RDF namespace.


Original text: I think that we should "legalize" rdf:JSON rdf:type rdfs:Datatype . in the same manner as rdf:HTML.

See also: https://www.w3.org/TR/json-ld11/#the-rdf-json-datatype

pfps commented 1 year ago

If there is going to be some official status for rdf:JSON then there should also be the same status for rdf:YAML.

gkellogg commented 1 year ago

I don’t think a canonical form of YAML has been specified, as there has for JSON.

pfps commented 1 year ago

Where does canonical form fit in here?

domel commented 1 year ago

I believe it's about defining lexical space, value space, and lexical-to-value mapping.

domel commented 1 year ago

I'm not against rdf:YAML, but rdf:JSON has already been included in RDF Schema (only information in the specification is missing). We can create a separate issue for YAML.

gkellogg commented 1 year ago

As defined in JSON-LD, it becomes an issue for the L2V (lexical-to-value) mapping. JCS had not reached a normatively citable state at the time of production, but due to variations in representation it is important for determining equivalent values. YAML has much more potential for variation, but has no similar scheme for normalizing representations.
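To illustrate why a normalized representation matters for comparing values, here is a minimal Python sketch. It only approximates canonicalization by sorting keys and removing insignificant whitespace; it is not an implementation of JCS (RFC 8785), which also fixes number serialization and key ordering rules that this sketch does not reproduce. The function name `approx_canonical_json` is hypothetical.

```python
import json


def approx_canonical_json(text: str) -> str:
    """Parse JSON text and re-serialize it with sorted keys and no extra whitespace.

    Assumption: this is only an approximation of a canonical form. It does not
    follow the number serialization or key ordering rules of JCS (RFC 8785).
    """
    value = json.loads(text)
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Two lexical forms that differ in key order and whitespace...
a = '{"b": 1, "a": [true, null]}'
b = '{ "a":[true,null],\n  "b":1 }'

# ...normalize to the same string, so they can be compared as text.
print(approx_canonical_json(a))
print(approx_canonical_json(a) == approx_canonical_json(b))
```

Without some agreed normalization of this kind, two serializations of the same JSON value cannot be compared by simple string equality.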

pfps commented 1 year ago

I don't see rdf:JSON in https://www.w3.org/TR/rdf-schema/

pfps commented 1 year ago

So maybe the datatype defined in JSON-LD needs a string canonical form. But that's only an issue for that datatype. Moreover, having the canonical form being a string when the meaning of the value is something else is not a good idea.

gkellogg commented 1 year ago

I don't see rdf:JSON in https://www.w3.org/TR/rdf-schema/

It's defined in JSON-LD 1.1, as that group did not/could not touch RDF Schema, however, it was added to the RDFS version: https://www.w3.org/1999/02/22-rdf-syntax-ns#JSON. The presumption was that the next group chartered with updating the core RDF documents would include it.

afs commented 1 year ago

Does it need a canonical mapping?

rdf:XMLLiteral has one because it always has (and is quite unexpected by users FWIW).

pfps commented 1 year ago

The presumption was indeed a presumption. I will vote against including rdf:JSON in the output of this working group.

domel commented 1 year ago

I don't see rdf:JSON in https://www.w3.org/TR/rdf-schema/

https://www.w3.org/1999/02/22-rdf-syntax-ns#

pfps commented 1 year ago

RDF datatypes need a value space and a mapping from unicode strings to that value space. This is needed to determine equality and is used in RDF entailment and (should be used) by any SPARQL processor that implements the datatype.
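As a rough illustration of that requirement, the following Python sketch models a datatype as an IRI plus a lexical-to-value mapping, and decides equality on the mapped values rather than on the strings. The `Datatype` class, `same_value`, and `parse_xsd_integer` are hypothetical names, and using Python's `json.loads` as a stand-in mapping for rdf:JSON is an assumption made only for this example, not the definition from JSON-LD 1.1.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Datatype:
    """A datatype sketched as an IRI plus a lexical-to-value (L2V) mapping."""
    iri: str
    # Maps a lexical form to a value; raises ValueError outside the lexical space.
    lexical_to_value: Callable[[str], Any]

    def same_value(self, lex1: str, lex2: str) -> bool:
        # Equality is decided in the value space, not on the strings themselves.
        return self.lexical_to_value(lex1) == self.lexical_to_value(lex2)


def parse_xsd_integer(lexical_form: str) -> int:
    # Restrict to an optional sign followed by decimal digits.
    if not re.fullmatch(r"[+-]?[0-9]+", lexical_form):
        raise ValueError(f"not an integer lexical form: {lexical_form!r}")
    return int(lexical_form)


XSD_INTEGER = Datatype("http://www.w3.org/2001/XMLSchema#integer", parse_xsd_integer)

# Assumption: Python's parsed objects stand in for a JSON value space.
# Caveat: Python's == treats 1, 1.0 and True as equal, which a real
# definition of the value space would have to address explicitly.
JSON_SKETCH = Datatype("http://www.w3.org/1999/02/22-rdf-syntax-ns#JSON", json.loads)

print(XSD_INTEGER.same_value("01", "+1"))
print(JSON_SKETCH.same_value('{"a":1,"b":2}', '{ "b": 2, "a": 1 }'))
```

The caveat in the comments is the practical point: until the value space is pinned down, an implementation has to guess what "equal" means.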

pfps commented 1 year ago

The definitions of RDF and RDF Schema are in their defining documents, not elsewhere.

gkellogg commented 1 year ago

I'd suggest adding an issue marker in the spec. The advice at the time was that JSON-LD could update the RDF namespace, as it didn't make sense to tie the JSON datatype specifically to JSON-LD. Most of the discussion, including references to meeting minutes, can be found in https://github.com/w3c/json-ld-syntax/issues/4.

Objecting to adding such a datatype to RDF Schema seems rather arbitrary given actual deployment and dependence in the community. Perhaps suggesting what would need to be done to make it acceptable to add rdf:JSON (and potentially rdf:YAML) to RDF 1.2 Concepts and Schema would be a way forward.

The datatype is defined in a recommendation already. Perhaps the value space of JSON could be re-defined in terms of Infra types, as is the JSON-LD Internal Representation. YAML could potentially have a value space based on its Representation Graph.

TallTed commented 1 year ago

I would suggest we treat rdf:YAML, rdf:YAML-LD, and similar things as rdf:HTML was treated in RDF 1.1 — as non-normatively specified but recognized as useful things, with links to relevant developing but not yet sufficiently finalized external specifications ... with rdf:HTML, rdf:JSON, rdf:JSON-LD, and others with now-sufficiently solidified specifications to be fully cited, etc.

@pfps — I submit that your "I will vote against including rdf:JSON in the output of this working group." is a premature declaration, and unhelpful in that it may discourage others doing relevant work which would change your mind, were it known to remain open. Given that this WG is nowhere near final results, there is every possibility that additional information and/or such work will change your decision. I ask that you hold off on such global declarations for at least the next 6-12 months, if not longer.

gkellogg commented 1 year ago

Just to save looking it up later, the decision to use the RDF namespace for the JSON datatype happened at the 2019-03-22 JSON-LD WG meeting:

Ivan Herman: I have spent time on this issue with others … aside from the canonicalization problem … if we do make a native JSON type, we will have to put it into some namespace–rdf: or jsonld:

Rob Sanderson: +1 to RDF namespace

Ivan Herman: if we do that, we’ll have to write the SWIG mailing list, to announce the new datatype, etc. … we can do this as part of our document

The mail to SWIG was sent on 2019-12-16.

Note that it also added rdf:CompoundLiteral, rdf:language, and rdf:direction.

timothee-haudebourg commented 1 year ago

On one hand I find it already odd that rdf:HTML and rdf:XMLLiteral exist, this does not look like a core data-modelling vocabulary to me, but rather an application of this vocabulary that should be defined in its own HTML/XML vocabulary. On the other hand, this is a precedent and now that it exists, why not rdf:JSON? But still, is this a vocation of the RDF Schema to include a list of all the LD-compatible formats we can think of? What about rdf:CBOR for instance? The RDF Schema specification would need to be updated every time a new format is adopted (or is updated?). RDF Schema would be dependent on each of those format definitions where I think it should be the other way around.

domel commented 1 year ago

But still, is this a vocation of the RDF Schema to include a list of all the LD-compatible formats we can think of? What about rdf:CBOR for instance? The RDF Schema specification would need to be updated every time a new format is adopted (or is updated?). RDF Schema would be dependent on each of those format definitions where I think it should be the other way around.

You can always define other datatypes in different vocabs, e.g., yourPrefix:CBOR. The real life example is Custom Datatypes. rdf:JSON is slightly different because it was already defined in rdf namespace.

timothee-haudebourg commented 1 year ago

As @TallTed mentioned, I just noticed rdf:HTML and rdf:XMLLiteral are non-normatively specified. If rdf:JSON is to be included I agree it should be this way. But then I'm not sure I understand the point and the implications of having a non-normative term definition?

By the way normative and non-normative classes are defined in the same section in the RDF Schema document which I find confusing, maybe it would be better to have a "Non-Normative Classes" section.

timothee-haudebourg commented 1 year ago

You can always define other datatypes in different vocabs

Exactly, my point is that I would currently prefer HTML, XMLLiteral, JSON and the likes to be defined in their own different vocabularies to avoid adding unnecessary dependencies to RDF Schema. It's a little too late for rdf:HTML and rdf:XMLLiteral though :smile:

afs commented 1 year ago

I think they are not-normative because these datatypes are not essential to the RDF data model.

RDF XML Literals used to be normative (RDF 1.0) ... but that caused problems because they can lead to contradictions via bad literals.

Text defining the datatypes doesn't belong in schema either.

The datatype URI names are already in https://www.w3.org/1999/02/22-rdf-syntax-ns.

Why not have a separate document for these datatypes?

This could be thought of as an appendix to concepts (concepts says "there are some useful datatypes at X"), but too large to be an inline appendix.

Then RDF concepts is kept to the point and the datatype material can be as long as it likes and further datatypes can be added ("living standard").

TallTed commented 1 year ago

@afs —

I think they are not-normative because these datatypes are not essential to the RDF data model.

I find it unhelpful to work from baseless postulations like the above, when spending a minute researching this question provides a completely different reason, which should render all conjecture based on that postulation null and void, or at least require that such conjecture be reconsidered and reformulated based on the visible fact.

RDF 1.1 Concepts and Abstract Syntax explicitly states the reason why rdf:HTML is non-normative:

RDF 1.1 Concepts and Abstract Syntax also explicitly states exactly the same reason why rdf:XMLLiteral is non-normative.

To wit (slightly edited to speak of both datatypes in one quote block) —

[rdf:HTML and rdf:XMLLiteral are] defined as non-normative because [they depend] on [DOM4], a specification that has not yet reached W3C Recommendation status.

— and the reference, accurate as of 2014-02-25, the publication date of RDF 1.1 Concepts and Abstract Syntax

[DOM4] Anne van Kesteren; Aryeh Gregor; Ms2ger; Alex Russell; Robin Berjon. W3C DOM4. 4 February 2014. W3C Last Call Working Draft. URL: http://www.w3.org/TR/dom/

afs commented 1 year ago

@TallTed - "baseless" is unnecessary. Less of the personal please.

I think they are not-normative because these datatypes are not essential to the RDF data model.

To repeat: The data model works without them.

Hence: Whether concepts defines them, defines them normative or non-normative, is a choice, not a requirement. Concepts abstract does not mention them.

Why not have a separate document for these datatypes?

which provides a way for a living standard to evolve.

TallTed commented 1 year ago

I do not think this is completed, nor do I think we have consensus within the WG on how these should be handled.

That said, I think it would be fine to clearly define the extension point which leads to rdf:HTML, rdf:JSON, rdf:XMLLiteral, etc., and either move each of these to their own document or move all of them to a single document.

...a separate document for these datatypes...

...provides a way for a living standard to evolve.

There are many living standards which evolve within a single document, so I do not think that this is a strong argument for (a) separate document(s) for these datatypes, but as I said above, I don't have a strong argument against it — which decision, either way, I think falls to the WG as a whole, should be considered via an issue, and whatever decision(s) are made via that issue should then be implemented in the document(s) via one or more PRs.

afs commented 1 year ago

My bad. I misclicked on "close" not "update".

afs commented 1 year ago

RDF XML Literals used to be normative (RDF 1.0) ... but that caused problems because they can lead to contradictions via bad literals.

In RDF 1.0, there was a way to handle ill-formed literals. The only source of ill-formed literals was rdf:XMLLiteral because it was the only normative datatype.

https://www.w3.org/TR/rdf-mt/#RDFINTERP

In RDF 1.1, there is no machinery for ill-formed literals (they aren't even mentioned anymore)

https://www.w3.org/TR/rdf11-mt/#rdf-interpretations

Some discussion around: https://lists.w3.org/Archives/Public/public-rdf-wg/2012Nov/0155.html
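To make "ill-formed literal" concrete, here is a minimal Python sketch that checks whether a lexical form parses as XML content when wrapped in an arbitrary element. The function name `is_well_formed_xml_content` and the `wrapper` element are hypothetical, and the check is only a rough stand-in for the lexical space of rdf:XMLLiteral; it does not implement the semantics machinery from either RDF 1.0 or RDF 1.1.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml_content(lexical_form: str) -> bool:
    """Return True if the string parses as XML content inside a wrapper element.

    Assumption: wrapping in <wrapper>...</wrapper> approximates the lexical
    space check; namespace handling and other details are not covered.
    """
    try:
        ET.fromstring(f"<wrapper>{lexical_form}</wrapper>")
    except ET.ParseError:
        return False
    return True


print(is_well_formed_xml_content("<b>bold</b> text"))  # balanced markup
print(is_well_formed_xml_content("<b>unclosed"))       # missing end tag
```

A literal that fails a check like this has no value under the datatype, which is the case the RDF 1.0 semantics had to account for explicitly.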

awwright commented 1 year ago

I'm not sure why it is that XML, HTML, and JSON should be singled out for getting datatypes in RDF. Is there a reason for limiting it to those three, instead of all or nothing?

First, why have these at all? Does this fit into the purpose of having RDF datatypes? For example, why can't HTML, XML, and JSON just be xsd:strings?

And second, if some media types need to be RDF datatypes, why not all of them? Why not just have a way to name any arbitrary media type? e.g. what if I want to encode a png image? In particular, https://www.w3.org/ns/iana/media-types/ ?

gkellogg commented 1 year ago

I believe the datatypes stem from them being native formats for different RDF serializations. In RDF/XML, rdf:XMLLiteral values can be expressed in markup, same with rdf:HTML in RDFa and rdf:JSON in JSON-LD. When used in other formats, they are expressed as strings. There was some discussion if we needed an rdf:YAML datatype for a potential YAML-LD format.
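The native-versus-string distinction can be sketched in Python as follows. The JSON-LD form uses the `@json` type keyword, while the helper `ntriples_literal` (a hypothetical name) writes the same value as a quoted, escaped string with a datatype IRI. The escaping covers only backslash, double quote, newline, and carriage return, so treat it as an illustration rather than a complete N-Triples serializer.

```python
import json

RDF_JSON = "http://www.w3.org/1999/02/22-rdf-syntax-ns#JSON"


def ntriples_literal(lexical_form: str, datatype_iri: str) -> str:
    """Write a typed literal in N-Triples syntax.

    Assumption: only the most common string escapes are handled here.
    """
    escaped = (
        lexical_form.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"^^<{datatype_iri}>'


value = {"a": 1}

# In JSON-LD the value stays native JSON, marked with the @json type.
jsonld_value = {"@value": value, "@type": "@json"}

# In a string-based syntax, the value becomes an escaped lexical form.
lexical = json.dumps(value, separators=(",", ":"))
print(ntriples_literal(lexical, RDF_JSON))
```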

domel commented 1 year ago

From F2F meeting

proposal: move the datatypes definitions (rdf:HTML, rdf:XMLLiteral) to an appendix in RDF concepts, and add rdf:JSON into that. Then update any reference to rdf:JSON

hartig commented 1 year ago

From F2F meeting

proposal: move the datatypes definitions (rdf:HTML, rdf:XMLLiteral) to an appendix in RDF concepts, and add rdf:JSON into that. Then update any reference to rdf:JSON

Implemented in PR #62