w3c / rdf-concepts

https://w3c.github.io/rdf-concepts/

rdf:JSON value space too liberal #87

Closed pfps closed 1 week ago

pfps commented 5 months ago

The value space contains elements like the list that is its only member. There is nothing wrong with this, but I don't believe that such values are part of JSON, and they certainly cause problems in serialization. If these values are not wanted, a datatypes expert should be consulted to provide the best wording for excluding them.

My wording change would be to add "the smallest set containing" to the definition.

gkellogg commented 5 months ago

Can you elaborate on this, perhaps with some JSON examples that illustrate the problem?

Regarding "value space contains elements like the list that is its only member", if you're saying that a value can be a list with no members (e.g., []), then this is valid JSON that is intended to be representable using this datatype.

> My wording change would be to add "the smallest set containing" to the definition.

Where do you think such a statement should go?

pfps commented 5 months ago

In most programming languages data structures can be self-referential. So the list that has itself as its only member is fine in these languages, e.g., LISP. The list that has itself as its only member is not the empty list. Nor is it the list that has one member that is a list that has one member that is a list that has one member that is a list etc. But all three of these are in the definition.

To remove these non-well-founded elements a common practice in some areas is to say something like "The value space is the smallest set containing strings, ...." There are many other ways of saying the same thing, and I do not know if this wording is currently acceptable in programming language or data structure specifications.
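The distinction between the empty list and the list that has itself as its only member can be illustrated with a minimal Python sketch. Python stands in here for any language with self-referential data structures, and the helper names `is_well_founded` and `can_serialize` are hypothetical, introduced only for this illustration. The third case, an infinitely deep nesting of distinct lists, cannot be built in finite memory and is not shown.

```python
import json


def is_well_founded(value, _path=None):
    """Return True if no list or dict contains itself, directly or indirectly."""
    path = set() if _path is None else _path
    if isinstance(value, (list, dict)):
        if id(value) in path:
            # We reached a container that is already on the current path: a cycle.
            return False
        path.add(id(value))
        children = value.values() if isinstance(value, dict) else value
        ok = all(is_well_founded(child, path) for child in children)
        path.discard(id(value))
        return ok
    # Strings, numbers, booleans, and None are always well-founded.
    return True


def can_serialize(value):
    """Return True if the standard json module can write the value out."""
    try:
        json.dumps(value)
        return True
    except ValueError:
        # json.dumps raises ValueError when it detects a circular reference.
        return False


empty = []          # the empty list: ordinary JSON, written as []
nested = [[[]]]     # finitely nested lists: also ordinary JSON

loop = []
loop.append(loop)   # the list that has itself as its only member
```

Here `empty` and `nested` pass both checks, while `loop` is neither well-founded nor serializable. A "smallest set containing" definition would admit only values for which a check like `is_well_founded` succeeds.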

gkellogg commented 5 months ago

The notion of self-inclusion is similar to what we describe for triple terms:

> The definition of triple term is recursive. That is, a triple term can itself have an object component which is another triple term. However, by this definition, cycles of triple terms cannot be created.

I think the concern you're expressing about lists having themselves as a member is similar. In both cases, the grammars do not provide a way to create lists that have themselves as entries, either directly or recursively. Same for maps.

pfps commented 5 months ago

Except that triple terms are grammar elements and thus can be said to be created. rdf:JSON values just exist.

IS4Code commented 3 months ago

I don't see any practical issue there. You can objectively conclude from the lexical-to-value mapping that such anomalous values could never be encoded using rdf:JSON. A request to decode such a value using the lexical space of rdf:JSON fails. A request to encode such a value using the lexical space of rdf:JSON fails too. One way or another, you can't use rdf:JSON as a datatype for such values.

Is there merit in having the theoretical value space be bigger than what can be encoded? Perhaps:

:p rdfs:range rdf:JSON .

This means that such a property accepts not only traditional JSON strings, but potentially also other extensions that map to the value space of rdf:JSON (just through the lexical-to-value mapping of another datatype). By the way, self-referentiality is not the only issue: xsd:double has negativeZero, positiveInfinity, negativeInfinity, and notANumber, none of which are representable in JSON (but there are JSON extensions that support them).
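As a rough illustration of such extensions, Python's standard `json` module (used here only as an example of one tool's behaviour, not as part of any RDF specification) writes and reads the non-standard tokens `Infinity`, `-Infinity`, and `NaN` by default, and rejects those values when asked to stay strictly JSON-compliant. The helper `strict_dumps` is a hypothetical name for this sketch.

```python
import json
import math

special = [float("inf"), float("-inf"), float("nan")]

# Default behaviour: a lenient extension of JSON that emits non-standard tokens.
lenient = [json.dumps(x) for x in special]


def strict_dumps(x):
    """Serialize with allow_nan=False; return None if the value is rejected."""
    try:
        return json.dumps(x, allow_nan=False)
    except ValueError:
        # Strict mode refuses values that have no representation in standard JSON.
        return None


strict = [strict_dumps(x) for x in special]

# The lenient parser also accepts the extended token on input.
round_trip = json.loads("NaN")
```

In this sketch `lenient` holds the three extended tokens, every entry of `strict` is `None`, and `round_trip` is a floating-point NaN. The extended lexical forms map to values that the standard JSON grammar cannot express.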

There are also datatypes whose lexical space is empty, such as owl:real. Such theoretical value spaces are certainly useful, precisely because other datatypes can define lexical spaces that map into them even though the original datatype's lexical space cannot reach those values.