Open handrews opened 7 years ago
A few more details and alternate ideas:
"sorted": true means that the server controls the sorting, and attempts by the client to change the sorting will be ignored. This means that when adding to the array (within JSON data, not making separate POST requests to add collection elements), the client can just append and rely on the server to ensure the correct ordering.
"sorted": false means that the server has no awareness of ordering and will not take any action related to ordering. Such an array may not even have a stable ordering from request to request.
"sorted": "/json/pointer/to/field/in/each/array/item" means that the client can control the sorting by changing the value of that field in each item. The pointer root is the root of the item, not the complete instance. An empty pointer "" means that the entire item is the "sort key", so simply re-ordering the array is supported and sufficient. A pointer to a numeric field means that the server will re-order the array based on the value of that field in each item. While there are subtleties with this, it is a fairly common pattern.
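To make the pointer variant concrete, here is a minimal Python sketch of how a server might apply it. The function names `resolve_pointer` and `server_sort` are hypothetical, and the behavior for "sorted": true is deliberately left unimplemented because the proposal leaves it to the server.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON value."""
    if pointer == "":
        return document  # the empty pointer refers to the whole value
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape "~1" before "~0", as RFC 6901 requires.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def server_sort(items: list, sorted_keyword: Any) -> list:
    """Hypothetical server-side handling of the proposed "sorted" keyword."""
    if sorted_keyword is False or sorted_keyword == "":
        # false: the server takes no action; "": the client's order is kept.
        return list(items)
    if isinstance(sorted_keyword, str):
        # The pointer is resolved from the root of each item, not the instance.
        return sorted(items, key=lambda item: resolve_pointer(item, sorted_keyword))
    # "sorted": true means server-controlled ordering, which is application-defined.
    raise NotImplementedError("server-controlled ordering is application-defined")
```

With this sketch, a client that changes the pointed-to field in an item changes where the server places that item, while an empty pointer leaves the submitted order alone.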
I'm not entirely sold on my own JSON Pointer idea. The true value could mean that the server controls the sorting but that the application can document some way for the client to influence it (hand-wavy, because the point is to give the application flexibility). Anyway, just some thoughts. I don't feel strongly about any of this.
Having thought about this more, I think I'm pretty solidly against the JSON Pointer idea. If you can define a single key, then sooner or later people want to define compound keys, and things get complicated very quickly. I feel like that is better handled at the application layer. The schema should at most indicate that sorting is significant.
Whether the boolean definitions I gave above are ideal or not I'm still not sure. "sorted": true should perhaps just mean that the array is sorted when read from the document, and not that the data authority (server in hypermedia, database in some local storage arrangements, etc.) will automatically re-sort or enforce the sorting. Hmm... there are basically three options for true:
It's worth considering how this would be used in hypermedia: for a collection with complex sorting options, would this add value? How would any of the above options work in such a case? What does "properly sorted" mean if there are multiple ways to sort it? How does one indicate an array that is often sorted but can possibly be requested unsorted (e.g. for performance reasons)? Is that a case we care about?
As is often true, this simple concept is more complex to specify than it initially might appear.
I think ordered is better understood than sorted. To me ordered can be any arbitrary but set sequence, while sorted implies there is some logic behind it. As an implementation, I would want to know what that logic is if I am to add to the list.
That said, what rules could there be around modification of arrays?
Given that an array is ordered/sorted, there are two modifications (directly to the array, not necessarily to the items within it) that can be made which maintain that state: addition and removal of items. It seems removal is trivial as it doesn't change the sequence of the existing items.
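As a small illustration of those two modifications, here is a Python sketch. It assumes the simplest possible case, an ascending sort of directly comparable values, which is not something the thread specifies; the helper names are hypothetical.

```python
import bisect


def add_sorted(items: list, value) -> list:
    """Return a copy of an ascending-sorted list with value inserted in place."""
    result = list(items)
    bisect.insort(result, value)  # binary search for the insertion point
    return result


def remove_item(items: list, value) -> list:
    """Return a copy with the first occurrence of value removed."""
    result = list(items)
    result.remove(value)  # the remaining items keep their relative order
    return result
```

Removal needs no knowledge of the sort logic, whereas addition only works here because the sort logic (ascending, natural comparison) is known in advance.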
As of draft-07, we have readOnly to specify whether a client can add items. If we use ordered, then can I add items at both ends, or only at one of them? If using sorted, it seems that some manner of algorithm specification, which would be remarkably complex, would be required.
Given that JSON natively defines an array as ordered, maybe it would be easier to use "unordered": true to explicitly specify an unordered set. This would also allow the default to be (more understandably) false.
@gregsdennis I'm not a fan of double negatives (e.g. "this array is not unordered")
Perhaps an enum would be in order:
"ordering": "set"
"ordering": "vector"
and so on
@handrews while I agree, I tend to err on the side of default values ("false" being default for boolean in the vast majority of languages). While we can state that the absence of the keyword implies a value of true, having that value contradict the default value of the value's type seems confusing.
While we can state that the absence of the keyword implies a value of true, having that value contradict the default value of the value's type seems confusing.
Take the following with a grain of salt, as I'm being even more opinionated than usual and none of it has anything to do with actual spec requirements.
I've encountered this view before, but only from people who primarily write in strongly typed languages. It doesn't seem like a problem at all to me: the keyword being absent is represented by undefined (JavaScript), None (Python), etc. In C++ (which I used to work in 15+ years ago) I would probably represent schema keywords with pointers so that undefined keywords can be set to null. This preserves the distinction between being absent and happening to have the default value (although there are other solutions).
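In Python, for example, the absent case falls out of a plain dictionary lookup. This is only a sketch: the keyword name "ordered" is taken from this thread, and `describe_ordering` is a hypothetical helper.

```python
def describe_ordering(schema: dict) -> str:
    """Report what a schema says about ordering, keeping 'absent' distinct."""
    value = schema.get("ordered")  # None when the keyword is absent
    if value is None:
        return "unknown"
    return "ordered" if value else "unordered"
```

Note that this sketch cannot tell an absent keyword from one explicitly set to JSON null, since both come back as None.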
Of course none of this is a requirement of the spec. Making a distinction between a keyword being missing and having a value corresponding to JSON null can be tricky, but it is solvable. I probably would avoid defining a keyword that can be null but defaults to something else. Then again, I try to avoid null in JSON anyway (and no JSON Schema keyword takes null as a value except const, and I'd probably handle it specially rather than allow it to complicate other keywords).
Anyway, I tend to be suspicious of any argument along the lines of "my language/library works better if you do X" in a supposedly language-neutral environment. This may have something to do with how other people have attempted to use this argument in other projects unrelated to JSON Schema :-P
Reminder: the above is my personal opinion, not entirely rational, and not actually required by the JSON Schema spec :-)
Setting "ordered": false to override the default behavior of JSON arrays seems fine to me.
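One place this would matter in practice is comparing two arrays. The Python sketch below assumes that an absent keyword means ordered (the default behavior of JSON arrays mentioned above) and that unordered arrays are compared as multisets; `arrays_equivalent` is a hypothetical helper, not part of any implementation.

```python
import json


def arrays_equivalent(a: list, b: list, ordered: bool = True) -> bool:
    """Compare two JSON arrays, ignoring position when ordered is False."""
    if ordered:
        return a == b
    # Serialize each item canonically so that objects with the same members
    # compare equal, then compare the two collections regardless of position.
    def canonical(items: list) -> list:
        return sorted(json.dumps(item, sort_keys=True) for item in items)
    return canonical(a) == canonical(b)
```

For instance, the array form of type would treat ["string", "null"] and ["null", "string"] as the same value under "ordered": false.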
@handrews why "order": true rather than "order": "asc"/"desc"? Unless you want "sorted": true, "sortOrder": "desc"...
I don't see much damage with supporting "orderBy": <json-pointer>/<array of json-pointers for compound keys> as well.
@handrews also, is "order": false equivalent to "not": {"order": true}?
That is implied by the absence of a default value. Because if false means simply the absence of "order": true, then false should be the default.
Also, would "order": true mean "anyOf": [{"order": "asc"}, {"order": "desc"}]? Is such a construct ever (or at least often) needed, by the way?
is "order": false equivalent to "not": {"order": true}?
Since order as a boolean would be an annotation (it tells you that the data is ordered, but does not give enough information to validate it), it is only collected and propagated upwards if the instance is valid against the schema. So {"not": {..., "order": true}} will never contribute an annotation: either the instance is invalid against the inner schema, in which case order is not collected in the first place, or it is valid against the inner schema, in which case the not ensures that it is invalid against the outer schema, so all annotations will be dropped at that level.
You can't really negate annotations, which makes sense: they're not a boolean outcome, even if they happen to have boolean values. They add information, and the opposite of adding information is omitting information. (This bit probably deserves clarification in the spec.)
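The behavior of annotations under not can be shown with a toy evaluator in Python. This is a sketch only: it understands just "type": "array", "not", and the proposed "order" annotation, and the `evaluate` function is hypothetical rather than how any real implementation is structured.

```python
from typing import Any, Optional


def evaluate(schema: dict, instance: Any) -> Optional[dict]:
    """Return the collected annotations if the instance is valid, else None."""
    annotations: dict = {}
    if schema.get("type") == "array" and not isinstance(instance, list):
        return None  # invalid: nothing is collected at this level
    if "not" in schema:
        if evaluate(schema["not"], instance) is not None:
            return None  # inner schema passed, so "not" fails and all is dropped
        # Inner schema failed: its annotations were never collected.
    if "order" in schema:
        annotations["order"] = schema["order"]
    return annotations
```

Whichever way the inner schema evaluates, the "order" inside the not never reaches the result: the outer schema either fails outright or succeeds with no annotation from the subschema.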
As for orderBy, that would also have to be considered an annotation, as not all data types have a clear ordering.
Honestly, I think this idea is more trouble than it's worth at this point, at least as a general annotation.
I can see it being a part of something specific like a code generation vocabulary, where it means "use an ordered data type". Or UI generation, where it might indicate that a sorting interface is possible. But I think that those use cases end up being a bit different.
@handrews I think that the root of this issue could be resolved by json-schema-org/json-schema-spec#518, since it was this case that originally prompted my question about ordering arrays.
@gregsdennis Good to know. I still have my doubts about json-schema-org/json-schema-spec#518 but definitely not shooting it down at this point.
I think there's a bit of a "slippery slope" argument to be made here. Normally I hate that argument, but looking at the discussion of ordered so far, the simple form (a boolean) has very limited use. We've had two proposals for different directions already: "asc/desc" and using a pointer for ordered-by. But ordering is often more complicated than easily expressed in such a way.
I feel like this would end up being a not-very-useful feature that would leave people frustrated with its limitations and demanding a much more elaborate system. Which is unlikely to make everyone happy because of the breadth of possible ordering approaches.
So I think this is best left to applications rather than schema.
My current inclination is to close this and, if we find a really compelling proposal, re-open it with that. But I'll leave this issue open at least until folks have a chance to get back from vacation after new year's and catch up.
@handrews
My current inclination is to close this
I agree, but I'm also going to make a note in json-schema-org/json-schema-spec#518 about sequencing the values.
Moving out to draft-future along with the unique key proposal which has some similarities. We won't get to either in draft-08 given the current progress and focus.
Moving this to the vocabularies repo.
I have defined this in the latest release of my Array Extensions Vocabulary. An implementation is available in .NET, and you can play with it on https://json-everything.net/json-schema.
The idea of a way to indicate whether array order is significant or not (basically, is this a list or a set) has been suggested numerous times. We would have use for it in the meta-schema for both the array form of type and presumably for enum, which would reduce the confusion related to json-schema-org/json-schema-spec#474.
"ordered": true would indicate an ordered list, but say nothing about how or why it is ordered.
"ordered": false would indicate an unordered set.
If "ordered" is not present, it is not known whether the array is ordered or not, and neither should be assumed. This preserves backwards compatibility, and prevents implementations from improperly handling schemas that are written with less precision.
This would be an annotation, not an assertion. Implementations MAY offer validation algorithms for common ordered cases (ascending/descending numeric order for an array of numbers, for instance), but these MUST NOT be run automatically. The mechanism for turning such algorithms on or off is implementation-dependent (same as for format and the content* keywords).
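An opt-in check of the kind described could look like the following Python sketch. The function name, the `direction` values, and the `enabled` switch are all assumptions made for illustration; the proposal leaves the on/off mechanism to each implementation.

```python
from typing import Optional, Sequence


def check_numeric_order(
    values: Sequence[float], direction: str = "asc", enabled: bool = False
) -> Optional[bool]:
    """Optionally check that numbers are in ascending or descending order.

    Returns None when the check is disabled, so it never runs automatically.
    """
    if not enabled:
        return None
    pairs = list(zip(values, values[1:]))  # each item with its successor
    if direction == "asc":
        return all(a <= b for a, b in pairs)
    if direction == "desc":
        return all(a >= b for a, b in pairs)
    raise ValueError(f"unknown direction: {direction!r}")
```

Returning None rather than True when disabled keeps "not checked" distinct from "checked and passed", in the same spirit as leaving an absent "ordered" keyword unknown.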