I'd personally prefer requiring the more expressive one, JSON Patch, by default (and either also requiring JSON Merge Patch or making it optional). Even though JSON Patch is more cumbersome, null is, imo, not that obscure of a feature to forgo if support is being added for one or the other.
You'll have to clarify what you mean by "And again", because this is an entirely new issue in the tracker. Can you provide some use cases for the problems this solves?
@Acubed I have mentioned it countless times on the forums, but anyway, here it goes again...
The schema at uri1 reads:
{
  "description": "base schema",
  "type": "object",
  "properties": { "p1": { "type": "string" } },
  "additionalProperties": false
}
The schema at uri2 wants to extend that schema so that p1 has a minimum length of 4, and to add a property p2 of type integer:
{
  "$patch": {
    "source": { "$ref": "uri1" },
    "with": [
      { "op": "add", "path": "/properties/p1/minLength", "value": 4 },
      { "op": "add", "path": "/properties/p2", "value": { "type": "integer" } }
    ]
  }
}
That is with JSON Patch.
With JSON Merge Patch that would be:
{
  "$merge": {
    "source": { "$ref": "uri1" },
    "with": {
      "properties": {
        "p1": { "minLength": 4 },
        "p2": { "type": "integer" }
      }
    }
  }
}
This means there is no need for hacks such as "limiting only to defined properties", etc.
Unambiguous, and it Just Works(tm).
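For reference, applying either patch above to the base schema at uri1 yields the same resolved schema (per standard RFC 6902 and RFC 7396 semantics):

{
  "description": "base schema",
  "type": "object",
  "properties": {
    "p1": { "type": "string", "minLength": 4 },
    "p2": { "type": "integer" }
  },
  "additionalProperties": false
}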
And yes, I said "again", since I have been proposing this idea for more than a year.
@fge Please understand it needs to get posted here, too, for the record. Thanks!
Note that with JSON Merge Patch, null is used to indicate deletion of a value, but there is no mechanism in that specification for adding the value null. So while JSON Merge Patch is a nice option to have around, since it is more compact and easier to see at a glance what the changes are, it has this deficiency regarding null values.
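The whole of RFC 7396 boils down to a few lines; a minimal sketch in TypeScript (the Json type alias is our own, not from any library) makes that limitation obvious:

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Minimal sketch of the RFC 7396 merge algorithm.
function mergePatch(target: Json, patch: Json): Json {
  // A non-object patch (including arrays and null) replaces the target wholesale.
  if (typeof patch !== "object" || patch === null || Array.isArray(patch)) {
    return patch;
  }
  const result: { [key: string]: Json } =
    typeof target === "object" && target !== null && !Array.isArray(target)
      ? { ...target }
      : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) {
      delete result[key]; // null always means "remove"; there is no spelling for "set to null"
    } else {
      result[key] = mergePatch(key in result ? result[key] : null, value);
    }
  }
  return result;
}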
@brettz9 yes, I am fully aware of it; however, no JSON Schema keyword currently allows null as a value, so this remains a viable option.
Maybe this will not be the case in v5 though.
"type": "null"
?
@epoberezkin that is the JSON string "null", not JSON null.
Ah. I now understand what you mean - there is no null in JSON Schema itself indeed.
@fge "enum" does permit null, doesn't it?
null can be used in enums, but it is OK for $merge, as it replaces the whole array. Only constant will be an issue, but it's an edge case really.
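To illustrate (the enum values are invented): because a merge patch value that is an array replaces the target array wholesale, a null inside it survives as an ordinary element rather than triggering a deletion:

{
  "$merge": {
    "source": { "$ref": "uri1" },
    "with": { "enum": ["a", "b", null] }
  }
}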
@fge, just to clarify. The result of both $merge and $patch is the patched schema, there is no assumption that the original schema gets modified, correct?
So I can create schemas like this:
{
  "id": "my-meta-schema.json",
  "$merge": {
    "source": "http://json-schema.org/draft-04/schema#",
    "with": {
      "properties": {
        "myCustomKeyword": { "type": "string" }
      }
    }
  }
}
I agree with @fge that $merge covers literally all real use cases. It doesn't allow adding nulls or adding/removing individual array items, but I think we can live without that. Drop $patch, maybe?
@epoberezkin yes, the result is the patched schema; the original schema is not touched in any way, shape or form.
Of course, this means requiring a preprocessing of some sort, but then so does $ref, and it still retains priority.
With the current standard it's relatively easy to extend a schema that is not recursive without merge/patch; allOf does a good enough job.
But for recursive schemas this approach simply doesn't work, so there is no mechanism to extend meta-schemas or any other recursive schemas, which is a shame... So $merge is really important.
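To sketch the recursion problem (the "tree" schema here is invented for illustration): extending this schema via allOf constrains only the root, because the inner "$ref": "#" still resolves against the original, unextended document, whereas a $merge produces a new schema whose "$ref": "#" points at the patched whole:

{
  "id": "tree",
  "type": "object",
  "properties": {
    "value": { "type": "string" },
    "children": { "type": "array", "items": { "$ref": "#" } }
  }
}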
@erosb only as a member in an array.
But then JSON Merge Patch doesn't allow modifications of array elements individually. If you specify an array, the whole array is replaced.
JSON Patch on the other hand allows patching of arrays without a problem.
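For example (the enum contents are invented), to append "c" to a schema containing "enum": ["a", "b"], JSON Patch can address the new element directly (the "-" token appends, per RFC 6901/6902), while JSON Merge Patch must restate the entire array:

[ { "op": "add", "path": "/enum/-", "value": "c" } ]

versus

{ "enum": ["a", "b", "c"] }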
@epoberezkin that is not correct. Your schema would be:
{
  "id": "my-meta-schema.json",
  "$merge": {
    "source": { "$ref": "http://json-schema.org/draft-04/schema#" },
    "with": {
      "properties": {
        "myCustomKeyword": { "type": "string" }
      }
    }
  }
}
The argument to source is a JSON Schema, not a URI to a JSON Schema.
Got it. Yes, that makes sense. This way you can both patch some third-party schema and use the same patch to patch multiple schemas, without the need to put them in separate files.
FWIW, JSON Patch also has copy and move, which are especially useful when one wishes to keep in sync with the referenced document, however it changes... JSON Merge Patch not only can't do this, but it is also ambiguous as to whether a change was the result of a move or just a deletion+addition.
(And incidentally, if there is ever a JSON Merge Patch 2.0, I hope it can avoid the null problem by using something like "\u0000" in place of null (and then requiring strings that begin with an actual NUL to be preceded by one extra NUL), as well as allow modification of strings by allowing for inclusion of the previous string, e.g., "Prepending some text to the old value: ${current}", with \${ allowing the literal.)
@brettz9 (Or just introduce a "delete" operation.)
@Acubed: JSON Patch has the "remove" operation, but I'm speaking of JSON Merge Patch, which doesn't have, and wouldn't work with, operations per se; it just mimics the appearance (of the altered portions) of the document (as in @fge's example), with the exception of null, which it overrides. Overriding a specific string (and providing an escape option for it, as I mentioned) would allow one to avoid the ambiguity.
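Under that hypothetical convention (the property names are invented), a single merge patch could express all three cases:

{
  "a": null,
  "b": "\u0000",
  "c": "\u0000\u0000x"
}

Here "a" would be set to an actual null, "b" would be deleted, and "c" would be set to the literal string "\u0000x".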
The following is my ideal JSON patch/merge approach which I'll call "Power Patch". It has the full expressivity of JSON Patch, but in a more succinct form, and it can optionally include JSON Merge Patch patches and/or "Whole document" patches (explained below).
The paths mentioned below are, as in JSON Patch, JSON Pointers.
value can be any JSON value.
Order is significant, even for the keys of an object (which, incidentally, ECMAScript now assures).
mergePatchValue would behave as patch objects do in JSON Merge Patch, with the exception (discussed earlier) that null could be used as an actual value to add to a document, while a single "\u0000" would trigger deletions instead (with other strings starting with NUL requiring an additional NUL to be added at the beginning). The advantage of this approach is that it allows a clear indication of where modifications are being made, without needing to redefine the whole document and without needing to specify the operations (though it is limited to JSON Patch's add, replace, and remove).
docValue would be shaped similarly to mergePatchValue, but instead of using NUL for deletions, this object would cause any properties which had been present before and which were not specified now to be deleted. Although this might technically not be what we think of as a patch, its value lies in ensuring that each snapshot reflects the whole structure at a given moment, without explicitly needing to delete unused items. And with its integration into "Power Patch", users have the option of switching to or from this approach as desired.
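A hypothetical illustration of the "whole document" semantics:

// target document
{ "a": 1, "b": 2 }
// docValue
{ "a": 1, "c": 3 }
// result: "b" is gone simply because it was not restated
{ "a": 1, "c": 3 }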
The rest of the properties (which are inspired by, and expand on, JSON Patch) offer the advantage of allowing for moving, copying, and testing. Unlike the other approaches, by looking at any given patch, you are able to tell what parts were assumed to be already present (the paths) and what parts are new (the values). In order to allow the JSON Patch properties to work with less redundancy (as in JSON Merge Patch or the "Whole document" method), I've added basePaths, which is an object whose key is a base path and whose value is itself a Power Patch object (which can therefore have its own "basePaths" for nesting paths further).
The following is a single patch showing all properties, though each property is optional. These patches could also be put within an array (or an object whose keys are versions) to indicate a sequence of patches to be applied.
{
  merge: mergePatchValue,
  whole: docValue,
  basePaths: {
    path1: {
      // Operations here would use paths (e.g., path2) relative to the base path (i.e., path1)
      add: {
        path2: value
      },
      basePaths: {
        path2: {
          replace: {
            path3: value
          }
        }
      }
    }
  },
  test: {
    "/path1": value
  },
  add: {
    "/path1": value
  },
  replace: {
    "/path1": value
  },
  move: {
    "/fromPath1": "/destinationPath1"
  },
  copy: {
    "/fromPath1": "/destinationPath1"
  },
  remove: ["/path1"]
}
The only disadvantage I see to this approach (as compared to JSON Patch) is that if you want to allow some kind of custom options to a given operation, you'd have to add them as a property outside of the operation object (e.g., copyOptions). But I think the succinctness is worth it.
One other consideration: as pertains to JSON Schema, and if this format were to also be used for versioning (especially for offline storage like IndexedDB), move and copy might also be used as a signal to the application that the content itself ought to be moved or copied when there is a schema change (since the other operations would not provide enough information, e.g., whether a remove+add was deleting the schema property and its associated content or just moving it).
@brettz9 there is a reason why JSON Patch uses arrays, and your proposal demonstrates it: you cannot rely on key order.
In your example it is requested that basePaths be evaluated before anything else... and that won't happen!
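For instance (the keys are invented), even engines that follow the ES2015 own-property-key order put integer-like keys first, regardless of insertion order:

const patch = { replace: { "/a": 1 }, "0": "surprise", add: { "/b": 2 } };
// Integer-like keys are enumerated first, ahead of string keys in insertion order:
console.log(Object.keys(patch)); // ["0", "replace", "add"]

And JSON itself gives no ordering guarantee at all.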
Actually, ECMAScript is being redefined to allow for predictable iteration order: http://tc39.github.io/ecma262/#sec-ordinary-object-internal-methods-and-internal-slots-ownpropertykeys . (And FWIW, in my example it is not basePaths alone that needs to work first; that is just an optional item. But yes, iteration order does need to be reliable, and thankfully, it appears that with IE no longer a sticking point, iteration order is becoming reliable again...)
@brettz9 ECMAScript maybe. JSON? Won't happen, for backwards compatibility reasons.
Yeah, thanks, @fge, I do see the JSON spec mentions it being "unordered".
(And FWIW, I may have mistakenly repeated this comment, since the for-in order described at https://tc39.github.io/ecma262/#sec-enumerate-object-properties (which is indicated by https://tc39.github.io/ecma262/#sec-runtime-semantics-forin-div-ofheadevaluation-tdznames-expr-iterationkind ) states that "The mechanics and order of enumerating the properties is not specified but must conform to the rules specified below", and the only rule relevant to OwnPropertyKeys is "EnumerateObjectProperties must obtain the own property keys of the target object by calling its [[OwnPropertyKeys]] internal method", which doesn't suggest that for-in must iterate in the order of those keys. However, one could apparently use Object.getOwnPropertyNames safely, as that refers to https://tc39.github.io/ecma262/#sec-getownpropertykeys , which relies on OwnPropertyKeys, which does abide by first-in order, more or less.)
I hope there will be a JSON 2.0 which fixes this and the separator problem.
But regardless, the format proposed here can still be readily modified to be JSON-friendly (and even made preferable to my earlier version, since it allows multiple operations of the same type at the same level, e.g., there could be one add at the beginning and one at the end; it also allows easier access to the operation type, and if custom options were needed, they could be added as a third argument):
[
  ["basePaths", [
    ["path1", [
      // Operations here would use paths (e.g., path2) relative to the base path (i.e., path1)
      ["add", [
        ["path2", value]
      ]],
      ["basePaths", [
        ["path2", [
          ["replace", [
            ["path3", value]
          ]]
        ]]
      ]]
    ]]
  ]],
  ["test", [
    ["/path1", value]
  ]],
  ["add", [
    ["/path1", value]
  ]],
  ["replace", [
    ["/path1", value]
  ]],
  ["move", [
    ["/fromPath1", "/destinationPath1"]
  ]],
  ["copy", [
    ["/fromPath1", "/destinationPath1"]
  ]],
  ["remove", ["/path1"]]
]
@brettz9 but why?
JSON Patch is an RFC, and implementations exist. Why go to the trouble of defining another alternative?
Because it is much less ugly and verbose, especially without basePaths... I also like the hybrid option, since one can use the pseudo-JSON Patch for move and copy, and the improved JSON Merge Patch for everything else.
@brettz9 beauty is in the eye of the beholder... I myself find JSON Patch to be terse, short and to the point, and your proposal to be needlessly complicated :p
Beauty is in the eye of the beholder (and the above could use a sequence of objects within the array instead, if preferred), but terseness is not... Ignoring the merge/whole options, here's what the equivalent in JSON Patch would require:
[
  {"op": "add", "path": "/path1/path2", "value": value},
  {"op": "replace", "path": "/path1/path2/path3", "value": value},
  {"op": "test", "path": "/path1", "value": value},
  {"op": "add", "path": "/path1", "value": value},
  {"op": "replace", "path": "/path1", "value": value},
  {"op": "move", "from": "/fromPath1", "path": "/destinationPath1"},
  {"op": "copy", "from": "/fromPath1", "path": "/destinationPath1"},
  {"op": "remove", "path": "/path1"}
]
Using value as "value", the stringified length is 405, whereas with my version it is only 323. And that is using basePaths in an example where the paths aren't long, and without taking advantage of the fact that each operation can have child operations (which my example only hinted at), which would avoid further duplication.
@brettz9 maybe, but JSON Patch is imnsho more readable.
Also, your paths are wrong. JSON Pointers are always absolute.
Ah yes, thank you, fixed, including the counts (I didn't change basePaths, since it is designed to avoid the need for absolute paths).
FWIW, it could instead be expressed as the following (also length 323):
[
  {basePaths: [
    {path1: [
      // Operations here would use paths (e.g., path2) relative to the base path (i.e., path1)
      {add: [
        {path2: value}
      ]},
      {basePaths: [
        {path2: [
          {replace: [
            {path3: value}
          ]}
        ]}
      ]}
    ]}
  ]},
  {test: [
    {"/path1": value}
  ]},
  {add: [
    {"/path1": value}
  ]},
  {replace: [
    {"/path1": value}
  ]},
  {move: [
    {"/fromPath1": "/destinationPath1"}
  ]},
  {copy: [
    {"/fromPath1": "/destinationPath1"}
  ]},
  {remove: ["/path1"]}
]
or if one did not need child arrays (length 303):
[
  {basePaths: [
    {path1: [
      // Operations here would use paths (e.g., path2) relative to the base path (i.e., path1)
      {add: {path2: value}},
      {basePaths: {
        path2: {
          replace: {path3: value}
        }
      }}
    ]}
  ]},
  {test: {"/path1": value}},
  {add: {"/path1": value}},
  {replace: {"/path1": value}},
  {move: {"/fromPath1": "/destinationPath1"}},
  {copy: {"/fromPath1": "/destinationPath1"}},
  {remove: "/path1"}
]
Note that basePaths is not required--I'm just trying to show the full expressivity here. I think the latter part of the latter example ought to be clear in its purpose--and I think superiority--without all the extra cruft.
Inventing new syntax is a bad idea. We should stick to relying on existing RFCs.
I'd like to derail this conversation a bit and point out that just because we can do something doesn't mean we should. I've seen a couple of examples to support patch and merge; all are based on the fact that a schema author closed their schema to additions. Shouldn't we respect that? If we want to extend a closed schema, why don't we copy the schema, make our edits, then rename it and publish it under a new URI? That way we don't add a crapload of complexity to our validators, keeping them fast and free from bloat. Yes, I understand that programmers are inherently lazy SOBs; I'll put my hand up for that category. That's why I learned to program in the first place: I wanted to automate tasks that I didn't want to do. So instead of modifying the spec to add/merge/delete/modify, why not create a toolset?
And as an aside, it's interesting that people are referencing RFCs that build upon JSON Schema, but the schema itself is not an RFC. It's an Internet-Draft, and one that has expired at that. Unless I am completely mistaken; if not, the spec needs to be updated to reflect this.
@robbie-mac: the problem as I see it is that sometimes we want to stay in sync with an existing schema. It's not just about initial effort, but ongoing maintenance...
@brettz9 A good point. However, I agree with @JanTvrdik; I don't think this has a place in this spec, specifically because this spec defines a data format for describing other data. It does not, nor should it, include methodology for manipulating that data. Besides, if either the merge or patch specs get approved, then you can use them on the data; you would just have to validate your source JSON objects against their respective schemas. As it sits now, I think a simple change to the spec, albeit with huge implications for validators, is to modify the definition of $schema: instead of a single string that is a URI, make it an array of URI strings. That way you can contain a chain of schemas, with the last (or first) entry being the terminus, a schema that is self-describing. Also, change the application of references to be able to reference ANY of the schemas in that chain.
Now you can have a listing of the progression of changes, one that can be $ref'd for different values.
@fge I implemented $merge and $patch in the Ajv validator. Thanks.
These keywords are a bit more complex to use and to implement than they seem on the surface, because of the $ref resolution in the source and in the patch (in the $merge keyword the patch can be a schema).
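For anyone wanting to try it, registration looks roughly like this (a sketch based on the ajv-merge-patch package; check its README for the exact, current API):

var Ajv = require('ajv');
var ajv = new Ajv();
require('ajv-merge-patch')(ajv); // registers the $merge and $patch keywords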
I'd like to derail this conversation a bit and point out that just because we can do something doesn't mean we should. I've seen a couple of examples to support patch and merge; all are based on the fact that a schema author closed their schema to additions. Shouldn't we respect that? If we want to extend a closed schema, why don't we copy the schema, make our edits, then rename it and publish it under a new URI?
My initial feelings lean entirely this way as well.
the problem as I see it is that sometimes we want to stay in sync with an existing schema. It's not just about initial effort, but ongoing maintenance...
Interesting. I think if that can be addressed outside of JSON Schema, that would be best. Perhaps one could set up a server that watches a particular schema at one URI, patches it, and then serves it at another URI? Would that cover your use case, @brettz9?
If it doesn't make it into JSON Schema, @seagreen, I would probably be more interested in a client-side library that does the resolving/patching first, based on my own custom merge/patch meta-language (like JSON References), with the remote schema made available to the client via a git submodule or npm bringing it to my server. I prefer to avoid server scripts when possible.
I'd like to see some use cases where (1) copying an existing schema and modifying it is preferable, and (2) this proposal is preferable to copy-and-modify.
The first set you presented, @awwright , is fine for:
The second set is far preferable for:
I think using $merge/$patch not only helps to maintain the schema - it allows one to easily see the difference between the source and the resulting schema. If, for example, I need to extend a meta-schema to include schemas for some custom properties, they will be clearly visible with $merge/$patch. Same if I add/remove requirements that I want to impose on the schemas.
Programs must be written for people to read, and only incidentally for machines to execute.
[I'm coming back to JSON Schema after some time away from working with it, so I apologize if this comment misses something from the last year and a half or so or just exposes how rusty I am with this stuff]
A good use of $merge is with a field like readOnly, where combining multiple values using allOf doesn't make sense. What would it mean to have an allOf where readOnly is false on one side and true on the other? Using $merge (or $patch) you can easily express "just like this other thing, except you can't modify it."
Another useful situation is with a reusable type that has a serviceable default value for "description", but where in many cases that description should be overridden with one specific to the use of the type. Using allOf for this results in multiple descriptions; there are ways around the problem, but none of them are as clear and readable as $merge.
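For instance (the schema URI and description text are invented), overriding just the description with the proposed $merge reads:

{
  "$merge": {
    "source": { "$ref": "address.json" },
    "with": { "description": "The customer's billing address" }
  }
}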
@epoberezkin:
I think using $merge/$patch not only helps to maintain the schema - it allows one to easily see the difference between the source and the resulting schema.
Great point.
"Programs must be written for people to read, and only incidentally for machines to execute."
Ehrrrm. I think correctness of a specification is more important than the readability of programs using the specification. We're still trying to figure out how $ref works in JSON Schema draft 4 (take a look through the issues and PRs here), so adding more features that will interact with $ref scares me. This doesn't mean that we shouldn't (the advantages could outweigh the disadvantages); I'm just trying to explain my feeling here.
@handrews:
Another useful situation is with a reusable type that has a serviceable default value for "description", but where in many cases that description should be overridden with one specific to the use of the type. Using allOf for this results in multiple descriptions; there are ways around the problem, but none of them are as clear and readable as $merge.
Nobody's saying you shouldn't have $merge, just that it shouldn't be part of the JSON Schema spec. I obviously agree that $merge is really useful. At the same time, it adds no actual new validation capabilities. (This is different from $ref, which lets us express infinitely recursive schemas, which couldn't be done without $ref.)
You guys are making great arguments for the expressive power of $merge and $patch, though. I feel like the desire to express "this URL is just like this other one, except for $patch such-and-such" will be a commonly occurring situation even outside of JSON Schema. Would it be possible to address that with a separate protocol? Then it wouldn't have to be custom-baked into many different specifications. (Also, I think in general we're bad at expressing "this information can be purely derived from that information" and would love to hear more thoughts on this. It's an important problem.)
My only real concern about how the specifications are arranged or combined is that if the json-schema spec reads in a way that causes implementors to generally not support $merge and $patch, that's a big usability problem.
For interoperability, it is good for json-schema to indicate preferred preprocessing mechanisms. Otherwise we'll have a proliferation of options and tools.
What we need is a guarantee that, given a set of tools that follows the json-schema specification, we can be sure that any site that also follows the json-schema specification will work with those tools. If essential pre-processing is left out, then there is an extra API-specific step of figuring out which preprocessing you need to do. That will be a barrier to broad, consistent adoption.
It won't matter how well json-schema validates anything if it's too awkward to use as specified, and too non-standard to use effectively with additional specifications that make it usable.
Ideally, $merge and $patch could (like $ref) be their own RFC (or RFCs), with json-schema specifying that a conforming implementation should support them, and implement pre-processing and lazy evaluation in such-and-such a manner.
My feelings on $merge and usability come from my experience as one of the architects of Riverbed Technology's APIs (although I left Riverbed about a year and a half ago and am speaking only for myself here). You can see $merge used extensively in the service definitions for the SteelHead and SteelCentral Controller products.
And again, those are the only two reliable mechanisms allowing for truly extending schemas in an unambiguous fashion (yes, I love unambiguous definitions). Recall:
- $merge relies on JSON Merge Patch, aka RFC 7396;
- $patch relies on JSON Patch, aka RFC 6902.
Why, then, define $patch? Simply because it allows for schema alterations which $merge cannot do. However, $merge answers the vast majority of cases.
Recall the rules: $ref still takes precedence.