Closed handrews closed 6 years ago
I miss the point completely - why do you need a special construct to add meta keywords?
Why not just:
{
"title": "foo",
"$ref": "bar"
}
as title etc. are ignored anyway?
Or if you want you can:
{
"title": "foo",
"allOf": [ { "$ref": "bar" } ]
}
$merge/$patch and banUnknownProperties idea solve additionalProperties issue. $merge/$patch also solve schema extension issue for recursive schemas (meta-schema is just an example, recursive schemas are very common). I agree with the objections against banUnknownProperties. The argument that $merge/$patch is powerful and allows to change the schema is an argument to use it, rather than to avoid.
Leaving aside comparisons to other ideas, what problem that can't be solved already does this proposal solve?
@epoberezkin did you read the linked email thread? It explains exactly how this proposal came about and how it relates to the other issues mentioned, and why it is preferable to $merge/$patch. I'm not going to rewrite it all here, although I can copy in key points if people think it will help.
Issue #15 is the one for $merge/$patch, and it has numerous points related to this.
@tadhgpearson agreed with the need for this use case in this comment.
@seagreen and @awwright cover a lot of the concerns in comments after that.
$merge/$patch also solve schema extension issue for recursive schemas
This issue is not about recursive schemas, please keep the use cases separate. If we happen to adopt $merge/$patch for other reasons, this issue can be closed as moot, but unless that happens (and I've seen no indication that it will), this issue is only about the use case I identified here.
In another issue you suggest that I try out $merge because you think it will grow on me. That's hilarious- as you can see in #15, my previous project made use of $merge well over 200 times in the published schemas alone (and more in internal schemas). This proposal is based on the experience that despite a great deal of "additionalProperties": false
use, we never needed $merge to get around that.
@handrews I've read the thread.
It does not answer the question why annotations need re-writing and why simply adding them to $ref as in my examples is not sufficient.
I think the idea of re-writing anything follows from treating $ref as equivalent to inclusion. I think it is a big mistake to treat it in this way, because, as I wrote before:
Consider this simplified example:
{
"id": "schema1",
"allOf": [ { "$ref": "schema2#/definitions/foo" } ]
}
{
"id": "schema2",
"definitions": {
"foo": { "$ref": "#/definitions/bar" },
"bar": { "type": "string" }
}
}
That is a very common use case. If you treat "$ref" as inclusion then "#/definitions/bar"
would mean definitions in schema1
and it won't resolve.
This example is equivalent to the test case in JSON-Schema-Test-Suite - if $ref were equivalent to inclusion this test would have fail.
There are use cases when allowing inclusion would be beneficial - it would support polymorphism (same subschemas doing different validation depending on where they are included). If we think it can be beneficial then we may consider an additional keyword that does inclusion (or maybe "$merge" without "with" would do just nicely).
why simply adding them to $ref as in my examples is not sufficient.
Because the only field allowed in a $ref object is $ref. It references, it doesn't merge.
As I wrote above, you can:
{
"title": "foo",
"allOf": [ { "$ref": "bar" } ]
}
Why is it not sufficient?
There are recursive cases when inclusion is impossible (and some are mutually recursive, so it's not visible from the URI that they are recursive)
If you want to dig into the behavior of $ref, please do so in issue #66 rather than clogging up this issue with topics that are at best tangential to the proposal at hand.
If there is a specific issue where you see this as a problem with $use, please explain it in terms of $use rather than complaining about general issues with $ref. Otherwise, let's sort out $ref in #66 and then come back to this.
@epoberezkin : What part of "$ref cannot have other keys present because it doesn't merge, it just references" is unclear? I'm not trying to be snarky here, I really can't figure out how else to say it. If $ref behaved as you seem to want, it would more or less be $merge and we wouldn't be arguing over that proposal elsewhere (and pleas leave the $merge arguments in issue #15 ). And in any event, that's not how it behaves, and if you'd like to change its behavior please do that in #66 .
@handrews I was just trying to understand the need for $use as it is defined here and the only explanation I can imagine is that you equate $ref to inclusion. Sure, let's continue in #66. Do you think I need to copy this comment there?
On the substance of the proposal, it simply seems unnecessary. You already can add annotations to $refs by wrapping them into allOf, which is more concise than what is proposed.
Can you explain why the existing solution is not adequate and why an additional vocabulary is needed to do the same you can do without it?
I have a whole section about what the existing workarounds are, with "allOf". It discusses how "allOf"'s behavior with annotations is insufficient for documentation generation, and how "default" simply doesn't work in a meaningful way.
"$ref" is irrelevant- you can't add other keys to it, and using $ref inside of "allOf" vs using a literal schema inside of "allOf" are equivalent for these purposes.
What part of my discussion of the "allOf" alternative is unclear or unconvincing?
I will review it again.
Documentation generation is outside of spec. Whether the doc generator can understand $ref wrapped in allOf or not is the problem of the generator - extending the standard created for validation hardly seems an adequate solution to help doc generators.
"$ref" is irrelevant- you can't add other keys to it, and using $ref inside of "allOf" vs using a literal schema inside of "allOf" are equivalent for these purposes.
Again, please review the cases in the comment above (https://github.com/json-schema-org/json-schema-spec/issues/98#issuecomment-254064538) - $ref and using literal schema are not equivalent.
@epoberezkin
Documentation generation is outside of spec.
Um.. no. From the first paragraph of the JSON Schema Core specification:
JSON Schema is intended to define validation, documentation, hyperlink navigation, and interaction control of JSON data.
While we confirmed in #89 that documenting how multiple schemas interact in an API is out-of-scope, documenting individual schemas is explicitly in-scope. Note that I don't mean fancy full I18N/L10N end-user documentation, I just mean reading the schema and presenting it at a developer level.
I'm still not going to discuss your concerns about $ref and inclusion here as it distracts from the main proposal. If you comment on #66, I will address it there.
The way it is written, JSON schema itself is intended as a documentation for JSON data. Deriving some other documentation format from the schema (I assume that's what you mean) is something else.
JSON schema itself is intended as a documentation for JSON data
Right. This is the declared purpose of title and description:
Both of these keywords can be used to decorate a user interface with information about the data produced by this user interface.
An automated system cannot disambiguate conflicting values for these keywords when trying to figure out what to present in such a user interface. That makes schemas using these keywords difficult to re-use, which is the problem being addressed here.
Additionally, completely aside from documentation, the other metadata keyword, "default", is unusable in "allOf" without disambiguation.
Please try to address these problems
Ok, I will spend more time to understand this issue, I can be missing something here :). Will come back with the questions. In the meanwhile, I am writing the comment re $refs in #66.
@epoberezkin - while the allOf + $ref implementation makes theoretical sense to me, it may be difficult to implement simply because the behavior is not explicity defined in current versions of JSON schema, and causes today's tools to break.
I think it would be useful to survey the ease of use of both syntaxes with other developers. I find the $use ( source + with ) syntax to be much clearer - but obviously, that's just my little opinion, and I think it would be good to see what some other users think ;)
@epoberezkin now that #66 is resolved, any further thoughts? This is a high priority for me to resolve one way or the other for the next draft. The problem it addresses is a major pain point in a current project.
@robbie-mac this addresses your concern about $merge/$patch violating the intent of closing a schema to extension that you raised in #15 .
@seagreen I believe you were also concerned in #15 about $merge/$patch being overly powerful- this proposal essentially restricts $merge down to a form that cannot affect validation.
@awwright any thoughts here? This was part of the email thread about different re-use use cases that you seemed to like when I first posted it. Although I'm not sure if that was in response to this particular proposal or one of the other use cases in the thread :-)
And I might as well tag @Relequestual and @jdesrosiers while I'm at it to see if this resonates with anyone else who has been active here recently.
As noted, I'd really like to get this either into the next draft or definitively rejected.
I've re-read the thread.
I understand the issue that some processing tools may face. I think that the problem of tooling of any kind should NOT be addressed in the standard that has validation as its primary purpose, as otherwise we will be facing feature creep from users of different tools (e.g. UI generation - why not add something to support them here too? How UI generation is worse than documentation generation? I am not suggesting it, I am using it as an illustration that documentation generation is equally narrow and out-of-scope concern).
I think either of the following is a better option:
Also bear in mind that $use is more complex to define than $merge - it requires maintaining the list of keywords that are taken into account (maintaining = writing, extending, discussing, etc.), while $merge is quite simple - it can be defined once and forever.
@handrews: Thanks for mentioning me here. This would definitely be an improvement over $merge/$patch from my perspective.
That said, I'm actually still against it, but I'm not sure you should take my thoughts into account.
To explain...
My use case is super weird in that I need a way to write JSON schemas that do structural-only validation and absolutely nothing else. I won't even be using "title"
or "description"
-- from the perspective of my application that's metadata about the schema, and I already have a standard, separate way to store things like that.
So as you can I'd prefer that the core, structural part of JSON Schema stayed extremely focused on just validation. I think there are reasons this is a good idea -- consider in how many situations things like "title"
are tracked somewhere other than in the data itself, for instance in filesystems having a "title"
in the file's data and also having a file name is redundant.
That said, I appreciate that you have pragmatic use cases for both "title"
and a way to modify it, and that that will probably (and perhaps should) override theoretical/niche considerations about keeping the focus on structural validation.
This is a high priority for me to resolve one way or the other for the next draft.
I'd missed this line. Obviously you need to be taken care of one way or another. Would a "JSON Document Schema" similar to the JSON UI Schema idea be too crazy of a thought? It could include things like "title" and "description", and also a powerful way to override them.
That said, I appreciate that you have pragmatic use cases for both "title" and a way to modify it, and that that will probably (and perhaps should) override theoretical/niche considerations about keeping the focus on structural validation.
default
is actually the most important, although title
and description
are important here as well. But default
is actually something I use in APIs at runtime even when there's no use for title
or description
.
Would a "JSON Document Schema" similar to the JSON UI Schema idea be too crazy of a thought?
I would not object to such a thing. Basically splitting the "meta-data" keywords out of validation (where they don't really belong anyway) and expanding that. I need to be able to easily access the default
value while working with instances at runtime. That's the only potentially tricky point I can think of if this is split out. Do I have to bring in the entirety of JSON Document Schema just to use defaults? And if so, how easy/hard is it?
I would not object to such a thing. Basically splitting the "meta-data" keywords out of validation (where they don't really belong anyway) and expanding that.
Oh man, I didn't actually expect anyone to agree with this. This would be so cool!
I need to be able to easily access the default value while working with instances at runtime. That's the only potentially tricky point I can think of if this is split out. Do I have to bring in the entirety of JSON Document Schema just to use defaults? And if so, how easy/hard is it?
I can't imagine this would be too difficult (obviously handling things like "title" is no trouble at all). I haven't been following all the open issues though, so maybe more complicated documentation-related keywords are being added. If that's the case though it would make having a JSON Schema Document Extension even more important.
Another appealing thing about making the documentation stuff an extension is that we could push all of the I18N/L10N stuff there. I think. I kind of like the idea of:
JSON Schema core (bare minimum necessary to use any vocabulary) plus vocabularies for: validation, documentation, hypermedia, ui generation
Right now validation has just a bit of documentation stuff but there are arguments over what, exactly, it's there for. This probably calls for an issue of its own- @seagreen do you want to file one for this extension? I am supportive but have too many other things going on to champion another big proposal.
As of https://github.com/json-schema-org/json-schema-spec/issues/160#issuecomment-262642122 and https://github.com/json-schema-org/json-schema-spec/issues/158#issuecomment-262351269 I think @epoberezkin I are coming to an agreement that both $merge
/$patch
and $use
belong outside of JSON Schema.
I will be happy to close this as out of scope once #15 ($merge
/$patch
) is also closed, since this issue can easily be implemented as an independent preprocessor.
I'm usually using JSON Schema as part of an Open API definition - so I'm fine with this, provided it this extension would be included in the Open API specification. Do you know at what level that is defined?
I don't know much about Swagger / Open API's processes, but you could certainly lobby to include it there if it doesn't go into JSON Schema or some extension vocabulary. I'd use a different prefix than $
but that would be up to whichever group wants to implement something like this.
It's not entirely trivial to do strictly as a preprocessor step, as you would need to copy and modify things that are otherwise "$ref"'d, and you can get into trouble with circular references that way, but it's plausible. But if Open API is integrating JSON Schema processing into a larger system, then they could resolve the "use" statements lazily which would make it quite easy.
I don't think any of the team is in contact with Open API. Would be good to know a point of contact who is willing to discuss things as needed.
@webron or @fehguy are usually a good place to start for Open APIs
I'm all 👂!
I'm here too. Will have to spend some time reading the ticket. I was also going to reach out regarding some other issue we have with the JSON Schema website (so, less technical).
@fehguy @webron this is a documentation feature that might be better implemented outside of JSON Schema proper, and could be implemented as a just-in-time preprocessing step quite easily. Read the first comment but you probably don't need to go through all of the subsequent discussion as it's more about various objections related to putting it in JSON Schema.
@webron JSON Schema website bugs should be filed at json-schema-org/json-schema-org.github.io
@fehguy @webron although note also https://github.com/json-schema-org/json-schema-spec/issues/136 which might add a JSON Schema vocabulary related to documentation so this idea could find a home there as well.
It will probably take more than just reading the first comment, as I haven't followed the changes since draft 4 closely, and I don't know how $merge
behaves.
$merge
just applies an arbitrary application/merge-patch+json document to a schema. The problem with that is that it can change validation results in arbitrary ways. It's like monkeypatching code- occasionally useful or necessary but more for weird test cases than normal usage.
$use
is an attempt to restrict that idea to annotation keywords without affecting validation.
Note that we're pretty close to consensus that $merge
/$patch
will not go into JSON Schema.
@handrews I thought you reached the conclusion that this should not be included as well.
@epoberezkin I thought that was already apparent from this issue.
If you think there's a better solution you might consider closing this out.
Does #151 seem to describe the problem being targeted here?
A related proposal has been to allow schema templates, or some way to generate a schema based on an argument (I can't think of a compelling one, but maybe allow only a multiple of a number). But usually this can just be done with an "allOf" intersection.
@awwright no #151 and #119 are addressing the same problem. This is strictly a documentation/annotation problem with a much simpler solution.
I will only close this issue out if #15 is closed. If there is any possibility of #15 being accepted, this is a much more focused solution and I will argue for this solution instead.
@awwright I'd like to hear your thoughts on whether this issue ($use
) is in-scope for JSON Schema or not. That's really the decision to be made (and if it's not in-scope, then I can't see how #15 would be in-scope either).
@epoberezkin I thought that was already apparent from this issue.
just wanted to make sure, probably wasn't reading it attentively enough ;)
I'd be very happy to have all these ideas (including #15, #151, #119) out of scope. They all can be separate I-D. This way we'll keep it simple here.
Once draft 6 is out of the way, we can brainstorm some better ideas (I have none at the moment), but I'm happy to kill them all for now.
JSON Schema forms a major part of the Open API Spec, which seems to be gathering steam as the documentation language for Restful APIs, so I'm very glad to see you all discussing together :grin:
Specifically, in this example, @handrews proposed $use extension that would be very helpful for implementing Open API specifications. For example, today our underlying system returns a generic price object - with amount, currency, etc. that is shared across a wide variety of APIs (Flight Search, Hotel Reservations...). However, the amounts often have a different functional meaning in each API, which needs to be documented. In Open APIs 2.0, we can't reuse the same response object for this, because JSON Schema draft 4 requires them all to share the same description. Thus, we end up with a lot of copy-paste implementations of the same underlying object across our specification.
$use addresses this issue by allowing the reuse of all fields of a referenced object, with a few overwrites. If this were supported in the upcoming Open APIs 3.0 spec, it would help us to avoid repeating the same response object over and over in our specs just in order to have different descriptions for just 1 or 2 fields each time.
However, it sounds like $use will be an extension to JSON Schema, rather than part of the specification itself. If this is the case (@webron @fehguy ) do I need to open a new issue in the Open APIs issue tracker to request it be considered for the Open API 3.0 draft?
@tadhgpearson yes that'd be the right next step. Thank you
@tadhgpearson @fehguy It sounds like @awwright might be open to this as a part of JSON Schema, although I can't really tell from his comment so let's see if we can get him to elaborate :-) I know @epoberezkin is against, but the scenario that @tadhgpearson describes is exactly the problem that I encountered over and over at Riverbed (and why we implemented the too-powerful $merge
but only ever used it the way I describe $use
). I see the same problem in the project that I recently joined.
So this is a widespread problem, and I would argue a much more practical and easily solved problem than the mess over people using "additionalProperties": false
for error-checking while wanting to also expand the property set during re-use (which is what #151 and #119 are trying to solve).
If there is a lot of evidence that this is needed, I might want to push for it in JSON Schema a bit more.
@tadhgpearson I personally see Swagger as a branch off from JSON-Schema. I find naming it the Open API Specification somewhat presumptuous. Winning hearts and minds and all that. I'll not say any more or I'll go into full rant mode. Sorry I haven't actually reviewed the issue.
@Relequestual lol. It is indeed. It supports some subset of JSON schema spec.
@handrews we are using $merge for actual validation extension here. Having way to extend meta and not actual validation would be very disappointing... And I think both are hacks to be honest. We need to come up with some alternative extension mechanism...
Aha - so @fehguy - I think this issue is already open for consideration in the next iteration of the specs through https://github.com/OAI/OpenAPI-Specification/issues/556
Having way to extend meta and not actual validation would be very disappointing...
meta isn't "extended", which is why it is addressed differently from validation. This issue is not about refusing to extend validation, it's about recognizing that the use cases are different. Meta data needs to be overwritten depending on context- the original values for description or whatever just disappear in the new usage.
Validation rules need to be combined in some way that respects all sources for the combination. $use
is not about preventing us from solving that problem it's about splitting off the very different problems related to annotation and solving them in a way that makes sense for them.
This looks like a reasonable proposal to me. I'm all for this.
I'm concerned that this kind of keyword -- anything that says "like X, but with Y" -- is too opaque. This makes certain kinds of use cases difficult to reason over.
If I've automatically generated a SQL structure from a JSON Schema, and one of my fields "Y" is a boolean... what happens when I say "Like field Y, but type:"integer"
instead"? However, this isn't any worse than just making a copy of the schema in question.
Adding constraints can already done with "oneOf". So this would only be needed for removing constraints from an existing schema.
What are some use cases for needing to remove a constraint from a schema?
@awwright the whole point of $use
is to make it impossible to do what you are worried about:
what happens when I say "Like field Y, but type:"integer" instead"? However, this isn't any worse than just making a copy of the schema in question.
$use
MUST NOT be used with validation keywords. It is only used with annotation keywords (which have no affect on validation).
This issue does not remove (or add) constraints, so use cases for such are not relevant. That's not what this does.
Use Case and Motivation
This is one of several proposals to come out of the analysis of the "ban additional properties mode" vs
$merge
/$patch
debate and the various use cases that fed into that.One common use case identified was specifying information about a type's usage at the point of use rather than as part of the initial type definition. In some cases, it is reasonable to provide some generic usage information in the type definition which serves as default usage documentation, but needs to be overridden in some uses with more specific information.
Additionally,
default
cannot meaningfully appear in multiple branches of logical keywords such asallOf
, as there is no sensible way to choose which value is correct for the specific usage. Conflicting validations in anallOf
simply fail theallOf
, but conflictingdefault
values are just unusable, independent of validation.Prior experience
This particular use case has been a major pain point on both large-scale JSON Schema projects with which I have been involved.
One implemented
$merge
, which was very effective but (as several people have noted, and as I now agree) it is too powerful as it (and$patch
) can literally convert any schema into an arbitrarily different schema, or even into a non-schema JSON value. In this project,$merge
was used to override default values fordefault
,title
,description
, andreadOnly
, or disambiguate among several possible values for those keywords.The other project ended up redefining many types that should be sharing a definition. In some cases, nearly every schema has had to redefine a particular sub-schema just to change the title and description.
Preserving independent validation
In order to preserve the self-contained nature of validation, usage overriding only applies to the keywords in the "Metadata keywords" section of the validation specification (
title
,description
, anddefault
) plus usage-oriented hyper-schema keywords (readOnly
andpathStart
).Breaking self-contained validation has been the main objection to other proposals in this area, either through overly powerful keywords or changing validation modes in ways which are not reflected in the schema itself. This is the only proposal to date that solves this problem with no impact on validation whatsoever.
Existing workarounds
It is possible to use an
allOf
with a reference to the type schema and a schema that includes only the point-of-use annotations. This is tolerable when reading the schema, but is not good for documentation generation. Each schema within anallOf
should be self-contained, which separates the documentation from the thing it's documenting. Additionally, if default annotation fields have been filled out, a documentation tool that attempts to extract and combine annotation data from allallOf
branches will just produce conflicting documentation.Finally,
allOf
is nonsensical when combining multiple schemas usingdefault
,readOnly
, orpathStart
, as there is no reasonable way to choose which value to use.The proposal:
$use
The keyword
$use
would be an annotation/hyperschema keyword indicating that a type (via a$ref
) is being used with additional annotation and/or hypermedia for this specific use.$use
is an object with two required properties and one optional property; the format for the required properties is borrowed from the$merge
and$patch
proposals:source
MUST be a schema, and will nearly always be a$ref
as there is no need for$use
if you have the literal schema defined in line. Required.with
should be applied to the resolved source as anapplication/merge-patch+json
instance, which MUST NOT affect any JSON Schema-defined keyword that is not explicitly allowed by the proposal. Required.$use
to their extensions or not as they choose. Required.JSON Merge Patch considerations
Declaring the
with
object to follow JSON Merge Patch semantics allows implementations to use existing libraries, and frees us up from needing to debate yet another schema combination format.However, a notable limitation of JSON Merge Patch is that you cannot overwrite something to
null
, as specifying anull
value instead deletes the key entirely. A workaround for this that is consistent with existing JSON Schema behavior is to definenull
somewhere (perhaps right in the$use
object since it does not forbid additional properties) and then$ref
it. We would need to declare that$use
is processed before$ref
. Which is fine because$ref
often needs lazy evaluation due to circular references anyway.Elsewhere, there's been an assertion that
$ref
must always resolve to a schema. This would require breaking that assumption. Since the assumption was not present in Draft 04 (in which JSON Reference was an entirely separate specification), I think this isn't too unreasonable. A compromise could be that it can reference either a schema or null.If $ref-ing null doesn't gain acceptance, another option would be an additional keyword in the
$use
object listing JSON pointers to properties that should be set to null."$" in name
I chose
$use
rather thanuse
on the grounds that keywords that manipulate the schema itself should stand out (I will be filing a more comprehensive issue on the use of$
later as it comes up in yet another not-yet-filed proposal, plus there is an open issue to consider$id
in place ofid
).Example
This shows that the type can be concisely defined in a shared location, with a clear usage object applied to it.
Meta-schema considerations
This could be added to the meta-schema by moving most of the keyword definitions into the meta-schema's
definitions
section grouped by type of keyword, and definingwith
as a schema minus the forbidden keywords.