Closed awwright closed 5 years ago
+1 to this. The point here, I assume, is for documentation and perhaps to allow implementations to issue warnings, rather than to affect validation.
Interesting proposal.
I know that JSON Schema isn't specifically for APIs, but we should consider how APIs might version their endpoints. Could this proposal actually be part of a separate doc about API-driven information? For example, alongside deprecated, include how it's deprecated and whether it's replaced, and possibly other upgrade-path data. Very much a straw-man comment, not well thought through.
@Relequestual I'm not sure how far we want to chase the versioning idea here, but my approach to API evolution (rather than versioning) is to only change the version of the schema, and retain compatibility by (as much as possible) avoiding both `required` and `"additionalProperties": false`. I use the following rules:
Interesting. Most of the time our API (and others I've worked on) have needed "new versions" because of new fields or new allowed values. Sometimes that's backwards compatible, but often it isn't for me, as much as I'd like it to be.
As long as you avoid setting additionalProperties to false, then new fields should be fine (unless you're generating strictly typed code off of the schemas without any provision for unrecognized fields, but I avoid that in favor of more generic client libraries).
Enumerations get tricky, but that's where the "if it still validates you can still use it" rule comes into play: technically the new version is not fully compatible, but most instances will validate against either version.
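That "still validates" rule for enums boils down to a superset check. A tiny stdlib sketch (the status values are invented for illustration):

```python
# Backward compatibility for an enum-valued field: an instance valid under the
# old enum stays valid under the new one exactly when the new enum is a
# superset of the old. The status values below are invented for illustration.

def enum_backward_compatible(old_enum, new_enum):
    """True if every value allowed by old_enum is still allowed by new_enum."""
    return set(old_enum) <= set(new_enum)

old = ["active", "disabled"]

print(enum_backward_compatible(old, ["active", "disabled", "pending"]))  # True
print(enum_backward_compatible(old, ["active", "pending"]))              # False
```

Adding values is safe for old instances; removing any is the breaking case.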
My approach to API evolution is to do as much as possible to allow users to keep using the API as they have been unless their specific usage is broken by the update. It requires some design discipline, and the willingness to add a new resource as a replacement and deprecate the old instead of doing a major version change on the entire API and breaking a bunch of clients that don't actually care one way or the other.
Admittedly, I haven't actually had a chance to prove that this can work long-term, but it's something I'm trying with the current project I'm working on.
I think Protocol Buffers allows for flagging deprecation. If someone has time, may be worth seeing what that system allows for.
@awwright please don't close issues while the corresponding PR is still open. PRs are not always accepted.
@handrews A PR is just a type of issue on GitHub, and I can indefinitely re-edit it if necessary, or re-open this issue of mine, or someone can file a new issue for me if it really comes down to that
@awwright This is yet another instance of you randomly choosing how to use GitHub in whatever ways that suit you, no matter whether anyone else finds it confusing. Some issues you want to keep open, some you don't, we just get to guess at what you'll complain about, and if we want to keep track of things in some consistent way, we're out of luck if you decide not to.
All of the discussion about wanting a clear and defined process that the community has had clearly means nothing to you. Your response to any complaint or request for clarity is always just "well I see it this way so that's what I'm doing".
@handrews It's my issue! Anyone is allowed to close their own issue for any reason. There's nothing special about it.
I'll re-open this issue as I agree issues that do not have accepted pull requests should not be closed.
It seems like there's quite a lot of discussion and decisions to be made on this issue. Specifically the comments in the PR review: https://github.com/json-schema-org/json-schema-spec/pull/173#discussion_r91867794
It's clear there are a number of unresolved ambiguities.
In a root schema with a "self" link, it means that the resource itself is deprecated? But it's still there currently?
In a root schema without a "self" link it means....?
In a property subschema (in any of the property keywords) it means that all matching properties are deprecated but will still be present?
In a schema applying to a single position in an array/tuple ("items" is an array) it means...?
In a schema applying to multiple array positions ("items" as one schema, or "additionalItems") it means...?
In the schema for "schema" or "targetSchema" it means...?
This concept is not ready for a PR. There are far too many cases that need discussing and a coherent view of hypermedia resources that need to be developed before we can just toss this in.
I've set this as priority: low. People aren't crying out for it, and it doesn't fix an existing bug. It would be nice to go in the next release if the above issues can be resolved, but otherwise it shouldn't become a blocker.
I just ran into the same wall. I want to denote in a schema that a property will get removed in a later version.
So currently I have a schema with id `https://example.org/schemas/1.0/payload-schema.json`. Now we are releasing the next version, adding new properties and types, named `https://example.org/schemas/1.1/payload-schema.json`, but we also want to deprecate old stuff which got superseded by the new types, to ultimately be removed in `https://example.org/schemas/2.0/payload-schema.json` (long-term future).
Simply put: as all document types evolve over time, you need a migration phase (the 1.x schema series in the above example).
So we want all 1.x documents to validate against all 1.x schemas. This is possible, as pointed out by @handrews above: when avoiding both `required` and `"additionalProperties": false`, all types are preserved and only non-mandatory new stuff gets added (see also SemVer).
Now we also want validation to break when the deprecated stuff is validated against the 2.x schema.
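One way a 2.x schema can make that break explicit is to forbid the removed property outright, e.g. with `{"not": {"required": ["legacyId"]}}`. A stdlib-only sketch of just that one pattern (the property name `legacyId` and the helper are my own illustrations, not from the schemas discussed here):

```python
# Sketch of the 2.x "hard removal" step: a 2.0 schema can reject instances
# that still carry a deprecated property by wrapping it in "not"/"required":
#   {"not": {"required": ["legacyId"]}}
# This helper checks only that single pattern; a real validator would of
# course evaluate the full schema.

def violates_removal(schema, instance):
    """True if the instance still uses a property the schema now forbids."""
    forbidden = schema.get("not", {}).get("required", [])
    return any(name in instance for name in forbidden)

schema_v2 = {"not": {"required": ["legacyId"]}}

print(violates_removal(schema_v2, {"legacyId": 7, "id": 7}))  # True: must migrate
print(violates_removal(schema_v2, {"id": 7}))                 # False: migrated
```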
All this is currently possible from a validation POV, but we lack a way to tell the user of the schema (the developer) about the migration phase. Of course this can be handled by defining proprietary keys, used by whatever validator one is using, or by attaching a second validator to validate twice, but this will cause fragmentation of standards. Therefore defining this right in the spec would be better - or, as @Relequestual stated, "nice to have".
Yet, I wonder how big the need is, as next to no one seems to feel it (with me apparently being only the second person in this thread). The fact that XML XSDs also lack a standardized way to define something like this makes me wonder if the schema really is the right place to add this, but I fail to see a better alternative (one that can be utilized by CI and yet not break production).
@discordier we've generally had trouble finding anyone who has even used hyper-schema enough to participate in discussions, so don't take the low count as necessarily indicating lack of support. I encourage you to comment on any hyper-schema issues and pull requests you see; it is a challenge for us to decide on directions when only two or three people have opinions they are comfortable expressing. Here are the issues and PRs (including but not limited to hyper-schema) that we're targeting for the next draft: https://github.com/json-schema-org/json-schema-spec/milestone/2
@Relequestual marked this low priority because we're trying to quickly resolve enough things to publish Draft 6 and this is both not essential for that immediate effort and because there are several questions without obvious answers. I'd personally be in favor of getting this into Draft 7, though, if we can't sort it out in time for Draft 6.
I'm only familiar with XSD in the most basic sense, but I gather it's more analogous to JSON Schema Validation rather than JSON Hyper-Schema. I definitely think this feature belongs in hyper-schema assuming we can sort out the details.
+1
@lidio601 Please avoid +1 comments. You can thumbs up any comment you agree with. Thanks.
I would also like a deprecation system, or some type of meta-tagging for fields that are being added, replaced, and deprecated. As others mentioned before, it should have a date to accompany it.
Howdy! Where did this conversation end up? I know deprecations are something that Open API v3.0 has a little of, and they're considering improvements for v3.1.
It would be amazing to see some collaboration on this issue, so we can help reduce the now very narrow gap between JSON Schema and Open API v3.x.
@philsturgeon everyone here has been kind of caught up in other things, but there are signs of life and we do hope to have another draft out before the current ones expire in October.
The deprecation idea is popular so I hope it makes it into the next draft.
The last PR stalled because there were many ambiguities and we did not have the time and focus to resolve them. The use cases I see (which may not be what anyone else agrees with) are:
"deprecated" in a Link Description Object (LDO) deprecates the link (this could mean that the target resource is deprecated, or just that the connection between the resources will be removed).
"deprecated" in an LDO with "rel": "self" indicates that the resource is deprecated
"deprecated" inside of an object field schema (values within "properties" or "patternProperties", or the value of "additionalProperties") means that the relevant properties are deprecated, but the containing object is not.
The semantics of deprecation and array elements are unclear, so I'd avoid that on the first pass. If use cases emerge we can work through them.
Deprecating a schema that is not an object field schema has no clear semantics to me. In my opinion, only LDOs and field schemas are reasonable use cases for deprecation (although I may be missing some obvious other ones).
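As an illustration of the object-field-schema case above, here's a stdlib-only sketch of how an implementation might surface uses of deprecated properties. The boolean `deprecated` keyword placement follows the proposal in this thread and is not part of any published draft:

```python
# Sketch of the field-level reading: a hypothetical boolean "deprecated"
# inside a property subschema flags uses of that property without failing
# validation. Keyword placement and warning behavior are proposal-stage.

def deprecated_properties_used(schema, instance):
    """Return instance property names whose subschemas say deprecated: true."""
    props = schema.get("properties", {})
    return [name for name in instance
            if isinstance(props.get(name), dict) and props[name].get("deprecated")]

schema = {
    "properties": {
        "name": {"type": "string"},
        "fullname": {"type": "string", "deprecated": True},  # superseded by "name"
    }
}

print(deprecated_properties_used(schema, {"fullname": "Ada Lovelace"}))
# ['fullname'] -- the instance is valid, but an implementation could warn here
```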
We may be running into an issue similar to "required" where it was decided that it doesn't make sense as a boolean, but only as a list of properties on an object, so it was turned into an array listing key names that are required.
Maybe the same thing is true of "deprecated", but I'd also be interested in providing a rationale as to why the property is deprecated, maybe at least a link to a document.
I vaguely remember a proposal to add a link relation doing exactly this, pointing to a document that explains how the origin resource is deprecated, but I can't find any evidence of such a thing, maybe I invented that.
Absolutely. So this is exactly the same conversation we're having over on @dret's Sunset header draft.
Outstanding questions of where to send humans and computers are being discussed here.
Sunset will take a date, and optionally a URL to some metadata, and that's about it.
Some people like versions, some people like dates, but for Sunset it's going date only at this point. Versions are too app-specific for a header, but JSON Schema could probably have some opinions on version perhaps. Not sure.
Either way, there's some crossover here.
just to clarify: `Sunset` as currently defined (feedback very welcome) only takes a date and nothing else:
Sunset = HTTP-date
https://github.com/dret/I-D/issues/60#issuecomment-319739804 mentions the possibility of linking to additional information using a link relation, but that's not part of the current draft.
@dret @philsturgeon I'd prefer a date over a version because it's more flexible. Some APIs use evolution techniques rather than versions, and that's my preferred approach. I'll have to catch up on the `Sunset` discussion. I like having link relations for things but I haven't read the rationales for or against yet.
@awwright Agreed on this probably making more sense as a list parallel to `required`. "What about this object is deprecated?" would be a lot easier to reason about than "is this schema part of a thing that can be deprecated and if so is it?"
It looks like the `Sunset` header and an LDO-level `deprecated` would work together, while a field-level `deprecated` would be a different thing.
On 2017-08-03 09:25, Henry Andrews wrote:
@dret https://github.com/dret @philsturgeon @awwright https://github.com/awwright Agreed on this probably making more sense as a list parallel to |required|. "What about this object is deprecated?" would be a lot easier to reason about than "is this schema part of a thing that can be deprecated and if so is it?"
i am confused by this whole discussion. to me it looks like this:
if i make things mandatory in my structure, that's a promise for these structures to be there all the time. i have to keep the promise.
if i don't want to make this promise, then i make things optional and this makes it clear that things may not always be there.
the proposed mechanism seems to aim at adding a facility that allows me to say when i am planning to break a promise that i previously made? what's the exact benefit of this, and how is it any different from making things optional from the start and therefore requiring clients to be able to handle the presence or absence of a structure?
@dret It sounds like we may have to differentiate how this is used for (1) defining expected user input, and (2) annotating the output of a JSON API. What "deprecated" means in these scenarios may be slightly different.
I've been understanding it as an API value that may not be provided in the future, and that you shouldn't use it, or should adjust your application to rely on some other value.
On 2017-08-03 13:39, Austin Wright wrote:
@dret https://github.com/dret It sounds like we may have to differentiate how this is used for (1) defining expected user input, and (2) annotating the output of a JSON API. What "deprecated" means in these scenarios may be slightly different.
and there are many different ways in which these differences can manifest, which makes it hard to capture them in a predefined model.
I've been understanding it as an API value that may not be provided in the future, and that you shouldn't use it, or should adjust your application to rely on some other value.
so what makes it different from something that's optional? other than you might start labeling it "optional" not from the very start, but at some point in time during the lifetime of your API?
@dret I prefer to avoid declaring anything as required and build my tooling appropriately. However, declaring fields to be required and then removing them later is an extremely common real-world occurrence, no matter how I feel about it as a design choice.
On 2017-08-03 17:04, Henry Andrews wrote:
@dret https://github.com/dret I prefer to avoid declaring anything as required and build my tooling appropriately. However, declaring fields to be required and then removing them later is an extremely common real-world occurrence, no matter how I feel about it as a design choice.
interesting. if you build things your way, there's no need to communicate deprecation: everything can go away anytime anyway, and clients have to be prepared for that. afaict, that makes it hard to build applications that are based on some minimal core contract, but if that's something that works for you, it seems that you are all set!
if that's something that works for you, it seems that you are all set!
In practice, I doubt I can manage it all the time, but it's something I am going to explore and see how much I can get away with :-)
Ideally, the only reason to drop a field is that it is legitimately no longer relevant. My preference is that optional fields are always absent when not relevant, never present with a `null` value. If clients are built to facilitate working with fields that may be absent, then a certain category of "removed" fields are no longer a problem.
Obviously some fields can't work that way. If you're configuring something related to DNS, you need the domain name. Otherwise nothing makes sense at all. But there are other things that are a lot less critical. Hopefully over the next six months to a year I'll be able to get some real-world validation (or invalidation) of various ideas in this area.
Is this hyper-schema specific? Or rather, is it out of scope for JSON Schema?
I know I have encountered at least one case where configuration settings have changed with Visual Studio Code (which provides a schema for its configuration file) where they potentially could have captured the change in a deprecation message rather than leaving users wondering why the file no longer validated.
This, `readOnly`, and the `media` object all at least theoretically are usable outside of Hyper-Schema. `media` could arguably be viewed as extended validation (in the same way as `format`), in that it conveys information that could be, but need not necessarily be, validated in full.
Alternatively, Hyper-Schema is not limited to APIs. This is one of the key points that confuse people about Hyper-Schema. It is a hypermedia format, not an API description format. So just like people often use hypermedia formats like HTML (or even more commonly, Markdown) to store local documents, there's no reason you couldn't use a hyper-schema with Visual Studio configuration settings.
there's no reason you couldn't use a hyper-schema with Visual Studio configuration settings.
I must be missing something. Where do the semantics come from?
For example, if I am validating a config (or other) file against a schema and the file uses a deprecated property I think the validator SHOULD emit a warning about the deprecated property. If "deprecated" is only part of hyper-schema, how would a JSON Schema validator know to issue a warning?
You would obviously need a hyper-schema-aware implementation, or at least one that makes it easy to register the keywords you care about as extension keywords.
Hopefully with draft-06 and beyond, hyper-schema implementations will become more common. My work in that area got kind of put on hold for a while but I am getting back to it. But the most popular validators tend to allow extension keywords.
We also haven't specified what an implementation SHOULD do, and would probably leave it fairly open. Some systems may want it more as a documentation thing, others may want runtime behavior. This is similar to `default`, which is why many implementations ignore it by, um.. default :-) but provide options to write missing defaults into the instance during validation. Implementations can always layer functionality on top of the spec.
I think that is a valid view. Though, I would throw in a word of caution about extensions and fragmentation: one of the advantages of JSON Schema, as I see it, is that (on the whole) it is well-defined and compatible. Something implemented as extensions (especially ones loosely defined with respect to behavior) makes me hesitant to use it unless I really need the functionality, because the number of implementations supporting my schema suddenly drops, and the number that support it but do not behave how I intend may be non-zero.
In response to my original question though, it sounds like it is considered out of scope for JSON Schema.
In practice, I doubt I can manage it all the time, but it's something I am going to explore and see how much I can get away with :-)
keep us posted.
Ideally, the only reason to drop a field is that it is legitimately no longer relevant. My preference is that optional fields are always absent when not relevant, never present with a |null| value. If clients are built to facilitate working with fields that may be absent, then a certain category of "removed" fields are no longer a problem.
agreed in principle. in practice, it is very hard to build anything meaningful when you can count on nothing. my guidance is almost opposite to yours: design and define a stable core and commit to it, allowing clients to rely on that core.
http://dret.typepad.com/dretblog/2016/04/robust-extensibility.html
Obviously some fields can't work that way. If you're configuring something related to DNS, you need the domain name. Otherwise nothing makes sense at all. But there are other things that are a lot less critical. Hopefully over the next six months to a year I'll be able to get some real-world validation (or invalidation) of various ideas in this area.
like i said, keep us posted. the art of defining and evolving APIs is definitely something that will only become more important as we see things becoming increasingly connected. how to do it well and robustly is important, and a space where we need more experience and guidance.
OK, trying to push this towards a new concrete proposal:
Going from @awwright's observations that this is more like `"required"` than anything else, I see two options for the object field case:
- `"deprecated": ["foo", "bar"]`, which is equivalent to booleans in individual schemas
- `"deprecated": {"foo": "2017-09-01", "bar": "2018-01-01"}`, which is equivalent to date-valued fields in individual schemas

As with #117 (`"patternRequired"`) we would likely need `"patternDeprecated"` (and `"additionalDeprecated"`?). See also #364 (applying this object-level approach to `"readOnly"`).
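For what it's worth, the two shapes carry the same information. A stdlib sketch converting the array form into per-property booleans (keyword semantics as proposed in this thread, nothing official):

```python
# Sketch: the array form and per-schema booleans are interchangeable. This
# pushes {"deprecated": ["foo", ...]} down into "deprecated": true markers
# inside the matching property subschemas. Keyword names are proposal-stage.

def inline_deprecated(schema):
    """Push an object-level "deprecated" array down into property subschemas."""
    result = {k: v for k, v in schema.items() if k != "deprecated"}
    props = dict(result.get("properties", {}))
    for name in schema.get("deprecated", []):
        sub = dict(props[name]) if isinstance(props.get(name), dict) else {}
        sub["deprecated"] = True
        props[name] = sub
    result["properties"] = props
    return result

schema = {"properties": {"foo": {"type": "string"}}, "deprecated": ["foo"]}
print(inline_deprecated(schema))
# {'properties': {'foo': {'type': 'string', 'deprecated': True}}}
```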
I slightly lean towards the object form despite probably needing more keywords, as it then parallels the `Sunset` header approach of using dates (assuming @dret is still going in that direction).
For the link/resource case, I'd like to try to handle that through #296 (link usage hints / target attributes), which could then piggy-back on the `Sunset` work (assuming @dret thinks that is valid). While there would not be a `Sunset` header specified for all protocols / URI schemes, I think there is a general question to be answered in #296 about re-using useful hints outside of their original specification.
I feel like we should be able to make a decision on the object field approach and get that to PR stage, and either handle the link/resource case in #296 or start a new issue to track it on its own.
As for exactly how this is specified for required vs. optional fields, I think that will be more easily discussed given a PR with actual wording. We don't entirely nail down what required and non-required fields actually mean (is `required` a permanent commitment? It depends on the implementation, and is outside of JSON Schema's scope).
Thoughts?
For such a simple concept, this turns out to be difficult to get right (two PRs by two different contributors have now been retracted).
@adamvoss brought up the excellent idea of a `deprecationMessage` functionality of some sort.
Looking at the OpenAPI proposals that @philsturgeon noted, while a lot of their operation-based approach doesn't translate well to hyper-schema, there are other useful concepts like indicating a replacement.
So I wanted to bring this back for a bit more open discussion.
I will comment again soon with specific ideas, although anyone should feel free to jump in on this of course.
When I was talking to @dret about the Sunset header (for entire-endpoint deprecations), we came to the agreement that probably a message was not entirely necessary.
Instead, a link is provided via a `Link` header with `rel=sunset`, and that is either a URL to the new endpoint to use, or a link to some documentation or a blog post. Whatever that link is, it will be a lot more useful than whatever was jammed into a reason field, because for entire endpoints... well, do you care?
Human-written messages like "We love our customers very much, and decided to listen to their feedback on some of the..." are no good, and the other end of the spectrum is just "Bugs and app updates"; neither is useful.
Instead the link approach allows clients to build a simple formatted message:
"Endpoint #{env.url} is deprecated for removal on #{datetime.iso8601}. Read more at #{sunset_link}"
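A stdlib Python sketch of assembling that client message (the URL, date, and link values are invented placeholders):

```python
# Sketch: a client builds a uniform deprecation notice from the Sunset date
# plus the rel=sunset link, instead of relying on a free-form reason string.
# All values below are invented placeholders.

def sunset_message(url, sunset_date, sunset_link):
    return (f"Endpoint {url} is deprecated for removal on {sunset_date}. "
            f"Read more at {sunset_link}")

print(sunset_message("https://api.example.com/v1/widgets",
                     "2018-12-31T23:59:59Z",
                     "https://example.com/blog/widgets-v2"))
```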
This may not be the way to go for JSON Schema. I see JSON Schema and GraphQL as having a lot in common, and they do offer a reason:
https://github.com/graphql/graphql-js/pull/384
This adding of a reason here could easily outline for humans that `foo` is deprecated and the reason is just: "Bar is better because X".
I feel like #391 is headed in the right direction, but would adding `deprecated` and `deprecatedMessage` get too noisy?
If not, can we go a little further and add a `deprecatedUntil` or something? The dates to me really are more important than the human words.
@philsturgeon I like the link relation for per-resource deprecation. Assuming PR #383 (`"targetHints"`) is accepted, that would then look something like:
{
"rel": "some-old-thing",
"href": "path/to/old/thing",
"targetHints": {
"sunset": [
"Sat, 31 Dec 2018 23:59:59 GMT"
],
"link": [
["/uri/path/to/sunset/resource", {"rel": "sunset"}]
]
}
}
Maybe I confused things by bringing up Sunset; I'm not sure how `targetHints` work and don't know if I'd want to overload the term Sunset with stuff.
Are `targetHints` designed to hint towards potential response headers, or are they request-header oriented? I see in the PR they talk about `Allow` (req), so these response headers seem out of place in this form.
@philsturgeon maybe read the targetHints PR? It includes other examples :-) [EDIT: or maybe I should wait to be snarky until after checking on confusing mistakes in my own PR that you already read... oops, my apologies, maybe I should actually read comments]
But briefly: `"targetHints"` are basically JSON-serialized response headers. Like `"targetSchema"` and `"mediaType"`, they are non-authoritative. Schema authors can provide them as an optimization (and in the case of HTTP, as a convenience to avoid having to figure out whether something is discoverable over HEAD vs OPTIONS).
So that example just means "if you did a HEAD on this, you would very likely get a `Sunset` header with this date/time, and a `Link` header with this URI and relation type."
Request headers are described with schemas instead of values, and are covered in the aptly-named `headerSchema` PR #390.
@philsturgeon
I see in the PR they talk about Allow (req) so these resp headers seem out of place in this form.
Unless I am really confused, `Allow` is a response header? Admittedly, I did confuse `Accept` and `Content-Type` the other day, so it's possible that I'm really confused...
Unless I am really confused Allow is a response header?
Gah sorry brain frazzled. I did the same thing and read Allow as Accept and got messed up. Ok, let me do my homework on targetHints and see if it lines up with the right sorta thing.
`targetHints` may well solve endpoint deprecations (letting you know it's likely before you hit it and find out), but regular old JSON Schema still needs to have deprecation, message, and date to let folks know a field is about to vanish too. Right?
Gah sorry brain frazzled.
well that clearly makes two of us today! :-D
regular old JSON Schema still needs to have deprecation, message and date to let folks know a field is about to vanish too. Right?
yes, and that's the main focus of this issue and both of the PRs that attempted to address it. We'd brought up resource-level deprecation as well, but certainly for draft-07, I'd like to give `"targetHints"` a chance to solve as many problems as it can with the generic mechanism, and only design specific approaches when that fails. There's a very large set of things that can be conveyed in response headers, and arguing over each individually hasn't really gotten us far.
Anyways, yes, field deprecation needs a totally separate mechanism.
As for ideas on per-field deprecation, I see a few options:
Option 1:
{
"properties": {
"foo": true,
"bar": true
},
"patternProperties": {
"^biz": true
},
"deprecated": {
"foo": {"date": "2017-12-01", "message": "...", "replacedBy": "??not sure what to put here"}
},
"patternDeprecated": {
"^biz.*stuff$": {"version": "2.0", "reason": "..."}
}
}
Option 2:
{
"properties": {
"foo": true,
"bar": true
},
"patternProperties": {
"^biz": true
},
"deprecated": {
"foo": ["2017-12-01", "message of some sort"],
"bar": ["2018-01-01"]
},
"patternDeprecated": {
"^biz.*stuff$": ["2.0", "message for this one"]
}
}
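A stdlib sketch of how an implementation might apply the second option to an instance, collecting one warning per present deprecated property (keyword names and tuple layout mirror the example above; all of it is proposal-stage):

```python
import re

# Sketch applying the object-form proposal: collect a warning for each
# instance property named in "deprecated" or matched by "patternDeprecated".
# Per JSON Schema convention, patterns are applied unanchored (re.search).

def deprecation_warnings(schema, instance):
    warnings = []
    for name, info in schema.get("deprecated", {}).items():
        if name in instance:
            warnings.append((name, info))
    for pattern, info in schema.get("patternDeprecated", {}).items():
        for name in instance:
            if re.search(pattern, name):
                warnings.append((name, info))
    return warnings

schema = {
    "deprecated": {"foo": ["2017-12-01", "message of some sort"]},
    "patternDeprecated": {"^biz.*stuff$": ["2.0", "message for this one"]},
}

print(deprecation_warnings(schema, {"foo": 1, "bizarre-stuff": 2, "bar": 3}))
# [('foo', ['2017-12-01', 'message of some sort']),
#  ('bizarre-stuff', ['2.0', 'message for this one'])]
```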
The #173 proposal, where `deprecated` applies to a sub-schema, made sense to me and I don't understand the objections. The obvious interpretation is that any match against a sub-schema with `"deprecated": true` should be interpreted as "this JSON file is valid according to the current schema, but will no longer be valid in a future revision of the schema". This allows deprecation due to changes to any restriction that is expressible in JSON Schema. In the most complex cases (e.g., changes to `dependencies`, changes to regexes in `patternProperties`) it might be necessary to do something like:
"oneOf": [ { <new_schema> }, {"deprecated": true, <old_schema>}]
or:
"oneOf": [ { <new_schema> }, { "deprecated": true, "not": { <new_schema> }, <old_schema> } ]
It would be easier for schema authors (and probably no extra burden on implementations) to allow use of `anyOf` and non-mutually-exclusive schemata here, with a rule to suppress deprecation warnings unless all matching branches are deprecated.
In either case, once the deprecation period is over and the old syntax is dropped, updating the schema is simple: delete the branches with `"deprecated": true` and remove any `anyOf` that may have been rendered superfluous.
Common changes, such as disallowing a previously-allowed field, can be done very simply by adding `"deprecated": true` in the relevant place, without any need for `oneOf`/`anyOf`.
Other deprecation-related fields (`deprecationDate`, `deprecationMessage`, `deprecationURL`, etc.), if deemed necessary, can live at the same level as `deprecated`, and the scope is obvious.
Unlike the #173 proposal, a #391-style approach seems limited and ill-defined. For example, what is the equivalent of:
{
"anyOf": [
{ "type": "integer", "minimum": 0},
{ "deprecated": true, "type": "number"}
]
}
(I.e., a previously unconstrained numeric field, where values other than non-negative integers are now deprecated.)
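The "suppress unless all matching branches are deprecated" rule suggested earlier could be sketched like this, with branches modeled as predicates instead of real subschema validation, purely for illustration:

```python
# Sketch of the suppression rule for anyOf: emit a deprecation warning only
# when every branch the instance matches is marked deprecated. Branches are
# (predicate, deprecated_flag) pairs standing in for full subschemas.

def is_deprecated_use(instance, branches):
    """True iff the instance matches at least one branch and all matches are deprecated."""
    matched = [dep for pred, dep in branches if pred(instance)]
    return bool(matched) and all(matched)

# anyOf: non-negative integers (current) vs. any number (deprecated)
branches = [
    (lambda x: isinstance(x, int) and x >= 0, False),
    (lambda x: isinstance(x, (int, float)), True),
]

print(is_deprecated_use(3, branches))     # False: also matches the current branch
print(is_deprecated_use(-2.5, branches))  # True: only the deprecated branch matches
```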
Create a schema keyword that specifies an instance (usually a property in an instance) as deprecated: it shouldn't be used, but exists only for internal use, or for backwards compatibility with obsolete or old products.