Open darrelmiller opened 6 years ago
Following our discussions during past TSC meetings, here's a draft proposal for an "OAS Overlay" mechanism. The current proposal is equivalent to the "RAML Overlay" mechanism.
Summary of proposed changes:
Example:
openapi: 3.1.0
extends: https://raw.githubusercontent.com/OAI/OpenAPI-Specification/master/examples/v3.0/petstore.yaml
servers:
- url: http://localhost/v1
components:
schemas:
Pet:
required:
- id
- name
- size
properties:
id:
type: string
size:
type: string
In the example above,
servers
single url
defined in the source document -- i.e. http://petstore.swagger.io/v1
-- is replaced by the single url
http://localhost/v1
Pet
schema is altered as follows:
id
is changed to a string
type
, and as a result the target format
conflicting property is removedsize
of string
type
is added, and also added to the list of required
fieldsCould the URL for extends
be relative? That way I could work on a local copy.
Can multiple extends
be chained together so several documents that extend each other?
I can see a couple places where overlays would be useful to me:
Those are both good use cases, 2. particularly for the localization of API docs .
Could the URL for extends be relative?
Yes
Can multiple extends be chained together so several documents that extend each other?
Yes and that's an important point. I updated my comment accordingly.
I would have to reproduce the entire array, no? Is there a way to more precisely target?
That's right. Arrays are hard to work with when it comes to merging. Instead of replacing arrays, we could decide to concat or union the overlaying array with the target array but this would introduce another (worse) problem by making it impossible to rewrite arrays from scratch. We could also decide to have a more imperative syntax which would include some sort of merge rule to apply to the target document, like "add" or "replace", but this would be to the detriment of readability/usability (IMHO).
But since parameters are an array, in order to edit a single description, I would have to reproduce the entire array, no? Is there a way to more precisely target?
In the case of parameters
etc, the issue with targeting one element of the array can be worked around by using $ref
s to named parameters under the components
object in the base OAS document.
Thanks for clarifying, @jstoiko. What are some other use cases for merging that are more along the lines of signature changes? For example, a proposed change to an API design?
The crux of this for me comes down to identifying the key use cases. IMO, it is critical to make it easy to invite collaboration (e.g., the second case above, including your localization point), and it would help to keep the syntax simple.
As a thought experiment, say I wanted to update the limit
parameter's description in your sample, the easiest I could imagine is:
openapi: 3.1.0
extends: https://raw.githubusercontent.com/OAI/OpenAPI-Specification/master/examples/v3.0/petstore.yaml
operations.listPets.parameters.limit.description: How many items to return per request (max 100)
This is the simplest approach I can imagine, and it uses the otherwise optional operationId
as a shortcut to target the specific operation and treat the name
property of parameter elements as a pseudo-map. To support reuse, I could use YAML's indenting, so if I also wanted to add a default
I could use:
openapi: 3.1.0
extends: https://raw.githubusercontent.com/OAI/OpenAPI-Specification/master/examples/v3.0/petstore.yaml
operations.listPets.parameters.limit:
description: How many items to return per request (max 100)
default: 10
Note that we discussed merging of various sorts a lot in the JSON Schema project, and ultimately decided that merging of schemas violates a number of important design principles for schemas and should be avoided. We addressed the use cases with several different proposals for draft-08. This was a discussion across roughly five+ years (from original proposal around the time draft-04 was published in January 2013), two sets of spec editors (the ones that did draft-04, but split on the questions around merging and ultimately abandoned the project, and the set of editors including myself who revived the project about two years later), and 500+ github comments across 10 or so issue exploring different variations, or consolidating and re-starting discussions.
While you may find good merging options for other aspects of OAS, schema merging is a very, very hard problem for a variety of reasons. Which I should probably write up some time but to be honest after the past year's worth of work on it, the topic just exhausts me. It mostly has to do with JSON Schema validation's nature as a constraint system, in which constraints can be added but not removed. Merging violates the "cannot be removed" principle, which changes the processing model in very difficult ways.
Thinking about that thought experiment more, the dot as a delimiter could complicate things since it could be used in parameter names, meaning we'd have to escape it or pick another delimiter. Might also support removing a field by passing in a null value, but again I wonder how often that happens in practice?
Also, a parameter name
is not a unique identifier, the in
value is also required.
Good point. I'd be ok with being lazy and picking first match unless you specify, but I get that that might impart a queasy feeling.
Might also support removing a field by passing in a null value, but again I wonder how often that happens in practice?
Based on my experience and other people's experience witnessed with RAML, the lack of ability to remove things using overlays has not been an issue. We've also been promoting "adding" over "removing" as a general best practice -- to keep things "safe" -- but I am not sure how this fits in the OAS way of doing.
Two other questions:
extends
? That way you can re-use an overlay on another spec.Two other use cases brought up by @darrelmiller on the 09/14 TSC call:
@whitlockjc (and others) talked about moving the reference into the core document, an extendedBy
approach that might even be typed.
@darrelmiller and @webron observed differences of overlays as mutate vs separating out information like codegen-required annotations.
I personally see this features as $ref
with mutation (add/change) capability. If I am right, I wonder if we're implementing this properly. My concerns are as follows:
I personally would like to see a $merge
or something similar to $ref
.
I think a valid point was raised in the call about if there is value in this being supported by the spec. Such value comes when cross-tooling support is beneficial.
To compare, here's two different flows:
I'm not seeing a meaningful feature gap in one flow or the other (someone please point out if I've missed some tooling type which is more relevant). Overlays sound like they are just for the benefit of the spec writer, not the spec readers (tooling or human), so I don't know that they make sense to be used beyond the spec writing stage. If this is true, I don't think they make sense in the specification.
Related, someone called out the parallels between overlays and $ref. I'd say there are two parts to this: (a) When $ref refer within a file, I'd say they're an optimization and organization of that spec, allowing reduction of size and to relate similar structures. (b) When $ref refer outside the current file, I'd say they're more like overlays and I'd personally argue that is less valuable for the same reasons I gave about overlays. (Some of the tooling I work on can't even support 'b' because of loss of context; the only flow that makes sense in some cases is the "no external file" approach after the editing phase.)
@whitlockjc in the context of JSON Schema, $merge
was highly problematic. After five years, 10+ GitHub issues across two repositories, and well over 500 comments, we decisively rejected the proposal.
Instead, we looked at the use cases that people claimed $merge
solved and came up with several alternative approaches that produce a more consistent and manageable system.
Given a single schema object in JSON Schema, you can, in most cases, easily reason about the behavior based on just that object. With some of the features introduced to solve some of the $merge
use cases, there is a well-defined way to reason about behavior based on that object and its subschemas (including subschemas referenced with $ref
). The overall behavior is a predictable combination of the behaviors of the individual schema objects involved.
The problem with $merge
is that you cannot express the result of evaluating the merged schema in terms of the results of evaluating the two schemas being merged. This makes writing tools much more difficult, particularly when you get into complicated lazy evaluation situations with $ref
.
I can go into this more, although honestly I'd rather not. We've spent tremendous amounts of time and energy on this. If you want to allow merging OpenAPI documents in some way, sure, go ahead. There are places where that would probably make a lot of sense. But JSON Schema will never implement $merge
. That topic has been more heavily debated by more people across more years than any other concept in JSON Schema, and it is not going to happen.
Instead, we reworked how we think about $ref
(in the context of JSON Schema- outside of JSON Schema feel free to do something else), so that it is defined in terms of the results of the referenced schema, rather than some sort of inclusion / replacement. This makes it behave like all other JSON Schema keywords- it produces results, which are combined in the normal way with the other keywords in the same object. And yes, in this approach, other keywords can be adjacent to $ref
.
Let me start by saying that I really like the idea of overlays. There are a number of use cases I can see for overlays in our current tooling and work flows.
However, I don’t currently see any need to formalize a particular style of overlay in the OAS. The current standards for JSON merge (7396) and JSON patch (6902) provide two well-defined and understood means of applying an "overlay" to a base OAS document. This also allows flexibility for applying multiple levels of overlays, optional overlays, generic overlays, etc.
Since JSON merge and JSON patch are both "standards", I don’t see any reason why tooling would have any reservations in using either one or even both of these approaches to construct or consume overlays.
I think we should explore using existing standards-based methods for implementing overlays and identify shortcomings in them, if any, before we push forward to define overlay support in the OAS.
@mkistler simply using JSON Merge and/or JSON Patch as media types, where you apply all patches and then treat it as a regular OAS file, is certainly a plausible approach.
It is different from the JSON Schema $merge
/$patch
proposals in that those were lazily evaluated in a complex way during schema processing, which is what caused a lot of problems. If you restrict it to a pre-processing step at a file level, you avoid those problems as far as JSON Schema implementations go, because by the time a JSON Schema tool is processing the file, it can no longer tell whether it was patched or not. It's just a file.
In reviewing this thread, the main issue that stands out to me is we have two distinct intents for this feature we are calling "overlays":
1) It allows us to annotate an existing OpenAPI API description with information that would unnecessarily pollute the original spec. That information may be only useful for certain tooling. It may be information that is sensitive in nature so may have different permissions. The end result is that each overlay creates a variant of the OpenAPI description with some scenario specific metadata that is not in the core description. e.g. Code generation, SLAs, design documentation, mocking info, backend implementation details.
2) Divide the API description into layers, to enable different users to author different layers of the spec at different points in the API design lifecycle.
3) Overlays can be used to change a core description to satisfy a distinct API use case. e.g Localization, target deployment environments, ...
We need to decide if we are trying to satisfy both classes of scenarios and if so, should we use the same mechanism?
@darrelmiller, I agree that there seem to be very different use cases, or categories of use cases, envisioned here.
I'm not sure whether your breakdown into these two categories is the way to separate them, but maybe I'm not understanding completely.
Sometimes the need to separate concerns is because we don't want to pollute the original spec. Other times, it's because there are different people, processes and/or timelines that address those concerns.
The use case I have in mind is where the original spec is extracted from a code-first process, while human-readable content is contributed separately, as an overlay. The developer may (or may not) annotate source code, and an automated process extracts the spec at compile-time or at runtime. The human-readable description
properties, along with some other nuances like format
, are written by someone else, using a different tool, possibly starting at a different stage in the development lifecycle.
So I'm talking about a category of use cases where:
description
property that is being overridden by a description property in the overlay. (The understanding is that the base document won't have descriptions, only the overlay will.)type
or in
that should be specified only by the base document. I can't tell if my use case falls in your category 1 or 2.
Or maybe it's a category 3.
Or maybe it suggests an entirely different way of categorizing use cases.
An extends object proposed by @MikeRalphson
openapi: 3.1.0
extends:
target: https://raw.githubusercontent.com/OAI/OpenAPI-Specification/master/examples/v3.0/petstore.yaml
overlayMethod: annotation
servers:
- url: http://localhost/v1
I believe the YAML-like but path-supporting object representation I was thinking of was HOCON, as mentioned in OAI/OpenAPI-Specification#1314 - as was mentioned there, there is no reason tools cannot support this format, and that would presumably apply to overlay documents too.
Following up on the TSC discussion from last week: in that meeting it was stated (paraphrasing here):
Overlays should have no impact to tooling outside of editors and parsers.
At the time this was stated, it seemed "clearly true" to me and an important consideration in how we should view this proposal.
But today I am working on my own tooling and struggling with the fact that the spec allows "$ref" just about everywhere, and I need to keep checking for this. But why is that? Should it not be the case that "$ref", just like overlays, should only affect editors and parsers?
What we found for "$ref" is that there is actually important meaning in the fact that a schema (and maybe other things) are "refs" rather than defined inline. In particular, it means that the schema is shared and has a name -- a name that can be used as a classname in generated code. So because of this, our tooling (which is based on swagger-codegen ... which works the same way) does not resolve all refs in the parsing phase.
In light of this, I think we should consider more carefully if overlays may also carry some inherent meaning that tooling outside of editors and parsers may want to access.
I just heard about overlays at APIdays/Paris today.
I agree with the need for solution to ease authoring of specs, especially huge ones, or editing forests of specs that must share partial content or take a full spec and override just some parts. In fact I had that need almost 3 years ago when editing two huge specs with 60% shared content (one of them is still available after the death of the company), but also to easily generate Swagger 2.0 specs variations for different environments (localhost/staging/preprod/prod).
But I disagree with the current approach that consists in adding those authoring features to OpenAPI itself: they will only benefit to the authoring side of the ecosystem, but make the life harder for tools authors of the whole ecosystem (including the OpenAPI consumer side that in my opinion has the most potential for expansion).
I have instead designed my own openapi-preprocessor tool that authors/editors of specs can use to output current OpenAPI specs easier thanks to enhanced tooling with the aim of producing a single spec file (without external references) easy to consume.
$ref
pointing to external files are inlined (in order to provide a single file on output)$inline
allows to inject a copy of another part of a document in place$merge
allows to copy a node, overriding some keysBoth $inline
and $merge
have support for local overriding, with different semantics for override keys ($inline
allows deep overrides targeted with JSON pointer, while $merge
allows override only at the base level).
My choice of the $
prefix is to stay in line with $ref
: those keywords are processing features, but not core of the OpenAPI data model. The $
allows to clearly signal that something special will happen in this area of the file.
I have rewritten the implementation in Go last summer (after months of polishing my implementation of JSON Pointer in Go) and the testsuite already has a good coverage of the implementation. https://github.com/dolmen-go/openapi-preprocessor/
I agree with @darrelmiller 's comment about having multiple categories of use cases. A question I have is actually whether the Overlay specification generates a new API altogether or is it the original API?
If the Overlay method is used for internationalization purposes, then which Overlay document (or the original non-Overlay document) should another API spec reference when they want to extend the API?
I stumbled upon an interesting approach for referencing inner array elements using JSON pointers in the Range Patch draft.
Specifically:
JSON pointer provides capabilities to identify a single element of a data structure. Here we extend it to allow a range of elements for arrays and strings, by extending the scheme how reference token modifies which value is referenced from Section 4 of [RFC6901] (...)
more here.
@darrelmiller @lornajane should this move to the Overlays repo?
In a separate document.
Has a separate lifecycle.
How does the overlay reference the source document? - By document structure Is OperationId a valid way of referencing core documents?