OAI / OpenAPI-Specification

The OpenAPI Specification Repository
https://openapis.org
Apache License 2.0
29.07k stars 9.07k forks source link

Extensible enumerations (growable lists) #1552

Open JonKohler opened 6 years ago

JonKohler commented 6 years ago

Hey all, I did some searching in the OAI repo, and it didn't jump out at me as an existing feature request.

The issues with enum being "non-growable" without making a major semver change for an API have been talked about in several places. Zalando came up with nice framework within their API guidelines to handle this, which would be incredibly useful to upstream into the specification itself

https://zalando.github.io/restful-api-guidelines/#112

Should: Used Open-Ended List of Values (x-extensible-enum) Instead of Enumerations [112] Enumerations are per definition closed sets of values, that are assumed to be complete and not intended for extension. This closed principle of enumerations imposes compatibility issues when an enumeration must be extended. To avoid these issues, we strongly recommend to use an open-ended list of values instead of an enumeration unless:

the API has full control of the enumeration values, i.e. the list of values does not depend on any external tool or interface, and

the list of value is complete with respect to any thinkable and unthinkable future feature.

To specify an open-ended list of values use the marker x-extensible-enum as follows:

deliver_methods:
  type: string
  x-extensible-enum:
    - parcel
    - letter
    - email
MikeRalphson commented 6 years ago

In OpenAPI 3.x does this have any advantages (apart from perhaps readability) over the more JSON-schema recipe:

deliver_methods:
  type: string
  anyOf:
  - enum:
    - parcel
    - letter
    - email
  - {}

? Thanks for citing the Zalando extension, BTW, that's useful and interesting.

cmheazel commented 6 years ago

I'll vote for extensible enumerations. However, it would be useful if the enumerated values were resolvable. This allows a client to "discover" the meaning of an unknown value. If the client is semantically aware, and the value resolves into an ontology, then the client may even be able to process the value based on its' advertised semantics.

JonKohler commented 6 years ago

bump!

JonKohler commented 6 years ago

Alternatively, the proposal set out my the Microsoft Azure team, here: https://github.com/Azure/autorest/blob/master/docs/extensions/readme.md#x-ms-enum

IMHO is pretty good, and gives even more flexibility:


accountType:
  type: string
  enum:
  - Standard_LRS
  - Standard_ZRS
  - Standard_GRS
  - Standard_RAGRS
  - Premium_LRS
  x-ms-enum:
    name: AccountType
    modelAsString: false
    values:
    - value: Standard_LRS
      description: Locally redundant storage.
      name: StandardLocalRedundancy
    - value: Standard_ZRS
      description: Zone-redundant storage.
    - value: Standard_GRS
      name: StandardGeoRedundancy
    - value: Standard_RAGRS
    - value: Premium_LRS
JonKohler commented 6 years ago

I've reached out to the Zalando team to see if we could parlay a single definition, and perhaps help the openapi community model out a proposal for upstream, here: https://github.com/zalando/restful-api-guidelines/issues/412

MikeRalphson commented 6 years ago

@JonKohler from the documentation you linked to (emphasis mine):

For this reason, enums are not automatically turned into strongly typed enum types - instead they are rendered in the documentation comments for the property or parameter to indicate allowed values. To indicate that an enum will rarely change and that C#/Java enum semantics are desired, use the x-ms-enum exension.

Which seems to be the opposite way around to the intent of this issue? It seems the problem here is lack of consistency and flexibility in two or more code-generation tools and (enum value descriptions notwithstanding) not an issue with the spec itself.

JonKohler commented 6 years ago

Thanks for the sanity check @MikeRalphson - Let me touch base with the Azure team to see if we can get some consensus on this.

In short, if we boil this back to the need to have both extensible enums as well as having some metadata about those enums (description, etc), then that is a good thing to focus on for some modification to the upstream spec. Is that fair to say? or is my understanding off course?

JonKohler commented 6 years ago

@MikeRalphson - I've posed this to the Azure team, here: https://github.com/Azure/autorest/issues/2950

MikeRalphson commented 6 years ago

In short, if we boil this back to the need to have both extensible enums as well as having some metadata about those enums (description, etc), then that is a good thing to focus on for some modification to the upstream spec. Is that fair to say? or is my understanding off course?

There is obviously some demand to enhance enums in the ways you mention. Making them extensible (without using a registry approach) can be done using the anyOf syntax above. Tooling support for it would be up to individual vendors. Everything else will be considered as with any proposal for the spec, with the usual 'no promises' caveat.

JonKohler commented 6 years ago

Thanks @MikeRalphson ... since this is my first interaction with doing an upstream proposal for the spec, im happy to spearhead this.

Is there any guidance on how to formally propose something to the spec? I checked out the readme on this repo, and it seems like making a thread is the way to go (here we are), but I wanted to see if perhaps there is a more structured way, to make your life and everyone elses easier? I'd hate for this good conversation to get locked up in a long GH thread

MikeRalphson commented 6 years ago

@JonKohler the short answer is that the spec. is just markdown and anyone can, and is very welcome to, PR changes against it. Though a PR could just be a 'straw-man', judging when an issue has enough consensus behind it to take it to a PR is definitely an art rather than a science.

A longer answer is that this is something I personally feel we struggle with today, partly because we don't want to waste anyone's time on a change which may have to undergo several revisions and still possibly not make it in, and also because if the change would need to target a minor or major semver version of the spec (as any change apart from wording/examples etc would) then we don't currently have a 3.1.0.md or 4.0.0.md branch/file to target the PR against. This is due to a number of factors including human bandwidth and not yet having tools to ensure patch changes always flow into the next minor version and minor changes flow into the next major version (effectively we would need to cherry-pick commits across branches but apply them to different files). Also creating a (say) 3.1.0 version of the spec would inevitably lead to questions about when it would be released, which are not always easy to answer.

That said, one really good way to contribute is to participate in the (usually) weekly TSC web calls. Agendas are created as GitHub issues and details for joining are in the README.

cmheazel commented 6 years ago

@JonKohler Take a look at the Draft Features section of Development.md. This is a new process for test-driving proposed features prior to making them a formal part of the spec.

handrews commented 6 years ago

FYI, for value / title or description pairs the official JSON Schema recommendation (using draft-06 syntax, replace const with single-element enum for draft-04) is:

oneOf:
  - const: foo
    title: The Foo Thing
  - const: bar
    title: The Bar Thing
    description: There's some explanation about the bar thing

An extensibility proposal based on this will fit well with newer drafts of JSON Schema. The oneOf + const + only title or similar annotations (there is a formal notion of "annotation" in newer drafts) is easily detectible and is expected to form a major basis for generative JSON Schema vocabularies (code generation, UI generation, etc.)

adamjones1 commented 4 years ago

@MikeRalphson @handrews If we wanted to combine these two features in the code samples you gave, and have an enum which is both open and has descriptions on its members, the naive way of doing it would be something like:

type: string
anyOf:
  - const: foo
    title: The Foo Thing
  - const: bar
    title: The Bar Thing
    description: There's some explanation about the bar thing
  - {}

right? But unless I'm missing something, wouldn't this not be sufficient because if the value "bar" showed up, we couldn't know whether it's a member of the "The Bar Thing" singleton type (and thus has the semantics of its description) or whether it's just any old string that corresponds to the third case of the anyOf? So the descriptions have no effect? In which case we would actually need to do something like:

type: string
oneOf:
  - const: foo
    title: The Foo Thing
  - const: bar
    title: The Bar Thing
    description: There's some explanation about the bar thing
  - not:
      anyOf: ["foo", "bar"]

If we scale this up with more cases or a more complex base type than just string it does feel like it would get quite unwieldy and harder for parsing tools to detect the 'enum pattern' to me, so if this is all the case then it would be quite nice to have native support for extensible enums in JSON Schema/OpenAPI short-handing it.

(You could remove duplication in OpenAPI at least by defining the closed version of the enum as a schema and then having the overall schema reference it in an "either this or not this" oneOf, but eh I dunno it feels like leakage to me having that closed version centrally defined?)

handrews commented 4 years ago

@adamjones1 I would use an anyOf since the const values ensure that it behaves mostly like a oneOf. At that point, you just need to figure out if the instance only matched the empty schema or if it matched one of the const schemas as well.

OAS 3.1 will update to the latest draft of JSON Schema which supports modular extension vocabularies (this is a totally new thing since my last comment from 2018). In the time since the draft of JSON Schema that was used for OAS 3.0, we have improved support for annotation keywords (e.g. title and description) by specifying an output format for collecting the values of such keywords that apply to each instance location, and where those values came from in the schema.

So you could rely on matching title, but an even better option would be a keyword such as extendedEnumPlaceholder or something like that (I just made up that name, don't take it too seriously). This would be used something like:

type: string
anyOf:
  - const: foo
    title: The Foo Thing
    extendedEnumPlaceholder: false
  - const: bar
    title: The Bar Thing
    description: There's some explanation about the bar thing
    extendedEnumPlaceholder: false
  - extendedEnumPlaceholder: true

If you see a false value in your annotation results, it matched a pre-defined option. If you only see a true value, it only matched the otherwise-empty schema option.

himanshumps commented 4 years ago

Any update on on how to do it in openapi 3.0. Is this the way to go for extensible enum in opanAPI 3.0

        type:
          type: string
          anyOf:
          - enum: 
            - PRODUCT_ARRANGEMENT
            - OFFERED_ARRANGEMENT
          - {}
sepanniemi commented 2 years ago

Any decisions on this?

DXist commented 2 years ago

Dummy string variant can be modeled as

"UnknownStringVariant": {
      "readOnly": true,
      "type": "string"
    }

Client should support this variant to maintain forward compatibility. Server should not allow this variant on deserialization / semantic validation layer.

And an enum can be represented as an alternative between two subtypes

"TheValues": {
      "anyOf": [
        {
          "$ref": "#/definitions/KnownTheValues"
        },
        {
          "$ref": "#/definitions/UnknownStringVariant"
        }
      ]
    }
thejeff77 commented 2 years ago

@DXist I'm trying to get this to work.

The tricky thing is that string values through an API are non-breaking unless you have a client that tries to serialize them as a generated enum model and doesn't have a value that was added. So the openapi spec turns generated (and extensible) enums into a breaking api change for anyone using generators that doesn't update clients on patch versions.

We have a spring boot kotlin app and adding this over an enum doesn't really do the trick for the client generators:

data class TheValues(
  @Schema(
      anyOf = [
        MyEnum::class,
        String::class
      ]
  )
  val myEnum: MyEnum
)

The result is just a typical enum generation being done.

The ideal result would generate classes that don't explode when an unknown value is passed in.

Similarly, running the kotlin-spring server generator on this gives a class like so:

/**
 * 
 */
class TheValues(

) {

}

Which isn't the most useful generated code.

Would be pretty nice to have an option for generated code that allows default on unknowns:

eg:

components:
  schemas:
    EnumExample: 
      type: string
      enum: 
        - HELLO
        - WORLD
        - PARSE_FAIL
      defaultOnUnknown: PARSE_FAIL
handrews commented 2 years ago

@thejeff77 this sounds more like a problem with the code generation tool not understand the schema than a problem with the schema.

dpashkevich commented 2 years ago

@DXist I second @handrews comment. You can implement a fallback for unknown enum values received from the API at the client level. This is actually exactly what openapi-generator does. Most implemented languages in openapi-generator support a enumUnknownDefaultCase property, see e.g. https://github.com/OpenAPITools/openapi-generator/blob/master/docs/generators/java.md#config-options

I do agree that it would be nice to have a standard way of defining this fallback in the OpenApi specification itself, vs hard-coding it in a specific client SDK, for tool interoperability's sake.

handrews commented 2 years ago

@dpashkevich this is more a JSON Schema thing than an OpenAPI thing, so such a keyword could be implemented as an extension vocabulary (OpenAPI itself implements a few additional keywords as a JSON Schema extension vocabulary). Tooling vendors would have to adopt the vocabulary, of course, but it would not require an update of either OpenAPI or JSON Schema to do so.

dpashkevich commented 2 years ago

@dpashkevich this is more a JSON Schema thing than an OpenAPI thing, so such a keyword could be implemented as an extension vocabulary (OpenAPI itself implements a few additional keywords as a JSON Schema extension vocabulary). Tooling vendors would have to adopt the vocabulary, of course, but it would not require an update of either OpenAPI or JSON Schema to do so.

Yes, I understand that there's always an option to use a vendor extension, but it would be great to eventually promote it to an official standard, if it's deemed universally useful. This whole github issue discusses a gap or limitation in the OpenAPI/JSON Schema standard in expressing whether an enum can grow in the future or not.

handrews commented 2 years ago

@dpashkevich my point is that people can take action now, and in doing so can demonstrate the usefulness and suitability of a proposal. We don't have to just debate it in purely theoretical terms and wait for the next big release. There have been other keywords that started as extensions, but now there's an actual mechanism for managing extensions and asserting that they are required in order to process a particular schema. That wasn't present before. So experimentation and implementation are a bit easier now.

JSON Schema vocabularies are intended to be more interoperable than simply being vendor-specific extensions. They are identified by URIs and can be recognized that way across implementations, rather than just having an x- prefix and hoping there's not a naming collision.

thejeff77 commented 2 years ago

@handrews I think if I'm reading your point correctly - we can add the support for default-if-unknown in specific client generators to demonstrate the usefulness and suitability of it before debating it as a more general keyword?

If that sounds right (correct me if I'm missing something) - I'm wondering - what needs to happen before this crosses the bounds of "purely theoretical" into common use of an extension that ends up as a global keyword?

Because as @dpashkevich is pointing out, this is already there in most generators. Which to me suggests it's been proven out as useful, combined with this thread about the more global issue of enum declaration needing to be extensible or non-breaking somehow needing a more general solution.

Of the ones I've found (only looked for a little while) are here:

An extensible-enum possibility seems more theoretical per @handrews 's point.. this seems to have more pitfalls around being pragmatic and makes the definition and codegen a bit more complex to swallow. Whereas adding an option to not break API consumers when string values are added to auto-gen'd enums - this prevents client issues more globally, and has been proven out to a larger extent in the community.

Although this all might be in the wrong context because this option is a generator option, not an option that's put in the openapi spec itself...

handrews commented 2 years ago

@thejeff77 someone needs to write up the spec for the keyword(s) involved, assign a URI to the vocabulary, publish it (somewhere, anywhere) so that people who want to use it can declare it in $vocabulary and implementations that want to support it can recognize it when it is so declared. And, of course, people need to implement it- probably best to provide a reference implementation. But there's no formal process that needs to be completed.

You can make a pull request here once you have a vocabulary specification if you want more people to see it.

Bennett-Lynch commented 2 years ago

I think the need for declaring enumUnknownDefaultCase at the spec-level is perhaps a symptom of poor tooling support for handling unknown enum values in general.

Rather than handling an unknown value by defaulting to a predefined constant (often, e.g., UNKNOWN), I argue it would be better to always guarantee the ability to expose the enum as a string, and supplement it with a true enum-based value, which may not be available (i.e., UNKNOWN or null) if the constant isn't recognized. This way consumers can benefit from normal enum semantics without the risk of complete data loss for new or unrecognized values.

For an example of this approach, see how the AWS Java SDK v2 handles exposing an enum-backed property for ChecksumAlgorithm. The generated object has two property accessors:

  1. checksumAlgorithm: ChecksumAlgorithm (enum, which may have a value of UNKNOWN)
  2. checksumAlgorithmAsString: String (guaranteed to always be available)

This way clients get to benefit from enums in the happy case, but still have a way to fetch or supply raw data for an unrecognized value without having to update their software version.

If the OpenAPI tooling took a similar approach, then there wouldn't be a need to define enumUnknownDefaultCase at the spec-level, and instead it would be a tool-based convention to offer a standard way to represent unknown values (supplemented by string accessors).

Lack of this support in OpenAPI has led me to declaring properties that should be enums as strings, and then declaring app-local enum conversions, losing the benefit of a shared spec.

michalgazda commented 1 year ago

defaultOnUnknown

We would love to see such option in openapi specification. We know that there is enumUnknownDefaultCase on client generator side, though firstly it is not working in Java-Spring generator, secondly, it pushes responsibility to client and makes it impossible to make it declared in the specification. We can use null value and nullable: true for the enum as per doc (https://swagger.io/docs/specification/data-models/enums/), however, having notion of unknown value possibility would be better.

Any chance it could be picked up for standard definition?

LasneF commented 1 year ago

😥 here we may miss the user experience , of the API designer as well as the Tooling implementer having enum with a value and a description is a definitive improvement of the documentation for clients

for sure we may handle it via anyOf / oneOf , and const keyword with description , but enum has been created to be a short cut to this complexity ,

same around extensible enum , it s a real world case, that would need to be tackle in an easy way . Then that is may be not a OAI topic rather than a Json Schema

thejeff77 commented 11 months ago

Ok I've been part of this discussion for a while, and am going to write up some specifications per @handrews guidance.

First, I'll summarize our discussions:

Second, I'll propose here first, for further discussion:

Current enum schema definition example looks like the below, supporting a "default" key for when it may be null. As it relates to enums, referencing the fixed-fields definition, "If the enum is defined, the value MUST exist in the enum’s values."

Existing Enum OAPI Definition Example

      "DirectionEnum" : {
        "type" : "string",
        "enum" : [ "FORWARD", "LEFT", "RIGHT", "REVERSE" ]
        "default": "FORWARD"
      }

Proposed Enum OAPI Definition Example

The proposal is to add a key fallback to a Schema-only enum definition. This value would indicate the value the enum will be set to when the case above is breached: when "the value MUST exist in the enum’s values". I.E. when the enum value does not exist in the enum's values, the fallback will be used.

This is meant to solve the tolerant-reader problem, as a value that does not map to an enum causes the parsing to break. This solution will allow the API creator to purposefully opt-in to using a value that exists in the enum array when the value does not exist within that array.

Example:

      "DirectionEnum" : {
        "type" : "string",
        "enum" : [ "FORWARD", "LEFT", "RIGHT", "REVERSE", "NOT_MAPPED" ]
        "default": "FORWARD"
        "fallback": "NOT_MAPPED"
      }

Examples

This pattern has been implemented as a client generator option in many clients today, and has been a long-sought after feature for openapi to support in some standardized way.

Of the ones I've found are here:

https://github.com/OpenAPITools/openapi-generator/issues/10038 java (see the enumUnknownDefaultCase property) https://github.com/finos/symphony-bdk-java/issues/592 typescript (see the enumUnknownDefaultCase property)

Note that Jackson treats the default the same as a fallback in its @JsonEnumDefaultValue annotations indicating the default value, however it seems somewhat unlikely that OAPI community would rather change their definition of default as it pertains to schema than support a new key, however I'm open to feedback about this.

Bennett-Lynch commented 11 months ago

@thejeff77 Excellent write up. My main concern with the proposal is that we're saying it would be too difficult to force a non-spec-based standardization across the few dozen of generators. That's fair and I can imagine the difficulty there. But in turn we're proposing a spec-based convention that thousands of developers will now need to become aware of and retroactively add to their existing specs. I see a few problems with this:

  1. Not every developer is aware of the nuances of extending enums and many will not know they need it until it becomes an issue. And once they are aware, it just becomes repeatable boilerplate needed for every enum.
  2. Clients are dependent upon services to author/update their specs correctly. This means as a consumer, with a generated client, you may be stuck with unsafe enum semantics until the service you're calling agrees to make changes.
  3. It still requires updates from every generator to support this new spec attribute.
  4. In the event there are programming languages with more native support for modeling unknown enums, we're preventing effective utilization of those constructs since we've explicitly extended our enums.
  5. Defining fallback values will likely result in varying and inconsistent conventions (e.g., "NOT_MAPPED" vs "UNKNOWN", etc.). It may also tempt users to fallback to a normal constant which I would argue is probably dangerous in the vast majority of cases.

All things considered, I think a tool-based convention is the cleanest long-term solution here, but I respect the desire to find something more tractable. This remains a severe paint point for anyone who uses enums in their APIs. Could the OpenAPI team perhaps help drive a tool-based convention/improvement here?

thejeff77 commented 11 months ago

@Bennett-Lynch thanks for the feedback.

Unfortunately what I'm hearing here (or not hearing here) is that there is no such thing as a tool-based convention.

If it exists, what is it? How is this done and what is an example of it being done prior?

Any spec improvements will likely be a lot of work and have some of those downsides for sure. So we'd want the right proposal.

As far as OAPI standard json proposals, I can think of two others that may work.. strictDefault: true|false, or perhaps changing the definition of default to be tolerant for schema enums.

As far as preventing bad api design.. personally I believe this is not the OAPI specification's job.

Assuming an english all uppercase UNKNOWN is always the best value here may be presumptuous of us. But maybe not? With a flag like strictDefault: false, the tolerant reader could use the default if the value does not exist in the enum.. then specifying another value would not be required. I can maybe write this up too if anyone likes that better

LasneF commented 11 months ago

@thejeff77 , as part of the discussion there was also a point about having the capability to have for enum a per value description that is currently a pain. it was mentionned leveraging x-ms-enum like