OAI / OpenAPI-Specification

The OpenAPI Specification Repository
https://openapis.org
Apache License 2.0
28.86k stars 9.07k forks source link

Support for polymorphic types is unclear in the spec #403

Closed daveismith closed 4 months ago

daveismith commented 9 years ago

While trying to work through apigee-127/swagger-tools#241 it became clear that the spec is not clear on how polymorphic types are intended to work. @webron will add more details.

webron commented 9 years ago

In 2.0 we changed how inheritance is declared, but it seems that it is not well defined.

Take the following example from the spec:

{
  "definitions": {
    "Pet": {
      "discriminator": "petType",
      "properties": {
        "name": {
          "type": "string"
        },
        "petType": {
          "type": "string"
        }
      },
      "required": [
        "name",
        "petType"
      ]
    }
  },
  "Cat": {
    "description": "A representation of a cat",
    "allOf": [
      {
        "$ref": "#/definitions/Pet"
      },
      {
        "properties": {
          "huntingSkill": {
            "type": "string",
            "description": "The measured skill for hunting",
            "default": "lazy",
            "enum": [
              "clueless",
              "lazy",
              "adventurous",
              "aggressive"
            ]
          }
        },
        "required": [
          "huntingSkill"
        ]
      }
    ]
  },
  "Dog": {
    "description": "A representation of a dog",
    "allOf": [
      {
        "$ref": "#/definitions/Pet"
      },
      {
        "properties": {
          "packSize": {
            "type": "integer",
            "format": "int32",
            "description": "the size of the pack the dog is from",
            "default": 0,
            "minimum": 0
          }
        },
        "required": [
          "packSize"
        ]
      }
    ]
  }
}

The Pet defines a discriminator field, petType. The discriminator field is meant to help some strongly typed languages identify how to map the input to the right class.

The problem is that for the discriminator field to work, we need to know what value is assigned to that field for each instance of its inheritor. This part is currently not well defined in the spec or the samples.

I can think of four alternatives to solve it:

  1. The value of the discriminator can be implicit by the model definition name under definitions. There are two problems here - this will not work with inline-defined models (though may not be relevant), and may not work with some remotely referenced models as well. This also forces the model name to be the type which is not always desirable.
  2. The value of the discriminator needs to be explicitly overridden in the inheritor's definition. This specific option is ugly, verbose and error-prone, but allows the flexibility. Again, if not declared, the fallback could still be option 1 (as in, the model name as defined). (example below)
  3. Like the first option, use the title property of the schema. The problem is that the usage of title in general is not well defined in the spec and this may end up being abuse of actual intent.
  4. The last option is to add a new field, like we added the discriminator, such as class (or whatever), and its value would determine the value, in case the user doesn't want to assume the model name used under definitions. While this is a more elegant solution, it is technically a breaking change in the spec (even though it only adds, and doesn't mutate/remove from it). (example below)

Example for option 2:

Dog:
  description: A representation of a dog
  allOf:
  - $ref: '#/definitions/Pet'
  - properties:
      petType:
        type: string
        enum:
          - dog
      packSize:
        type: integer
        format: int32
        description: the size of the pack the dog is from
        default: 0
        minimum: 0
    required:
    - packSize

Example for option 4:

Dog:
  description: A representation of a dog
  class: dog
  allOf:
  - $ref: '#/definitions/Pet'
  - properties:
      packSize:
        type: integer
        format: int32
        description: the size of the pack the dog is from
        default: 0
        minimum: 0
    required:
    - packSize

Other considerations to take into account:

casualjim commented 9 years ago

I want to caution against doing abstract classes. That takes a very language specific view of the world. Not all languages have classes or inheritance. Most of them have polymorphism but classes and especially hierarchies with abstractness etc are only available in a small subset of languages. Java does happen to be one of them.

~~I think inheritance and the structuring thereof is well outside the realm of describing remote API's. Even the discriminator field is questionable to have in the spec, this seems to be an application specific something and not really the API.~~

languages that don't support it can just ignore it

toomuchpete commented 9 years ago

One issue with the examples above is that it doesn't allow multiple values to map onto the same subclass, which could lead to unnecessary duplication within the swagger file. A potential 5th option would be to define the mapping on the parent class, so something like this:

{
  "definitions": {
    "Pet": {
      "discriminator": "petType",
      "discriminatorMap": {
        "cat": {"$ref": "#/definitions/Cat"},
        "dog": {"$ref": "#/definitions/Dog"}
      },
      "properties": {
        "name": {
          "type": "string"
        },
        "petType": {
          "type": "string"
        }
      },
      "required": ["name","petType"]
    }
}

This also has the advantage of being able to construct the entire inheritance tree from the same place where you see the discriminator defined, rather than having to search all defined schemas to see which of them inherit from the parent class.

Also, w/r/t option 4, I find class to be a fairly unintuitive name for that value. It seems like it could be something more straightforward like discriminatorValue

dolmen commented 9 years ago

There is a 6th option (completely compatible with the existing JSON schema): using enum constraints for matching. That would allow to separate the model name from the discriminator property value. In the example below, the petType is in lowercase while the model name is capitalized.

    "Cat": {
          ...
          properties: {
            "petType": {
              "type": "string",
              "enum": [
                 "cat"
              ]
            },
          ...
        },

Full document:

{
  "definitions": {
    "Pet": {
      "discriminator": "petType",
      "properties": {
        "name": {
          "type": "string"
        },
        "petType": {
          "type": "string"
        }
      },
      "required": [
        "name",
        "petType"
      ]
    },
    "Cat": {
      "description": "A representation of a cat",
      "allOf": [
        {
          "$ref": "#/definitions/Pet"
        },
        {
          "properties": {
            "petType": {
              "type": "string",
              "enum": [
                 "cat"
              ]
            },
            "huntingSkill": {
              "type": "string",
              "description": "The measured skill for hunting",
              "default": "lazy",
              "enum": [
                "clueless",
                "lazy",
                "adventurous",
                "aggressive"
              ]
            }
          },
          "required": [
            "huntingSkill"
          ]
        }
      ]
    },
    "Dog": {
      "description": "A representation of a dog",
      "allOf": [
        {
          "$ref": "#/definitions/Pet"
        },
        {
          "properties": {
            "petType": {
              "type": "string",
              "enum": [
                 "dog"
              ]
            },
            "packSize": {
              "type": "integer",
              "format": "int32",
              "description": "the size of the pack the dog is from",
              "default": 0,
              "minimum": 0
            }
          },
          "required": [
            "packSize"
          ]
        }
      ]
    }
  }
}
webron commented 9 years ago

@dolmen - that's option 2...

dhoelle commented 8 years ago

From the perspective of a polymorphic API designer (not a tool builder), I would prefer options 1 & 4. That is, I would prefer that the discriminator is implicitly considered to be the model definition name as defined under definitions (perhaps lowercase'd), unless an explicit class (or discriminatorValue or whatever) is defined.

That said, I'm not sure I understand @toomuchpete's critique. @toomuchpete: can you give an example where multiple values would map onto the same subclass, requiring unnecessary duplication?

@webron: how do you see proceeding with this issue? I would love to define my polymorphic API with Swagger!

webron commented 8 years ago

We may implement support for option 4 in a non-breaking way by using a vendor extension x-class, however I can't give more details as to if/when it would happen.

dhoelle commented 8 years ago

It seems to me that option 4 is an extension of option 1. If option 4 is the decided goal, and option 1 is non-breaking, could we declare option 1 to be the currently supported option (that doesn't support inline-defined models and remotely referenced models), then add support for additional use-cases through the addition of x-class later?

Serious apologies if this is a noob comment and I am misunderstanding the effects of the change.

webron commented 8 years ago

In a way, I believe we have to assume option 1 anyways for existing implementations.

toomuchpete commented 8 years ago

With respect to APIs not yet created but that are dedicated to using Swagger, this discussion mostly boils down to the user-friendliness of Swagger itself. The APIs can be crafted around whatever Swagger's limitations are.

However, if the intent is that Swagger be used to describe a large set of existing APIs, care needs to be taken not to make an API impossible to define in Swagger.

If used by itself, option 1 above does this. Consider a simple case where two different types of objects in the same API are polymorphic and their discriminator values overlap. In a small API this might be unlikely, but in larger APIs, fields like "type" and "status" can often have very repetitive values even across different objects. Option 1 requires the overlapping child definitions be located at the same location in the Swagger. It's not particularly suitable for many existing, real-world APIs.

To address @dhoelle's question about Option 2: I didn't fully flesh out this thought in my comment above. It definitely solves the problem of mapping multiple values to a single definition. My concern with option 2 is that it raises all of the issues that JSON Schema's "oneOf" with the added bonus of needing to access every definition in the Swagger document to determine the result and it leaves undefined the result of multiple definitions that match a particular discriminator value. This is already a bit of an issue with using "allOf" for inheritance, since there's no way know which of the schemas in that array is the real "parent".

If our goal is to address the lack of clarity in the spec without making any breaking changes, then I suppose some combination of options 1 and 2 are the ones that we'd have to choose. For that, I like option 2 without the option 1 fallback. IMO, if the value of the discriminator isn't found in a subclass's definition, the parent class definition should be used by itself. That reduces the possibility of weird and potentially dangerous bugs with data values being able to link into other places within the document.

Both option 1 and option 2 do have serious deficiencies, though, and I'd love to see a v2.1 even if all that does is address the polymorphism complexity. If not, I hope this topic will be revisited in whatever the next version of Swagger is.

An explicit mapping would make the author's intent clear, it would make implementation of parsers and other systems far less complex, and it allow the system to remain flexible. There are definitely other reasonable approaches, though.

For now, I'll probably need to use some custom extensions for an active project. Once I have a swagger doc I can share publicly I'll post a link here.

arnested commented 8 years ago

Hi,

@wing328 pointed me to this issue. I haven't read through all the comments yet and not sure I understand all the edge cases yet either.

I just wanted to point you towards some work I did on implementing polymorphism in the PHP client: https://github.com/swagger-api/swagger-codegen/compare/master...arnested:php-polymorphism-client (please note that it requires a snapshot release of swagger-parser because of https://github.com/swagger-api/swagger-parser/pull/128).

As I wrote I haven't read/understood every aspect of this yet, but for my (possibly limited) use case the above fork/branch solves our problem of having an API endpoint return a specific class (with a discriminator property defined) and all subclass inheriting from it (through allOf).

I'll try to look closer in to this issue to see if I have something valuable to add. I might be limited in time by a work deadline.

wing328 commented 8 years ago

FYI. Swagger parser 1.0.12 has been released several hours ago: https://github.com/swagger-api/swagger-parser/releases

arnested commented 8 years ago

Thank you for the hint, @wing328. I've updated my fork/branch.

webron commented 8 years ago

Tackling PR: #741

lafrech commented 7 years ago

From what I understand, inheritance relation is defined by allOf: the fact that Cat has a allOf field with a ref to Pet is interpreted as "Cat inherits from Pet" and this is the one and only way to represent inheritance in the spec. Then, in case of polymorphism, discriminator points to the right schema.

I might be missing something, but what about a case where the child not only adds but also overrides properties? I don't have a pet-store-simple example. While overriding methods is frequent in OOP, overriding attributes is less common in my limited experience but legitimate AFAIK.

In this case, Child still inherits from Parent but the spec can't read

  "definitions": {
    "Child": {
      "type": "object",
      "allOf": [
        {
          "$ref": "#/definitions/Parent"
        }
        [...]

because some attributes from Parent are redefined (different type, validators (range,...), properties (readOnly),...) so Child fields can't validate against Parent schema.

In this case, I don't know how to represent inheritance.

The easy solution that comes to mind is to use a x-inherits or x-childOf attribute to define inheritance when allOf can't.

Is the use case I'm presenting illegitimate? Is there another way to deal with it already?

FWIW, related post on SO: http://stackoverflow.com/a/42474248/4653485.

(With a inherits/childOf attribute introduced, I'd be tempted to go even further and rely solely upon it to define inheritance, in which case allOf would just be a tool to avoid redundancy, with no semantic meaning. After all, I can also imagine a class having all the attributes from another one without any inheritance link. Currently, I can't use allOf there to cut redundancy because there is no actual parent/child relationship between the two classes. Obviously, this would be a totally breaking change, probably not justified by the limited potential benefit.)

ePaul commented 7 years ago

@lafrech allOf in JSON schema means that the object to be validated must be valid for all of the listed schemas. So you can't have an object of a sub type which is not also a valid object of the parent type. If you have properties with same name but different type in both, there will be no objects matching both schemas, so it is basically an empty type. If you have validations, they simply are merged (i.e. all of them must apply).

lafrech commented 7 years ago

Thanks @ePaul for your comment.

I totally understand that. It means that

daniil-dubin commented 7 years ago

So far as follows from here https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#composition-and-inheritance-polymorphism only a half of "variant 2" has been included into spec, the one which demands coincidence between discriminator value and the definition's field name. Which requires discriminator value to be unique among all the definitions and is highly impractical as @toomuchpete said. Also it doesn't allow you to derive schema hierarchy from the swagger definition, unless you make an assumption that if schema has a single reference in allOf section, then this schema is sub-type of one that is referenced. Which is a weak assumption because allOf was primarily designed for composition not specifying hierarchies.

Would it be legitimate for me to take the approach of specifying discriminator values via enums instead? And what about allOf and field overriding? Is it allowed? From the first glance looks like it isn't.

brocoli commented 7 years ago

+1 for using enum to specify the child's discriminator value. Without this we're sort of leaking implementation details when we require the string value of the discriminator be the same as the "nice name" of a definition.

As for using the enum in the parent class... well:

If we include this information in the parent class, then developing a child class requires modifying the parent class. However, if the information is only in the child class, then you don't have control over which classes are acceptable -- simply adding a new definition can effectively change the API of multiple paths.

I'm starting to think the root of this problem is that we can't specify parameters as anyOf. If we could, then there would be no need for discriminator for many applications. You could just declare the "child classes" with allOf, and then list the ones you want to accept in the various paths of your API.

Let's say Cat, Dog and HoneyBee are Pets, but because of regulations due to the decline in bee populations, we have to provide separate endpoints for HoneyBees now (let's say we want our gateway to load balance differently for them). If we use anyOf in parameter definitions, then we only have to remove the HoneyBee entry from our existing paths. If we use something akin to discriminator instead, then we have to make it so HoneyBee is not a Pet anymore, which doesn't make much sense.

On top of the anyOf solution we could add support for discriminator for reifying that type information must be passed by the clients in requests. All we would need is a string in the "parent" class specifying what's the name of the discriminator parameter, and then in the "child classes" a definition of this parameter is required, and it has to have an enum entry with at least one value. If discriminator definitions are not compatible, or if two classes define the same enum value for their discriminator, then it's an error in the API.

You 'd need this discriminator if you want to generate code in a language without much support for polymorphism or union types, but for those languages that don't need it, you don't necessarily need the discriminator. This would help a lot with documenting existing APIs.

fehguy commented 7 years ago

Please take a look at https://github.com/OAI/OpenAPI-Specification/pull/1093

debri174 commented 6 years ago

I'm looking for docs tooling that displays polymorphism either as defined in version 2 spec or version 3 spec.

In my research the only docs tooling that handles and displays polymorphism is Swagger UI using the v3 spec.

The other docs tools that I've come across only support version 2 of the spec but I can't seem to find any that handle polymorphism as defined in the v2 spec.

Any help is appreciated!

MikeRalphson commented 6 years ago

In addition to swagger-ui, Rebilly ReDoc supports discriminator based polymorphism, and recent alphas support OpenAPI 3.0.x

raderio commented 6 years ago

https://github.com/Rebilly/ReDoc/blob/master/docs/images/discriminator-demo.gif

sebastien-rosset commented 4 years ago

Related to this ambiguity, the spec is not very clear about how to represent lists of polymorphic types. For example, suppose the "Pet Store" API can return a list of pets:

store:
  properties:
    pets:
      description: The collection of pets in this store
      type: array
      items:
        - $ref: '#/components/schemas/Pet'

The intent of the API designer could be one of:

  1. Return a non-polymorphic list of pets, i.e. just the pets, not dogs and cats
  2. Return a polymorphic list of pets, where each item in the array is a "subclass" of pet, i.e. list of dogs, cats, possibly "pet"

How do you capture this intent in a OpenAPI spec? How do you specify any "subtype of"? For 2), the expectations are not very clear. The API designer may have the following approach in their mind:

  1. The "pets" property is a list of '#/components/schemas/Pet',
  2. The Pet schema specifies "petType" as a discriminator,
  3. The API server returns a heterogeneous list of pets, where each "petType" in the list is the value of the specialized type.

With this approach, the API designer might expect the client to parse the value of the discrimanator, resolve the type from the OpenAPI spec, and instantiate the proper implementation type, for each item in the array.

Alternatively, "oneOf" (as shown below) is one way to explicitly list all possible types, but it gets unwieldy when there are many subtypes. In addition, it seems the tooling designer would not be able to leverage polymorphic features from programming languages, such as interfaces and classes.

store:
  properties:
    pets:
      description: The collection of pets in this store
      type: array
      items:
        oneOf:
          - $ref: '#/components/schemas/Pet'
          - $ref: '#/components/schemas/Dog'
          - $ref: '#/components/schemas/Cat'
Yingsheng-eroad commented 4 years ago

Can I use a defined enum property as discriminator? "PetType": { "type": "string", "enum": [ "CAT", "DOG" ] }

handrews commented 4 months ago

This was fixed by a PR in 2017, so I'm closing it. To answer the three subsequent questions: