dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.5k stars 10.04k forks source link

OpenApi polymorphic types missing discriminators #57982

Open dnv-kimbell opened 1 month ago

dnv-kimbell commented 1 month ago

Is there an existing issue for this?

Describe the bug

I have some polymorphic types

[JsonPolymorphic(TypeDiscriminatorPropertyName = "$dis")]
[JsonDerivedType(typeof(Cat), typeDiscriminator: "cat")]
[JsonDerivedType(typeof(Dog), typeDiscriminator: "dog")]
public class Pet
{
    public string Name { get; set; } = default!;
}
public class Dog : Pet
{
    public string? Breed { get; set; }
}

public class Cat : Pet
{
    public int? Lives { get; set; }
}

This result in

"Pet": {
    "type": "object",
    "anyOf": [
        {
            "$ref": "#/components/schemas/PetCat"
        },
        {
            "$ref": "#/components/schemas/PetDog"
        },
        {
            "$ref": "#/components/schemas/PetBase"
        }
    ]
},
"PetBase": {
    "properties": {
        "name": {
            "type": "string"
        }
    }
},
"PetDog": {
    "required": [
        "$dis"
    ],
    "properties": {
        "$dis": {
            "enum": [
                "dog"
            ],
            "type": "string"
        },
        "breed": {
            "type": "string",
            "nullable": true
        },
        "name": {
            "type": "string"
        }
    }
},

This is the equivalent generated by Swashbuckle

"Pet": {
  "required": [
    "$dis"
  ],
  "type": "object",
  "properties": {
    "$dis": {
      "type": "string"
    },
    "name": {
      "type": "string",
      "nullable": true
    }
  },
  "additionalProperties": false,
  "discriminator": {
    "propertyName": "$dis",
    "mapping": {
      "dog": "#/components/schemas/Dog",
      "cat": "#/components/schemas/Cat"
    }
  }
},
"Dog": {
  "allOf": [
    {
      "$ref": "#/components/schemas/Pet"
    },
    {
      "type": "object",
      "properties": {
        "breed": {
          "type": "string",
          "nullable": true
        }
      },
      "additionalProperties": false
    }
  ]
},

Expected Behavior

The type Dog is renamed to PetDog; not expected.

Pet has the name property defined in code, but this shows up on PetBase and PetDoc in schema. I can't see how you can write code that operates on the Pet type and use the Name property.

Swashbuckle adds a discriminator section that is completely missing. OpenApi seems to be quite flexible in some areas; is this just another way of representing things. Code generator will have to handle both?

Steps To Reproduce

https://github.com/dnv-kimbell/openapi-inlineschema

Exceptions (if any)

No response

.NET Version

9.0 RC1

Anything else?

No response

captainsafia commented 1 month ago

The type Dog is renamed to PetDog; not expected.

This allows us to disambiguate between Dog used in a polymorphic hierarchy (where a discriminator property must be defined) versus Dog used on its own (which serializes directly without need for a discriminator).

Pet has the name property defined in code, but this shows up on PetBase and PetDoc in schema. I can't see how you can write code that operates on the Pet type and use the Name property.

Hm....can you clarify what you mean by this? I assume you're referring to being able to use the PetBase.Name property? As mentioned above, the distinction between Pet and PetBase is primarily around which one is part of the discriminated hierarchy and which type stands on its own.

Swashbuckle adds a discriminator section that is completely missing. OpenApi seems to be quite flexible in some areas; is this just another way of representing things. Code generator will have to handle both?

Your intuition is right here. The OpenAPI specification is a bit flexible in the way it defines support for discriminators in the spec. For our purposes, we've decided to stick true to the definition of the spec. Specifically, the this section of the specification, which states:

The expectation now is that a property with name petType MUST be present in the response payload, and the value will correspond to the name of a schema defined in the OAS document. Thus the response payload:

In sum, the spec requires that when you use the discriminator property every subschema must define the discriminator property.

In the type:

[JsonPolymorphic(TypeDiscriminatorPropertyName = "$dis")]
[JsonDerivedType(typeof(Cat), typeDiscriminator: "cat")]
[JsonDerivedType(typeof(Dog), typeDiscriminator: "dog")]
public class Pet
{
    public string Name { get; set; } = default!;
}
public class Dog : Pet
{
    public string? Breed { get; set; }
}

public class Cat : Pet
{
    public int? Lives { get; set; }
}

The Pet type doesn't define a discriminator and in this case STJ will determine what to do based on the JsonUnknownDerivedTypeHandling. To explicitly define the mapping, you can add a [JsonDerivedType(typeof(Pet), typeDiscriminator: "pet")] attribute so that the all possible subtypes are explicitly defined.

dnv-kimbell commented 1 month ago

I'm still very confused.

In the spec you reference, Cat, Dog and Lizard are referenced using oneOf. In the generated schema, it's anyOf.

The Pet type gets defined as any of PetCat, BetDoc and PetBase. On the client, I may want to write code that operates on the PetBase type. How can I figure out that PetCat inherits from PetBase and not PetDog? Is something magically encoded into the name since it ends with Base? What if you have deeper inheritance hierarchies?

Let's say you want to generate a .NET object model to handle these types; you want to specify the discriminator attribute. How can you figure out the name based on the generated schema? Is this again some magical naming based on properties that start with $? Different types may use different names for the discriminator.

The generated schema also uses a single valued enum for the discriminator; not seen this before.

The schema you currently generate may be technically correct, but very hard to read. With the Swashbuckle version, it's very clear what the subtypes of Pet are, and what the base type of Dog is.

captainsafia commented 1 month ago

In the spec you reference, Cat, Dog and Lizard are referenced using oneOf. In the generated schema, it's anyOf.

The discriminator property mappings are legal to use with any of the composite keywords:

The discriminator object is legal only when using one of the composite keywords oneOf, anyOf, allOf.

Is something magically encoded into the name since it ends with Base? What if you have deeper inheritance hierarchies?

Inheritance hierarchies are slightly different from polymorphic types. We don't do anything to model inheritance hierarchies in the implementation at the moment.

The generated schema also uses a single valued enum for the discriminator; not seen this before.

OpenAPI v3.0 doesn't provide support for communicating that a value is a constant. Single-valued enums are a strategy for defining constant values in the schema. In this case, it's used to indicate that for a given polymorphic PetDog the discriminator property must be dog.

The schema you currently generate may be technically correct, but very hard to read. With the Swashbuckle version, it's very clear what the subtypes of Pet are, and what the base type of Dog is.

Yes, technically correct is what we are striving for here. It allows the implementation to evolve with the JSON schema and OpenAPI specifications. There will be some growing pains as we sort out scenarios where alternative implementations took more liberties with the schema generated but the goal of the implementation is maximum compatibility with the specification.

At times, this might mean that we discover gaps that the specification doesn't address nicely (for example, the bitwise enums scenario we were discussing in another thread). In those cases, our focus is on working to make improvements to the underlying spec so that it's more explicit instead of codifying non-specced behaviors in our implementation.

Out of curiosity, it seems like you're trying to integrate the new tool into some existing set up. Is there a particular client generation tool that you're using that isn't playing nice with these scenarios or are you trying to implement something yourself?

dnv-kimbell commented 1 month ago

You would be correct in that we have an existing setup that we are looking at integrating this new functionality.

I work on the platform team for an in-house set of applications; 150 repositories, around 250 applications (web + windows services). Some of these have ancestry back to .NET 1.0 and ASMX. Around 60 of these applications expose api's consumed by other applications. We create nuget packages that wrap these api's making it easy for other applications to consume them. Most of our applications are now on .NET8 and we have started looking into .NET9; keeping things updated is important to us.

Our developer pool spans from senior developers to juniors straight out of school. With the number of applications we have, reducing developer friction and standardizing the way things are done is something we put some thought into.

On the server we have been using Swashbuckle and for the nuget packages, we have been using NSwag or manually coded.

When Microsoft announced additional support for OpenApi, we decided to investigate it to see if we could reduce some of our dependencies. Our plan is to use MS as the default, and Swashbuckle as a backup for the cases that MS doesn't support.

NSwag has existed for years; something that is reflected in the number of options available. For some things it seems to be most happy when using Newtonsoft and I'm not too keen on the way it uses exceptions for non-200 status codes.

We decided to create our own code generation tool that sets things up the way that makes most sense for us. Our first pre-release is based on what Swashbuckle produces. The plan was to wait to see how well it plays with the MS stuff before we started pushing it out further. I'm writing this tool, so all these details matter to me.

Out of curiosity, I downloaded the latest version of NSwag studio and used the schema from my repro. The generated code is lacking information, so it doesn't look like well established tools can process the current generated schema. I've removed some lines to make it more compact.

public partial class Pet
{
}
public partial class PetBase
{
    [System.Text.Json.Serialization.JsonPropertyName("name")]
    public string Name { get; set; }
}
public partial class PetCat
{
    [System.Text.Json.Serialization.JsonPropertyName("$dis")]
    [System.ComponentModel.DataAnnotations.Required(AllowEmptyStrings = true)]
    [System.Text.Json.Serialization.JsonConverter(typeof(System.Text.Json.Serialization.JsonStringEnumConverter))]
    public PetCatDis Dis { get; set; }

    [System.Text.Json.Serialization.JsonPropertyName("lives")]
    public int? Lives { get; set; }

    [System.Text.Json.Serialization.JsonPropertyName("name")]
    public string Name { get; set; }
}
public partial class PetDog
{
    [System.Text.Json.Serialization.JsonPropertyName("$dis")]
    [System.ComponentModel.DataAnnotations.Required(AllowEmptyStrings = true)]
    [System.Text.Json.Serialization.JsonConverter(typeof(System.Text.Json.Serialization.JsonStringEnumConverter))]
    public PetDogDis Dis { get; set; }

    [System.Text.Json.Serialization.JsonPropertyName("breed")]
    public string Breed { get; set; }

    [System.Text.Json.Serialization.JsonPropertyName("name")]
    public string Name { get; set; }
}
public partial class PolymorphicRequest
{
    [System.Text.Json.Serialization.JsonPropertyName("animals")]
    public System.Collections.Generic.ICollection<Pet> Animals { get; set; }
}
public partial class PolymorphicResponse
{
    [System.Text.Json.Serialization.JsonPropertyName("animals")]
    public System.Collections.Generic.ICollection<Pet> Animals { get; set; }
}
public enum PetCatDis
{
    [System.Runtime.Serialization.EnumMember(Value = @"cat")]
    Cat = 0,
}
public enum PetDogDis
{
    [System.Runtime.Serialization.EnumMember(Value = @"dog")]
    Dog = 0,
}
dnv-kimbell commented 1 month ago

I'm guessing nothing much will be happening on this before RTM in a month, so I've started exploring workarounds using the transformer infrastructure.

In an IOpenApiSchemaTransformer I can access polymorphic information, but they reference .NET types. The OpenApiSchemaStore is internal, so there doesn't seem to be way of mapping a type to a schema id.

In the IOpenApiDocumentTransformer, the provided document.Components is null. If that had values, the schema transformer could inject some internal reference syntax that we could then try to map correctly in the document transformer.

marinasundstrom commented 3 weeks ago

@dnv-kimbell I'm moving from NSwag, and I noticed a lot of differences in how nullability and inheritance is being deal with. So I'm also investigating making transformers to fill the gap. A kind of compatibility package. Generally, that is how I believe specific behavior should be added.

Even if this doesn't lead anywhere I learn a lot about the API and its quirks.

marinasundstrom commented 3 weeks ago

This is to highlight the difference between the new OpenAPI generator and NSwag.

This is what I get from inheritance as is:

(YAML is generated with an extension by Martin Costello)

components:
  schemas:
    Animal:
      required:
        - species
      type: object
      anyOf:
        - $ref: '#/components/schemas/AnimalDog'
        - $ref: '#/components/schemas/AnimalCat'
      discriminator:
        propertyName: species
        mapping:
          Dog: '#/components/schemas/AnimalDog'
          Cat: '#/components/schemas/AnimalCat'
    AnimalCat:
      required:
        - name
      properties:
        species:
          enum:
            - Cat
          type: string
        name:
          type: string
    AnimalDog:
      required:
        - foo
      properties:
        species:
          enum:
            - Dog
          type: string
        foo:
          type: boolean

This is what NSwag would have [roughly] produced:

(It is a reconstruction)

components:
  schemas:
    Animal:
      required:
        - species
      type: object
      discriminator:
        propertyName: species
        mapping:
          Dog: "#/components/schemas/Dog"
          Cat: "#/components/schemas/Cat"
      x-abstract: true
    Cat:
      allOf:
        - $ref: "#/components/schemas/Animal"
        - type: object
          additionalProperties: false
          properties:
            name:
              type: string
    Dog:
      allOf:
        - $ref: "#/components/schemas/Animal"
        - type: object
          additionalProperties: false
          properties:
            foo:
              type: boolean

The most notable thing here is the use of allOf to indicate that a subtype inherits a schema.

This might be an opinionated approach. But this should be taken into account. Without it people won't move over to use this API.

There are also extension like x-abstract that add some extra information for generators like NSwag.

marinasundstrom commented 3 weeks ago

Here is a prototype of transformer, some for inheritance the NSwag-style. There are limitations in the Open API API/infrastructure.

https://github.com/marinasundstrom/NullabilityTransformersPrototype

https://github.com/marinasundstrom/NullabilityTransformersPrototype/blob/f36b3a14df2030b836c1ce87da12805a87b7c6e5/WebApi/Extensions/Transformers.cs#L29

Also https://github.com/RicoSuter/NJsonSchema/issues/1739.

dnv-kimbell commented 2 weeks ago

@captainsafia this is additional information for your question in #58406 about missing information.

Lets say you want to generate code like this

[JsonPolymorphic(TypeDiscriminatorPropertyName = "$dis")]
[JsonDerivedType(typeof(Cat), typeDiscriminator: "cat")]
[JsonDerivedType(typeof(Dog), typeDiscriminator: "dog")]
public class Pet
{
    public string Name { get; set; } = default!;
}
public class Dog : Pet
{
    public string? Breed { get; set; }
}

public class Cat : Pet
{
    public int? Lives { get; set; }
}

Based on the information we get from Swashbuckle,

"Pet": {
  "required": [
    "$dis"
  ],
  "type": "object",
  "properties": {
    "$dis": {
      "type": "string"
    },
    "name": {
      "type": "string",
      "nullable": true
    }
  },
  "additionalProperties": false,
  "discriminator": {
    "propertyName": "$dis",
    "mapping": {
      "dog": "#/components/schemas/Dog",
      "cat": "#/components/schemas/Cat"
    }
  }
},

It's very easy to generate the code since all the information is located in the same place. Now let's have a look at what the Microsoft implementation produces.

"Pet": {
    "type": "object",
    "anyOf": [
        {
            "$ref": "#/components/schemas/PetCat"
        },
        {
            "$ref": "#/components/schemas/PetDog"
        },
        {
            "$ref": "#/components/schemas/PetBase"
        }
    ]
},
"PetBase": {
    "properties": {
        "name": {
            "type": "string"
        }
    }
},
"PetDog": {
    "required": [
        "$dis"
    ],
    "properties": {
        "$dis": {
            "enum": [
                "dog"
            ],
            "type": "string"
        },
        "breed": {
            "type": "string",
            "nullable": true
        },
        "name": {
            "type": "string"
        }
    }
},

When I scan through the schemas, I come across one that contains multiple entries under anyOf. Does this mean that it will always be polymorphic? If anyOf can be used by other constructs, how does one identify polymorphic? I sure don't know OpenApi that well, so it becomes a crap shoot if we can handle all cases.

How does one determine the name of the discriminator? Do you look at all the schemas referenced from anyOf and see what property is common and starts with $?

PetBase appears out of nowhere. How do we work around that? Find a type that ends with 'Base', then remove those properties from all types in anyOf that is not the base, then add them to the type that lists anyOf?

Looking through our codebase, we have a limited number of services using polymorphism; for these we are sticking with Swashbuckle.