Fix issue with modeling inheritence

xirzec commented 2 years ago

Cat and Dog have an Animal base model but we generate type AnimalUnion = Animal | Cat | Dog instead of type Animal = Cat | Dog

In these cases the base model (e.g. Animal) contains common properties from the server, but never is actually returned. This is a workaround for the lack of anyOf and oneOf in the modeler.

For now an x-ms- property that omits the base type would be sufficient.

craigsmitham commented 2 years ago

There is an additional problem if you just omit the base type (Animal) from the union, because the sub-types are defined as an intersection between their properties and the base class properties (type Cat = Animal & { kind: 'cat'; meows: boolean; }). This is problematic for type narrowing, because it results in a type Cat = { kind: 'animal' | 'cat' | 'dog', meows: boolean; }.

If it's modeled using allOf, I think the correct way to model it is still to include the base type. Whether or not the base type is returned from the server may or may not be the case. It's entirely possible it could be. What's more of an issue is ensuring that the discriminator is properly generated. Here is my understanding of the current state of how the above scenario is currently being generated (for the allOf case, with kind being the discriminator property):

type AnimalUnion = Animal | Cat | Dog;

type Animal {
   kind: 'animal' | 'cat' | 'dog';
   species: string;
}

type Cat = Animal & { kind: 'cat', meows: boolean; }

type Dog = Animal & { kind: 'dog', barks: boolean; }

Instead it should be (in order to be able to narrow/infer types with TypeScript):

type AnimalUnion = Animal | Cat | Dog;

type Animal {
   kind: 'animal';
   species: string;
}

type Cat = Omit<Animal, 'kind'> & { kind: 'cat', meows: boolean; }

type Dog = Omit<Animal, 'kind'> & { kind: 'dog', barks: boolean; }

The important changes above are: 1) the `kind` discriminator for the base type should just specify the type for itself, it should not concern itself with modeling the union of possible kinds - that is done via the union type. It's still important to have the base type in the generation and not omit it, so that any shared inherited properties are shared and if the API does return it, it can be handled by the client. 2) The `kind` discriminator is omitted from the inherited type when it is used to define the sub types

In my situation I want to enable type narrowing by having properly modeled inheritance hierarchy, like the one above. In my case I would probably opt for using oneOf if it were a capability of Autorest, because I agree base types on the server are more used as a work-around to describe union types and not usually returned to the client. In absence of the oneOf capability, I still think I could get the allOf approach to work, but it need to support the ability to do type narrowing, which would be a huge benefit to the developer experience.

xirzec commented 2 years ago

Inheritance doesn't exist in JSON, so there's no reason to model it in our interfaces, we can inline shared properties:

type Animal = Cat | Dog;

interface Cat {
   kind: "cat"
   mammal: boolean;
   meows: boolean;
}

interface Dog {
   kind: "dog"
   mammal: boolean;
   barks: boolean;
}

This is the cleanest and most obvious conversion that is fully compatible with oneOf. If we really want to re-use model shapes as a form of inheritance, I think it would look like this:

type Animal = Cat | Dog;

interface BaseAnimal {
    mammal: boolean;
}

interface Cat extends BaseAnimal {
   kind: "cat"
   meows: boolean;
}

interface Dog extends BaseAnimal {
   kind: "dog"
   barks: boolean;
}

This avoids the problems you specify above and doesn't pollute the "Animal" type with some abstract base Animal that doesn't actually exist in the service's public shapes.

craigsmitham commented 2 years ago

I agree the type Animal = Cat | Dog; would be appropriate when seeking to model oneOf/anyOf. I wish we had oneOf/anyOf support, but that doesn't seem to be the scope of this issue.

If we are going to infer an inheritance hierarchy using allOf, to be consistent I think one of the options that can be returned in the union type should be the base class. It may or may not be abstract, depending on the individual API implementation.

The more I think about this though, is modeling hierarchical inheritance appropriate at all when using allOf? It seems like a nice way to share properties within API definitions, but is not intended to model hierarchical inheritance.

Bottom line: I'm all for the original proposed solution for the case when allOf is used to model inheritance (e.g. a x-ms- property). However, I just want to highlight he need to also ensure the discriminator field is generated to support type narrowing (as @xirzec 's comment appears to do).

xirzec commented 2 years ago

I agree the type Animal = Cat | Dog; would be appropriate when seeking to model oneOf/anyOf. I wish we had oneOf/anyOf support, but that doesn't seem to be the scope of this issue.

Fair. I think the heart of my comments (despite my poor issue title) is I would like to support service devs who are working around the lack of anyOf/oneOf. So I want them to be able to write a swagger (with what we support today) that produces output that looks like they weren't working around a lack of support on our end. 😄

The more I think about this though, is modeling hierarchical inheritance appropriate at all when using allOf? It seems like a nice way to share properties within API definitions, but is not intended to model hierarchical inheritance.

I agree that we are abusing the way allOf works today. I think it was intended to just be a shorthand to make swagger files less overwhelming than repeating properties.

bterlson commented 2 years ago

The more I think about this though, is modeling hierarchical inheritance appropriate at all when using allOf? It seems like a nice way to share properties within API definitions, but is not intended to model hierarchical inheritance.

I believe that allOf is explicitly not inheritance. The spec at least strongly implies this in a few cases.

I call what is being proposed here Johanian inheritance, as it's distinct from pure OA3 semantics and is designed to e.g. make adding variants to a union a non-breaking change. FWIW I'm not yet convinced this that good of a practice considering its deleterious effects on e.g. control flow narrowing based on discriminants. @Johanste may have additional thoughts here.

johanste commented 2 years ago

There is a class of APIs where new "kinds" of a given abstract resource type are added over time. In the pet store example, that is the type of pets. In Azure, such examples include the type of data source that you ingest data to for anomaly detector. In ARM, that is new resource types. Or a new system event in event grid. And so on. The canonical pattern for this is a discriminated (tagged) type with a "kind" field indicating what specific subtype you are working with.

If these resources are persisted, it is very challenging to hide the newly introduced resource types from old clients (which is the base strategy for versioning Azure service APIs) - when you list the data sources in the anomaly detector, you don't want to get an empty list just because your client was written before the data source was introduced.

The problem with openapi/swagger is that there is an assumption that the universe of types is known/all types are listed. This does not play nice with new types being added in the future - even if the service description can be updated, all clients that used earlier versions of said service description would still have to be tolerant to receiving the new types. Which means that they must be prepared to handle it one way or another.

There are a couple of strategies that have been used; return the common base type with all new type specific information sliced off, return the common base type with a separate bucket of additional data, return a predefined "other" type that is loosely typed etc.

The "other" type pattern can either be implemented by the service (effectively, it will instead of sending {"kind": "snake", ...} to an old client, it will send { "kind": "other", "kindName": "snake", ...}, and have a pet type "other" defined in the service definition), or it can be implemented by the client. The problem with the service doing it is that if you didn't define the "other" type beforehand, you are stuck. For a client library that returns the sliced off base class when presented with an unknown discriminator value, we can generally add the mapping locally in a non-breaking fashion once the problem is introduced.

bterlson commented 2 years ago

The other idea is interesting... Rather than generating either type AnimalUnion = Animal | Cat | Dog or type Animal = Cat | Dog, we can generate type Animal = Cat | Dog | OtherAnimal; where type OtherAnimal = BaseAnimal & { additionalProperties: Record<string, unknown> }? I might be ok with this kind of projection only in cases where we come across an unknown discriminant...

bterlson commented 2 years ago

Unfortunately the challenge with this approach is that if BaseAnimal has a type: string property, then this approach still breaks control flow based narrowing because if (animal.type === 'cat') matches the 'cat' and string case.

I wonder if we could get away with reserving a discriminant value of 'other' to use in this case, so we can have animal.type === 'other' and define type OtherAnimal = BaseAnimal & { wireType: string; additionalProperties: Record<string, unknown> }? @Johanste any thoughts on this approach?

(the name wireType is definitely open for bike shedding, this is the value of the discriminant field as it appeared on the wire)

johanste commented 2 years ago

An additional requirement that I'd like to see met is that, as you upgrade to a new version of your library - e.g. one that understands the new animal, your code would not break. Otherwise, we'd have to be careful about addition of new model types being considered breaking changes. Hopefully that doesn't bring us all the way back to the event grid style type fences that @bterlson dislikes...

bterlson commented 2 years ago

Ahh yeah, this ruins the entire proposal. Hmm.

craigsmitham commented 2 years ago

Should the addition of a new member to a union type be considered a breaking change?

1) it doesn't break the run-time behavior of existing code and 2) in the scenario of using using a switch and narrowing types, the consuming client could have coded using a default case in a way to handle future types. If they don't use the default case, then they want to be notified ("broken") if additional types are added. This seems to be a nice middle ground and shouldn't be considered a "breaking" change. Feature not a bug.

bterlson commented 2 years ago

In a world where we don't have the base type participating in the union in some way (e.g. as an item in the union, or the other approach I sketched), adding an item to a union that is an output type is potentially breaking to runtime code. Consider:

if (obj.type === 'Dog') {
   // obj is narrowed to dog
} else {
    // obj is narrowed to cat
}

If we add 'lizard', the else block will think a lizard is a cat.

In general, if a user needs to take code changes in order to adopt a new library, we tend to consider that breaking, which I think aligns with semver which would require a major version in such cases.

I agree with you though that, also in general, updating to a new major and fixing red squiggles is a viable alternative to e.g. adding new and possibly confusing client shapes to allow a seamless upgrade.

craigsmitham commented 2 years ago

Your example makes sense. I can see how a simply adding a member to a union type would break existing code, and it's reasonable to consider any required code fixes to be a breaking change requiring a semver major version bump.

In my case, I'm simply concerned more about developer ergonomics for a continuously (non-versioned) internal API, so I wanted type narrowing provided by simple TypeScript union types. However, I can see how this poses challenges when trying to support backwards compatibility.

Perhaps assertion functions could be generated for each possible response type? e.g. function isCat({ kind: 'cat' | 'dog' }: animal): animal is Cat. That would read nicely, but requires additional imports that may not be intuitive.

bterlson commented 2 years ago

The type guards approach is what @johanste alluded to above with

Hopefully that doesn't bring us all the way back to the event grid style type fences that @bterlson dislikes...

My main issue with this approach is what you called out - without documentation and probably explicit samples showing it, I don't see people realizing they can do this, and so they'd just hobble along with the less-than-ideal but version tolerant API they are given.

craigsmitham commented 2 years ago

What about this? @bterlson this is inspired by your use of other value in the discriminator in your earlier comment. Every sub-type discriminator should include the kind(s) of its ancestors, instead of the ancestor discriminators including the possible kinds of its descendants. This supports idiomatic union type narrowing and doesn't break when new types are added to the union type.

interface Animal {
  kind: 'animal'
  mammal: boolean;
}

type Cat = Animal & {
  kind: 'cat' | 'animal' ,
  meows: boolean;
}

type Dog = Animal & {
  kind: 'dog' | 'animal';
  barks: boolean;
}

type AnimalResponse = Animal | Cat | Dog;

craigsmitham commented 2 years ago

@bterlson @xirzec any thoughts on above approach for modeling inheritance when using allOf? It allows for idiomatic use of type narrowing with union types and doesn't break when additional types are added as possible subtypes, while avoiding the verbose event grid style type fences.

xirzec commented 2 years ago

@bterlson @xirzec any thoughts on above approach for modeling inheritance when using allOf? It allows for idiomatic use of type narrowing with union types and doesn't break when additional types are added as possible subtypes, while avoiding the verbose event grid style type fences.

I'm still a little confused by the above example since my understanding is that the server never sends kind: 'animal' - it instead will start sending some new type (e.g. kind: 'lizard') and if we have AutoRest map this unknown kind into something generic (like 'animal'), that won't help us when AutoRest regenerates later with proper lizard support (the dev will need to change their code to detect it.)

Also when I try to use your example in the playground it appears to narrow AnimalResponse to being exactly animal with no other kinds permitted:

benlongo commented 1 year ago

It's unfortunate the generated types don't allow narrowing, that's critical to my use case. I don't think the framework should concern itself with updating behavior - it should faithfully model what the openapi spec says and nothing more.

jrpedrianes commented 1 year ago

Is there any workaround to avoid the current behaviour?

xirzec commented 1 year ago

@jrpedrianes not without wrapping the generated client and doing something clever.

@bterlson how does CADL solve this problem? I feel like the answer for an extensible union has to be in the implicit existence of an unknown kind.

Azure / autorest.typescript

Fix issue with modeling inheritence #1132