[Multiple languages] properties are generated as nullable/optional, despite required/nullable usage in OAS schema

bkoelman commented 9 months ago

Kiota ignores the usage of required/nullable in OAS schemas. C# properties are always generated as nullable. This results in unneeded noisy squiggles, as shown here.

NSwag does a better job, by performing the following translation (using the /GenerateNullableReferenceTypes:true switch):

OAS non-required + nullable: C# nullable
OAS required + nullable: C# nullable
OAS non-required + non-nullable: C# non-nullable, unless combined with the /GenerateOptionalPropertiesAsNullable:true switch
OAS required + non-nullable: C# non-nullable

Additionally, NSwag adds [Newtonsoft.Json.JsonProperty(Required = ..., , NullValueHandling = ...)] on the generated properties, which results in client-side serialization errors (thus avoiding a server roundtrip on invalid input).

Example component schema

```json "exampleComponent": { "required": [ "requiredNonNullableReferenceType", "requiredNullableReferenceType", "requiredNullableValueType", "requiredValueType" ], "type": "object", "properties": { "nonNullableReferenceType": { "type": "string" }, "requiredNonNullableReferenceType": { "type": "string" }, "nullableReferenceType": { "type": "string", "nullable": true }, "requiredNullableReferenceType": { "type": "string", "nullable": true }, "valueType": { "type": "integer", "format": "int32" }, "requiredValueType": { "type": "integer", "format": "int32" }, "nullableValueType": { "type": "integer", "format": "int32", "nullable": true }, "requiredNullableValueType": { "type": "integer", "format": "int32", "nullable": true } }, "additionalProperties": false } ```

The motivation for this behavior in Kiota is provided here and here. I find it unfortunate that correct server implementations have to suffer from this.

Please upvote this issue if you'd like Kiota to respect the required/nullable usage in your OAS.

markm77 commented 9 months ago

Interesting that you should write about this. I'm currently moving away from AutoRest and have been evaluating Kiota and NSwag and this is the issue that is pushing me to NSwag (despite NSwag's lack of updates and catalogue of open issues). I recently watched the .NET video on Kiota and there are many things to like but unfortunately this issue is the blocker.

Basically the consequence of the current behaviour is that I would need to create a second set of models with corrected nullability which makes no sense. Or make copious manual edits to the auto-generated models.

PS: my preference would be for a "strict nullability" option which produces non-nullables for the case of "OAS required + non-nullable". Nullables for the case of "OAS non-required + non-nullable" are fine and even desirable (as null simply means property absent from serialised object).

baywet commented 9 months ago

Thanks @bkoelman for creating this focused issue and bringing everyone to the discussion table. This could represent a breaking change depending on how we design the change, I've put it to the v3 milestone for the time being, but let's keep the conversation going.

I'd like to explore a solution where we wouldn't need an additional switch on the CLI. Not only this keeps things simple to anybody using kiota, this also reduces the number of cases to implement, test for and support.

I like the solution proposed by @markm77 where, if I understood well, and assuming we don't have any switch, we'd only make as non-nullable properties/return types that are required and non-nullable in the description. This is a very specific case to implement, with fewer ramifications, and potentially non-breaking. Any thoughts on that?

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

baywet commented 9 months ago

Hi everyone, We've talked in depth about the nullable aspect of this during our planning meeting. I'm going to capture the notes here to help fuel the discussion.

on adding a switch

We're pushing back against that aspect, not only for the simplicity as I outlined, but also for the maintainability.

Here is a truth table considering the different variables, which already gives us 8 different permutations. Multiply this by the number of languages, we're effectively in the 50+ permutations to implement, test, and maintain, this would represent too much and we believe that's one of the reason why some other generators out there are struggling.

required	nullable	flag
0	0	0
1	0	0
0	1	0
1	1	0
0	0	1
1	0	1
0	1	1
1	1	1

on the breaking change aspect

While depending on the project setup for a dotnet developer, this change might or might not become a breaking change, it's clearly one for other languages like Java for instance. And we strive to only break people on major revisions, so this would have to be scheduled accordingly.

Here is an example, let's assume the foo property is boolean and non nullable today in the description. Today in Java the model would look like this.

class Model {
  Boolean foo;
}

and would change to that

class Model {
  boolean foo;
}

The change of capitalization of the type means we've effectively switched from a pointer type (true|false|null) to a value type (true|false) and the code depending on that property would need to change to some extent.

We could alleviate some of that by adding backward compatible code and making use of the ecb switch

class Model {
  @Deprecated
  Boolean foo;
  boolean getFoo() {
    if (foo == null) {
      throw;
   }
   return foo.Value;
  }
//...
}

But this gets ugly really quick.

on the serialization writer and the parsenode interfaces

This effectively means that all primitive methods in this interfaces need to be duplicated. Which would tax the implementers.

public interface ParseNode  {
    bool? GetBooleanValue();
        bool GetNonNullBooleanValue(); // this would need to be added
}

instead of doing that we could add static helpers in abstractions to reduce the burden on code generation and implementers

public static ParseNodeNonNullHelpers {
     public static bool GetNonNullBooleanValue(Func<bool?> getter) {
          if (getter() is bool result) return result;
          throw new InvalidOperationException("the value was null when it wasn't expected");
          // this is helpful since it helps with the validation of the incoming and outgoing payloads
    }
}

on microsoft graph descriptions

While investigating this we discovered the Microsoft Graph OpenAPI descriptions are inaccurate around that topic. They are being generated from CSDl by OpenAPI.net.OData, a library our team also owns. I've created two issues over there to help solve this. https://github.com/microsoft/OpenAPI.NET.OData/issues/466 https://github.com/microsoft/OpenAPI.NET.OData/issues/467

We still need to think further about the implications of required. But feedback on all these notes and the previous reply is most appreciated.

bkoelman commented 9 months ago

I like the solution proposed by @markm77 where, if I understood well, and assuming we don't have any switch, we'd only make as non-nullable properties/return types that are required and non-nullable in the description. This is a very specific case to implement, with fewer ramifications, and potentially non-breaking. Any thoughts on that?

Sounds good to me. This matches NSwag with /GenerateNullableReferenceTypes:true /GenerateOptionalPropertiesAsNullable:true, which is what I've seen being recommended in blog posts. I guess the second switch was introduced later because (some) developers didn't like the initial puristic approach.

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

This is where the Newtonsoft attributes come in. They drive how to (de)serialize, potentially throwing when expectations aren't met (for example, serializing a non-nullable reference-type property that is null). Kiota may need to introduce something similar, being taken into account during (de)serialization. In NSwag, a non-nullable ref/value-type property that contains its default value isn't serialized because of [Newtonsoft.Json.JsonProperty(Required = ..., , NullValueHandling = ...)] usage. Kiota can provide a superior experience when --backing-store is used, but that's an opt-in feature.

markm77 commented 9 months ago

Thanks for the detailed replies @baywet .

Assuming we go down that route, what would you expect during serialization/deserialization? if an object has a non-nullable property, the default value would be serialized when not set, potentially leading to side effects. If a payload doesn't contain the value for a non-nullable property, the default value would be set, potentially also leading to side effects?

Considering just the proposed change case of OAS required + non-nullable, my preference would be:

[de-serialisation] if a property is not present, a de-serialisation error results due to it being declared "required" to the de-serialiser (a received null would ideally also trigger an error)
[serialisation] the case of the property not being set is blocked by it being required in code (e.g. through the constructor enforcing this or by some other means such as the new C# required keyword). In no case should null or a default/unset value be serialised.

bkoelman commented 9 months ago

Related to this, NSwag goes beyond just scalar properties. I've seen cases where it generates:

public class Customer
{
    [Newtonsoft.Json.JsonProperty( ... )]
    public CustomerAddress Address { get; set; } = new CustomerAddress()
}

baywet commented 9 months ago

Thanks everyone for the input. I wouldn't design anything depending on the backing store. While the backing store is a nice feature in itself, it's orthogonal and optional.

@bkoelman I don't think you're suggesting taking a dependency on newtonsoft annotations, but I want to make it clear to other readers: we wouldn't go that way. Not only because our canonical JSON implementation relies on system.text.json instead, but also because it'd go against one of our key principles of having the generated outcome not depend on any specific serialization format/library.

As for erroring our on serialization/deserialization when the property is null/missing (regardless of whether it's scalar or not), I think we're on the same page, the helper methods design I eluded to in my earlier reply could also be used for serialization.

Thanks everyone for the feedback. Besides further discussions on required, I think we have a narrow enough improvement and a clear design for the nullability aspect.

Don't hesitate to further the discussion on the required aspect.

bkoelman commented 9 months ago

I don't think you're suggesting taking a dependency on newtonsoft annotations

That's true. It's just an implementation detail in NSwag to make Newtonsoft do the right thing. It would be fine if Kiota handled this differently. The point is that the serialization layer becomes aware somehow.

As for erroring our on serialization/deserialization when the property is null/missing (regardless of whether it's scalar or not)

I tried to express that NSwag allocates an empty instance on required/non-nullable non-scalar properties, though I did not explicitly mention that. Doing so is merely a convenience, but if Kiota is going to support that as well, it may impact how the backing store works (it still needs to track the assignment).

darrelmiller commented 8 months ago

The primary problem is that the use of required/nullable in OpenAPI is broken. We have discussed this extensively in the OpenAPI meetings and with the JSON Schema maintainers. Most people want to describe a single "type schema" for their CRUD operations, but OpenAPI does not provide the facility to vary the constraints of a JSON Schema depending if you are creating, reading or updating a resource.

What most people want the majority of the time is x-RequiredForCreate on a property. It is rare for a property to be required when doing a PATCH. It gets tricky on upserts because there is no way a client can validate whether a property is required because it doesn't know if the PUT/PATCH is going to do a create or an update. Also, to be pedantic, we don't actually ever know when a request is doing a create until you get a 201 back. The POST method doesn't guarantee anything is being created. Are there other scenarios where "required" is useful other than during creation?

Unfortunately, we are all trying to make the best of a bad situation. NSwag's approach to providing a bunch of switches so the user can choose what they want is one way to do this. As @baywet has said, we are trying really hard to avoid taking the route of adding 101 switches in order to maintain simplicity and reduce the potential for bugs.

Respecting non-nullable for required fields will work for some APIs, but it becomes quite problematic for APIs that allow projections on GET requests. If I do GET /patients?fields=firstName,LastName and the patient type has a non-nullable & required dateOfBirth field then we are going to run into problems on the client in strongly typed languages.

This problem does need to be fixed. But it needs to be fixed in OpenAPI/JSON Schema land before we can address it properly in generated clients. This is primary reason why we have taken a very loose approach to dealing with payload data constraints up to this point.

bkoelman commented 8 months ago

The primary problem is that the use of required/nullable in OpenAPI is broken.

I disagree. It is well-defined, just not so convenient for simplistic unstructured APIs, like the default Web API template in Visual Studio, where models are exchanged directly without adhering to any well-defined protocol. Developers often start out that way, only later realizing all the benefits of a structured approach such as JSON:API that takes away the guesswork from consumers. Using JSON:API (and probably GraphQL too) doesn't have any of the issues you mentioned. The problem that needs to be fixed is in Kiota (which doesn't respect the schema), not in OpenAPI itself. I'm not interested in your plans to incorporate into OpenAPI what other standards already provide. Professional APIs like Stripe and GitHub have overcome these concerns long ago, it's just unexperienced API developers running into these.

And yes, once a structure is involved, there are cases where required non-nullable occurs in PATCH request bodies, see https://jsonapi.org/format/#crud-updating. But also at many other places, like error objects being returned, identifier objects that must consist of type/id etc. The model properties you're referring to are just a small part of the payload.

Most people want to describe a single "type schema" for their CRUD operations

I'm not convinced of that. Do you have any evidence? We deliberately chose not to, which solves the post/patch and ?fields issues.

Today the nullability/required handling is superior in NSwag and Kiota feels clumsy in this regard. We don't need any switches in Kiota, just implement how NSwag works when both switches are on (which is what everybody uses in practice).

markm77 commented 8 months ago

Thanks @darrelmiller for your input.

Are there other scenarios where "required" is useful other than during creation?

Please remember we are talking about client-side models where personally the primary thing I want to "validate" is server responses (this is far more important to me than requests because I completely control those - although validation is useful there too).

I want to "fail fast" (ideally during deserialisation) if a server does not return a required non-nullable field in response to e.g. a POST or GET. I also want to have a non-nullable in my client-side model for such a field as the rest of my program operates on the assumption that the required non-nullable field has been supplied.

It strikes me also that the issues you mention can potentially be resolved by editing the OpenAPI document (e.g. creating an extra "type schema"). Before using an Open API client generator I usually need to do this to some extent anyway. But the problem here is incorrect nullability in the models produced which is basically impossible to fix without extensive work or duplicate models.

saithis commented 7 months ago

Most people want to describe a single "type schema" for their CRUD operations

Are you sure about that? Same as @bkoelman, we also deliberatelly chose to use different models. Even if they are 100% the same. We had one to many bugs, where a shared model got altered and broke another endpoint.

yasaichi commented 4 months ago

Any updates on this issue?

johncrim commented 4 months ago

After evaluating a number of alternatives, I chose Kiota for our client APIs and have been generally happy with it (thank you for your work on it!). In my experience, making all properties nullable is by far the biggest rough edge I've experienced with it. I've had to add many checks/assertions/branches that wouldn't be necessary, including having to check the OAS schema 10s to 100s of times to ensure that any given property is marked as required.

I would be very happy with generated C# that matches the required/nullable values in the schema. I'd also be happy with adding a switch/line in the lock file to enable it, if backcompat becomes a blocker.

haefele commented 3 months ago

Any updates on this? Having all model-properties be nullable makes my client code full of null-checks that I only need to satisfy the compiler.

AnderssonPeter commented 3 months ago

I'm not using NSwag, but Swashbuckle, i have a ISchemaFilter that adds properties to the Required array (source https://stackoverflow.com/a/68987970)

public class RequireNonNullablePropertiesSchemaFilter : ISchemaFilter
{
    public void Apply(OpenApiSchema model, SchemaFilterContext context)
    {
        var additionalRequiredProps = model.Properties
            .Where(x => !x.Value.Nullable && !model.Required.Contains(x.Key))
            .Select(x => x.Key);
        foreach (var propKey in additionalRequiredProps)
        {
            model.Required.Add(propKey);
        }
    }
}

This in turn it generates the following:

"TestResponse": {
    "additionalProperties": false,
    "properties": {
        "isValid": {
            "type": "boolean"
        },
        "userId": {
            "type": "string"
        }
    },
    "required": [
        "isValid",
        "userId"
    ],
    "type": "object"
}

But kiota is still generating nullable types when i generate a C# client, is this a bug in Kiota or do i need to specify that they aren't nullable some other way?

johncrim commented 3 months ago

@AnderssonPeter - that's this bug in Kiota - pls upvote, and ask anyone else affected to upvote it.

IkeOTL commented 3 months ago

My code should look like this:

But it looks like this due to this issue. Hideous!:

StefanOverHaevgRZ commented 2 months ago

@baywet I see this is planned for 2.0, as pointed out previously this will be a breaking change for most if not all languages. I also wanted to add that this applies to collection types: List<int> is generated as List<int?>?. From the previous timeline I guess 2.0 won't be ready as RC this year?

baywet commented 2 months ago

Yes it would apply to all the types. We don't have any plans (even internally) with timelines for a v2 at this point. We'll share more once we have a better idea to get the community's input. Right now TypeScript is the highest priority from a client generation perspective.

ThomasWillumsen commented 1 month ago

Kiota is next to useless for me when it doesnt honor nullable vs required properties.

Edit: We ended up staying with NSwag for now

MadL1me commented 1 month ago

Any updates on the issue?

We'll share more once we have a better idea to get the community's input. Right now TypeScript is the highest priority from a client generation perspective.

There are almost 70 likes on this disscussion, it's obviously a blocker for many people. I'm pretty sure it is the most required feature for users of the library right know (based on opened issues).

martin-hirsch commented 1 month ago

I 71st this. 👍

ldeluigi commented 1 month ago

On python this becomes even worse as it doesn't have the ! operator. The code becomes filled with not None assertions.

jacksoncougar commented 3 weeks ago

Was evaluating Kiota and this bug is a blocker to choosing it

adamvandenhoven commented 2 weeks ago

Nullability seems to be a particular pain point for a lot of people. I share the concerns and complaints. I also understand the issues around serialization.

I have the inkling of a solution that may help alleviate some of the difficulty.

I take it for granted that there is nothing inherent that makes generating classes that respect nullability. What if kiota generated both variants?

Leave the current existing IParsable models in Models and generate bare models in UnparsableModels (names suck) that honor the nullable property (and whatever else). Then kiota can generate something like:

namespace ApiClient.UnparsableModels {
  public partial class Model {
    public Model (){
      id = Guid.Empty; // or the value defined in the schema as the default value.
      name = string.Empty;
    }
    public Model (ApiClient.Models.Model model) {
      id = model.id ?? Guid.Empty; // or the value defined in the schema as the default value.
      name = model.name ?? string.Empty;
    }
    public Guid id {get; set;}
    public string name {get; set;}
    public static implicit operator Model(ApiClient.Models.Model model) => new Model(model);  
  }
}

namespace ApiClient.Models {
  public partial class Model : IParsable {
    public Model (ApiClient.UnparsableModels.Model model) {
       id = model.id;
       name = model.name;
    }
    public Guid? id {get; set;}
    public string? name {get; set;}

    public static implicit operator Model(ApiClient.UnparsableModels.Model model) => new Model(model);
  }
}

In languages without operator overloading like this, you could define FromUnparsableModel and FromModel methods or something that works for those languages.

It may be necessary to include a way that I as the developer can define how potential null values in the Model are handled when the schema's default is either not helpful or missing. Things like a Date doesn't have an obvious choice if its null and shouldn't be, and the default value may just be used as an example to populate the swagger ui with something that works.

I don't know how much this helps languages other than C# and I'm certain there is something I'm not taking into account but it seems like a good start to a solution that will satisfy since kiota ends up doing the repetitive parts at the expense of extra code that we'd need to write anyways as developers in the current situation.

microsoft / kiota