json-api-dotnet / JsonApiDotNetCore

A framework for building JSON:API compliant REST APIs using ASP.NET and Entity Framework Core.
https://www.jsonapi.net
MIT License
672 stars 160 forks source link

Explore usage of Microsoft.AspNetCore.OpenApi #1591

Open bkoelman opened 1 month ago

bkoelman commented 1 month ago

@bkoelman Following up on the comment that you left in the issue on the ASP.NET Core repo here.

The Microsoft.AspNetCore.OpenApi implementation has evolved to cover some of the scenarios that you listed originally including:

The biggest gap I noticed is around your schema generation-based customizations. Our implementation currently relies on the new JsonSchemaExporter APIs in System.Text.Json (.NET 9 Preview 6) and that layer doesn't have as many of the customizability options that Swashbuckle has. That being said, I'd be curious to understand the details of some of the items you mentioned in your bullet list from the comment above:

Custom handling of OpenAPI inheritance with allOf and discriminator that maps to JSON:API type

Do you use allOf to model inheritance hierarchies in your documents or is that something that you disable? This isn't an option that we currently support in M.A.OpenApi, primarily because its iffy whether or not allOf can be used to accuratnely model inheritance hierarchies in JSON schema.

What discriminators are you referring to in the statement above? At the moment, our implementation supports STJ's polymorphism attributes by default. Does your library provide a custom set?

For the JSON:API type, we generate a single-valued enum, so callers don't need to set a magic string

I assume you're doing this to work around the limitation in OpenAPI v3 that the JSON Schema const keyword is not supported? If so, I don't really have a good solution for that since our system has to comply with a similar constraint. We do expose a schema transformers concept (similar to schema filters) that can help here.

In JSON:API, sending null means to clear something, whereas not sending it means to leave it unchanged. So we need to handle OpenAPI required/nullable manually, ie [Required] on a property must not translate to non-nullable

Requiredness/nullability are modeled independently in our system so you should be able to model required properties that can be null if need be.

But to a more general point, I think your statement here:

And because Swashbuckle is now under active maintenance again, I suspect to finish our OpenAPI integration based on that. Because it is mature and stable, and our extensibility needs go way beyond what most other projects need.

is prudent and wise. It seems like you've done a lot of work in the area already building our Swashbuckle's model and I wouldn't through that away. In the future, it would be interesting to explore if you can extend this to support M.A.OpenAPI as well, and perhaps go even further and recommend some API changes to the underlying ApiExplorer metadata layer that both Swashbuckle/M.A.OpenAPI use for document generation to make your framework more resilient to whatever API document spec is supported.

I hope this information was helpful! If you have any inquiries about this work (particularly in regard to the more ASP.NET Core-related aspects), I'm happy to share insights.

Originally posted by @captainsafia in https://github.com/json-api-dotnet/JsonApiDotNetCore/issues/1046#issuecomment-2224206219

bkoelman commented 1 month ago

The Microsoft.AspNetCore.OpenApi implementation has evolved to cover some of the scenarios that you listed originally including...

These are great, thanks. The last one I don't understand, can you clarify? In addition to these, we'd need control over:

See here for details.

The biggest gap I noticed is around your schema generation-based customizations...

Yes, that's a big blocker. OpenApiSchemaService.GetOrCreateSchemaAsync is internal, we need to be able to override this. Not everything can be done afterward from options.SchemaTransformers. Aside from that, our code would become clumsy (reverse-engineer what this method does, then write code to revert that). Instead, we need an extensibility point that enables us to produce the OpenApiSchema ourselves. In there, we need access to the schema store built so far. We need to be able to look up by System.Type and by schema ID. We need to be able to add/remove/replace schemas in that store, because we have a use case that temporarily generates a schema, which is used as a template and then gets deleted (this isn't a global one, it occurs multiple times, see here). Our override needs to be able to call into the default generator for sub-schemas. Something like:

Task<OpenApiSchema> IOpenApiSchemaFactory.GetOrCreateSchemaAsync(Type, ApiParameterDescription?, OpenApiSchemaStore)

Assuming ASP.NET ships with implementation DefaultOpenApiSchemaFactory, we'd use:

services.AddSingleton<DefaultOpenApiSchemaFactory>();
services.AddSingleton<IOpenApiSchemaFactory, CustomOpenApiSchemaFactory>();

and inject DefaultOpenApiSchemaFactory into CustomOpenApiSchemaFactory so we can call into the default generator for subtrees. Then CustomOpenApiSchemaFactory can inspect the incoming System.Type and choose a strategy (sometimes using JsonSchemaExporter, which should publicly return JsonSchema instead of JsonNode). And there should be something provided to convert from JsonSchema to OpenApiSchema.

Oh, and OpenApiSchemaStore shouldn't be a singleton, because the output schema may vary depending on runtime context; OpenApiDocumentService.GetOpenApiDocumentAsync should create one and pass it through the pipeline of paths/operations/requests/responses etc. For example, a user authenticated as "admin" would get a richer openapi.json file. We have a case where we "abuse" the schema store to cache expensive lookups during generation, see here.

The other big challenge is moving lots of logic from schema generators to the ApiDescription level. This would be an improvement, because other tools like VS Endpoint Explorer and JetBrains Rider would also benefit from that. It's just a lot of work. And we'd need to find another place to store temporary state (it can't be HttpContext, because generation may run from MSBuild).

Do you use allOf to model inheritance hierarchies in your documents or is that something that you disable?

I suspect https://github.com/dotnet/aspnetcore/issues/56305#issuecomment-2225373840 answers your question. To get an impression of the openapi.json we generate, see the sample here. operationDiscriminator is that file is another case where we rely on inheritance (used for batch requests, spec at https://jsonapi.org/ext/atomic/).

What discriminators are you referring to in the statement above? At the moment, our implementation supports STJ's polymorphism attributes by default. Does your library provide a custom set?

We can't use attributes, the set of derived types depends on the runtime model. See here.

I assume you're doing this to work around the limitation in OpenAPI v3 that the JSON Schema const keyword is not supported?

No, this is unrelated to const. We do this because JSON:API requires to send "type": "people" as its discriminator. But we don't want to expose this magic string "people" to users, because it's an implementation detail of the wire format. By generating a single-valued enum property that is required and non-nullable, the compiler fills this in automatically. For example (using NSwag):

var updatePersonRequest = new UpdatePersonRequestDocument
{
    Data = new DataInUpdatePersonRequest
    {
        // omitted: Type = PersonResourceType.People,
        Id = "1",
        Attributes = new AttributesInUpdatePersonRequest
        {
            LastName = "Doe"
        }
    }
};

var response = await _apiClient.PatchPersonAsync(updatePersonRequest.Data.Id, updatePersonRequest);

This works because NSwag generates the following:

public enum PersonResourceType
{
    [System.Runtime.Serialization.EnumMember(Value = @"people")]
    People = 0,
}

public partial class DataInUpdatePersonRequest
{
    [Newtonsoft.Json.JsonProperty("type", Required = Newtonsoft.Json.Required.Always)]
    [System.ComponentModel.DataAnnotations.Required(AllowEmptyStrings = true)]
    [Newtonsoft.Json.JsonConverter(typeof(Newtonsoft.Json.Converters.StringEnumConverter))]
    public PersonResourceType Type { get; set; } = default!;

    // ...
}

Requiredness/nullability are modeled independently in our system so you should be able to model required properties that can be null if need be.

We have a case where it's impossible to determine upfront whether the property is nullable, see here, handled here. So we'll still need to be able to override that from code. Perhaps it can be worked around by having an additional UpdateNullableToOneRelationshipOperation<T>, but it wasn't worth the effort with Swashbuckle.

is prudent and wise. It seems like you've done a lot of work in the area already building our Swashbuckle's model and I wouldn't through that away. In the future...

Thanks for understanding. The thing is our team is very limited on resources. Everything is done in spare time. My current priority is finishing OpenAPI support, so we can release a new major version near the end of the year and drop .NET 6. I'd love to spend more time on M.A.OpenAPI, but it already comes at the cost of not being able to support the latest C# and EF Core features. These are hard tradeoffs, and nothing is set in stone. I think we'll release as JsonApiDotNetCore.OpenApi.Swashbuckle, so we can put another package aside later. For now, there's two big OpenAPI challenges left: 1. Support resource inheritance (ie model Car derives from concrete base type MotorVehicle, which derives from abstract type Vehicle, see here for details). 2. Support extensibility. Users can already plug in custom code at various places, for example to hide a field based on the time of day. We'll need to provide similar extensibility points for OpenAPI to match up their custom business logic.

I've taken a look at the preview 6 code, some remarks below.

  1. Semi-hardcoded defaults

    • OpenApiDocumentService.GetResponseAsync calls ReasonPhrases.GetReasonPhrase(statusCode) to determine description. This makes sense as a default, but we need it to be overridable. The dynamic description we generate lists possible causes and how to fix them (depending on which features are turned on).
    • OpenApiDocumentService.GetParameterDescriptionFromAttribute: We can't use DescriptionAttribute on types, because this information is determined based on path/verb. Some examples for the 'id' parameter:
      • "The identifier of the tag to retrieve."
      • "The identifier of the person whose related todoItems to retrieve."
      • "The identifier of the tag whose related todoItem identities to retrieve."
      • "The identifier of the person whose assignedTodoItems relationship to assign."
      • "The identifier of the person to add todoItems to."
      • "The identifier of the todoItem to delete."
    • Likewise, the description for the Query parameter lists the allowed features, based on endpoint and configured options. And the description for the header parameter lists which headers can be sent/received, based on the endpoint.
  2. OpenApiDocumentService.IsRequired

    • Should possibly also take BindRequiredAttribute into account?
    • And then there's MvcOptions.SuppressImplicitRequiredAttributeForNonNullableReferenceTypes and DataAnnotationsModelValidatorProvider.AddImplicitRequiredAttributeForValueTypes that may influence this.

Hope this helps.

bkoelman commented 1 month ago

Regarding allOf vs anyOf usage for inheritance: I discovered that Swashbuckle changed its default from allOf to anyOf not too long ago. However neither NSwag nor kiota function properly with anyOf, last time I checked. That's why I chose to use allOf. It's taken me days to get anything to work, because Swashbuckle allOf was completely broken at the time.

captainsafia commented 1 month ago

Yes, that's a big blocker. OpenApiSchemaService.GetOrCreateSchemaAsync is internal, we need to be able to override this. Not everything can be done afterward from options.SchemaTransformers.

Yep, this point is heard. And it's come up in other scenarios as well. At the moment, I'm not super eager to expose too much of M.A.OpenAPI's schema generation pipeline, particularly as we look towards a future where JSON schema is more unified in the OpenAPI specification (v3.1 and v4). In the long-term, I'd hope that most of these customizations are actually exposed at the lower level by System.Text.Json and it's JsonSchemaExporter APIs -- that way schemas for types are consistent across all implementations (validators, generators, etc.) and not just OpenAPI. With that in mind...

Instead, we need an extensibility point that enables us to produce the OpenApiSchema ourselves. In there, we need access to the schema store built so far. We need to be able to look up by System.Type and by schema ID. We need to be able to add/remove/replace schemas in that store, because we have a use case that temporarily generates a schema, which is used as a template and then gets deleted (this isn't a global one, it occurs multiple times, see here).

...out of curiousity, what does your schema generation do that existing implementations don't? Is this primarily about encoding information that goes beyond how System.Text.Json serializes things?

There's been discussion about exposing some sort of data-contract resolver layer in System.Text.Json where types could describe how their schemas should look. Would something like that work for you?

Oh, and OpenApiSchemaStore shouldn't be a singleton, because the output schema may vary depending on runtime context;

Hmmmm....I'd expect schemas to be an accurate reflection of the binding/serialization rules of an API. Is the most common scenario for this "hiding" priveliged schemas from non-admin users or are the mutations much more dynamic? How does this come into play if users are trying to feed the documents into spec testing/client gen tools? Do you always have to carry a user persona with the document you generate?

These are hard tradeoffs, and nothing is set in stone. I think we'll release as JsonApiDotNetCore.OpenApi.Swashbuckle, so we can put another package aside later.

This seems like a good choice. It also makes it clear what implementation you're taking a dependency on for those who care.

As I read more and more of the details of your implementation, I am inclined to think that exploring an implementation that leans more into ApiExplorer might be good to explore after you've completed the integration work with Swashbuckle.

OpenApiDocumentService.GetResponseAsync calls ReasonPhrases.GetReasonPhrase(statusCode) to determine description.

Operation transformers should help with the massaging here. There's also been disucssion about adding a Description property to IProducesResponseTypeMetadata to support customizations that will be reflected at the ApiExplorer metadata.

OpenApiDocumentService.GetParameterDescriptionFromAttribute: We can't use DescriptionAttribute on types, because this information is determined based on path/verb. Some examples for the 'id' parameter:

The DescriptionAttribute is one approach we support. In the future, we'll also support descriptions from XML documents. It seems like for your implementation you generate descriptions programmatically based on the schema? Similar to the previous point, operation transformers (like Swashbuckle's operation filters) should help here.

It might also be interesting to explore if we should add a property to ApiExplorer's ApiParameterDescription to help with this.

Should possibly also take BindRequiredAttribute into account?

ApiExplorer captures MVC's [BindRequired] attribute into the metadata representation (ApiParameterDescription.IsRequired) so we should be covered there.

And then there's MvcOptions.SuppressImplicitRequiredAttributeForNonNullableReferenceTypes and DataAnnotationsModelValidatorProvider.AddImplicitRequiredAttributeForValueTypes that may influence this.

Similar to the above, the ApiExplorer metadata layer should map these correctly into the ApiParameterDescription that is consumed by the OpenAPI layer, but I'll admit that I have to confirm this myself.

I discovered that Swashbuckle changed its default from allOf to anyOf not too long ago.

The PR that you linked seems to indicate that Swashbuckle is still using allOf to model inheritance. FWIW, our current implementation doesn't use allOf to model inheritance, primarily because STJ doesn't. Does your system take a dependency on allOf in some way?

bkoelman commented 1 month ago

At the moment, I'm not super eager to expose too much of M.A.OpenAPI's schema generation pipeline, particularly as we look towards a future where JSON schema is more unified in the OpenAPI specification (v3.1 and v4). In the long-term, I'd hope that most of these customizations are actually exposed at the lower level by System.Text.Json and it's JsonSchemaExporter APIs

I understand pushback on opening things up too early, which would require breaking changes later. However, I'm not convinced using JsonConverter.GetSchema will help us much, other than for primitive types (inline schemas). Personally, I avoid writing STJ converters as much as possible because custom non-primitive converters cannot report the correct error location. But more importantly, we'd need stateful converters (depending on the endpoint) with injected sub-generators. And a converter can't access parent or sibling schemas, which we'll definitely need. So exposing a schema store would be unavoidable.

...out of curiousity, what does your schema generation do that existing implementations don't? Is this primarily about encoding information that goes beyond how System.Text.Json serializes things?

In short: adapt the schema to make it JSON:API compliant (see https://jsonapi.org/ for an example), based on the runtime entity model and startup configuration. For example, JSON:API uses HATEOAS links (which obviously don't exist in entity models). Related entities come at the end of the response in an array, instead of inline. Related data is exposed as either a single object or an array, depending on whether its parent is a to-one or to-many relationship. There are specific rules about the meaning of null versus omitting something. Empty arrays aren't permitted, the outer element must be omitted instead. This is just off the top of my head, there must be hundreds of reasons why we can't just simply create a matching action method. I estimate we can move about 10% of our schema generation to ApiExplorer rewriting, even less to JsonConverter.GetSchema. The problem is that EF Core model classes and action methods are fundamentally differently structured than the JSON:API shape. That's why we need massive intervention at the root level, not minor tweaks.

There's been discussion about exposing some sort of data-contract resolver layer in System.Text.Json where types could describe how their schemas should look. Would something like that work for you?

Marginally. STJ is optimized for throughput and OAT, producing subtrees that are independent of their parents. In contrast, OpenAPI generation depends on runtime information and component schemas reference each other (even recursively, in the case of allOf inheritance). JSON schema typically uses oneOf for discriminated unions, which isn't available in C#. I've tried with NSwag, it just picks the first schema, ignoring the others. The fundamental mismatch between OOO (describe what's allowed) versus JSON schema (describe constraints) is somewhat eased by OpenAPI, for example by providing allOf inheritance with discriminators. There are often multiple ways to express something in JSON schema, so unifying the language constructs doesn't mean that some JSON schema would always resemble the schema that best fits OpenAPI.

Hmmmm....I'd expect schemas to be an accurate reflection of the binding/serialization rules of an API. Is the most common scenario for this "hiding" priveliged schemas from non-admin users or are the mutations much more dynamic? How does this come into play if users are trying to feed the documents into spec testing/client gen tools? Do you always have to carry a user persona with the document you generate?

The mutations are much more dynamic, because they originate from user code through our extensibility points. As far as I'm aware, it's possible in ASP.NET to add controllers at runtime (we don't use that), so I'd assume (haven't checked) that the information in ApiExplorer must be dynamic as well. But more importantly, a singleton schema store isn't thread-safe: if two concurrent requests render the openapi.json document, they'll influence each other. Our schema generators are singletons, yet they manipulate the passed-through schema store. It's not possible to add a transient dependency to a singleton, so where to store intermediate state then? HttpContext isn't always available, the current thread isn't safe with async/await, so use an AsyncLocal then? I'd imagine many developers will do it wrong. The great thing of passing the store is that our tests don't need to recreate the service collection all the time. In production, an output cache in front of the openapi endpoint could be used, although the Swashbuckle generation is extremely fast, compared to a regular API call that involves a database lookup.

As I read more and more of the details of your implementation, I am inclined to think that exploring an implementation that leans more into ApiExplorer might be good to explore after you've completed the integration work with Swashbuckle.

I agree, but it wouldn't help that much. We can prepare things like documentation that would otherwise be read from an attribute, but the bulk of the work needs to be done during schema generation. One case I haven't mentioned: we create a temporary schema store (not exposed to Swashbuckle) that we use internally to reuse (variants of) reference schemas. This works because the schema store holds just a simple dictionary, there are no parent/child object references.

The DescriptionAttribute is one approach we support. In the future, we'll also support descriptions from XML documents. It seems like for your implementation you generate descriptions programmatically based on the schema? Similar to the previous point, operation transformers (like Swashbuckle's operation filters) should help here.

We also use descriptions from XML comments (again, not in the regular way, because the object structure differs, so we parse the assembly.xml ourselves). The documentation I referred to doesn't exist in user code (ASP.NET controllers use a generic type parameter for id in a base class), so we generate boilerplate documentation. It would be possible to enable our users to adapt it, but they'd need to override all controller methods, which is unlikely.

You're right these can be solved using operation transformers. Additional properties to store these in ApiExplorer would be greatly appreciated.

ApiExplorer captures MVC's [BindRequired] attribute into the metadata representation (ApiParameterDescription.IsRequired) so we should be covered there. Similar to the above, the ApiExplorer metadata layer should map these correctly into the ApiParameterDescription that is consumed by the OpenAPI layer, but I'll admit that I have to confirm this myself.

I don't think that is correct. The information is not obtained from ApiExplorer, the code literally scans for the presence of RequiredAttribute here.

The PR that you linked seems to indicate that Swashbuckle is still using allOf to model inheritance. FWIW, our current implementation doesn't use allOf to model inheritance, primarily because STJ doesn't. Does your system take a dependency on allOf in some way?

Swashbuckle supports both, see the docs. None of the client generators I've tried work with oneOf inheritance. NSwag doesn't support it, and neither does kiota. Are you aware of any that do?