dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.

[9.0-preview7] OpenAPI schemas are not stable between generations #56990

Closed martincostello closed 1 month ago

martincostello commented 1 month ago

Is there an existing issue for this?

Describe the bug

Similar to #56541, I've found a case where the generated OpenAPI schema changes when requested a second time.

The name of the schema associated with an operation changes by having a 2 appended to it, and then a copy of the existing schema is added to the components under that new name.

A diff of the change can be seen here: https://github.com/martincostello/aspnetcore-openapi/commit/966423f1cbcbcd5664f49063a89d4dd20b6a54ab
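As an illustration of the symptom (the schema names below are invented, not taken from the linked diff), the second generation leaves an operation pointing at a suffixed name while a structurally identical duplicate appears in the components:

```python
# Illustration only -- "Todo" is a hypothetical schema name.
# First generation of the document:
first = {
    "components": {"schemas": {"Todo": {"type": "object"}}},
    "ref": "#/components/schemas/Todo",
}

# Second generation: the operation's reference gains a "2" suffix and a
# copy of the same schema is added under the new name.
second = {
    "components": {
        "schemas": {
            "Todo": {"type": "object"},
            "Todo2": {"type": "object"},  # duplicate of "Todo"
        }
    },
    "ref": "#/components/schemas/Todo2",
}

# The two entries are structurally identical -- only the name differs.
assert second["components"]["schemas"]["Todo"] == second["components"]["schemas"]["Todo2"]
```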

Expected Behavior

The OpenAPI document generated does not vary over the lifetime of the application.

Steps To Reproduce

  1. Clone https://github.com/martincostello/aspnetcore-openapi/commit/939259ef5591eb6b3b467248fe95479e42a9a937
  2. Run build.ps1 in the root of the repository
  3. Schema_Is_Correct() will fail for the Microsoft.AspNetCore.OpenApi generated document if the schema has already been requested by another test

Exceptions (if any)

No response

.NET Version

9.0.100-preview.7.24373.2

Anything else?

No response

martincostello commented 1 month ago

There's also quite a funky variant of this going on in the sample app in this new repo I'm working on: https://github.com/martincostello/openapi-extensions/tree/main/samples/TodoApp

Refresh a bunch of times and suddenly... 😅


My guess is this has something to do with the fact that I'm running transformations on the schemas (to add documentation and examples) which is busting some caching or something.
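A minimal sketch of that suspected failure mode, in Python with invented names (this is not the actual ASP.NET Core code): if a long-lived name store disambiguates colliding registrations by appending a counter, and a transformer mutates a schema so it no longer compares equal to the cached copy, the store grows on every request:

```python
# Sketch of the suspected failure mode, not the real implementation.
class SchemaStore:
    def __init__(self):
        self.schemas = {}  # name -> schema, shared across requests

    def register(self, name, schema):
        # Re-registering an identical schema is a no-op, but if equality
        # fails (e.g. the cached instance was mutated by a transformer),
        # the store disambiguates by appending a counter to the name.
        if name in self.schemas and self.schemas[name] != schema:
            suffix = 2
            while f"{name}{suffix}" in self.schemas:
                suffix += 1
            name = f"{name}{suffix}"
        self.schemas[name] = schema
        return name

store = SchemaStore()
name1 = store.register("TodoListViewModel", {"type": "object"})
# A transformer changed the schema before the next request was served...
mutated = {"type": "object", "description": "added by transformer"}
name2 = store.register("TodoListViewModel", mutated)
# name2 is "TodoListViewModel2"; each differing request adds another entry.
```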

captainsafia commented 1 month ago

This one is... interesting 😅 I tried to get a minimal repro going by removing elements in the full sample app that might be impacting this scenario:

So, there's no minimal repro quite yet. I'll need to log into my DevBox to debug into the OpenAPI package with symbols to get a sense of what might be going on here.

martincostello commented 1 month ago

FYI this is still happening with the latest nightly builds of RC.1.

This looks like it might be a slow denial of service: some benchmarks using one of my OpenAPI repos (see https://github.com/martincostello/openapi-extensions/commit/c3e2cb346f5d4b80204f676ebfab6613b20fe272) show that the responses get slower and slower as the number of schemas grows.

WorkloadWarmup   1: 128 op, 1173016700.00 ns, 9.1642 ms/op
WorkloadWarmup   2: 128 op, 1335018100.00 ns, 10.4298 ms/op
WorkloadWarmup   3: 128 op, 1645448600.00 ns, 12.8551 ms/op
WorkloadWarmup   4: 128 op, 1465592100.00 ns, 11.4499 ms/op
WorkloadWarmup   5: 128 op, 1372219400.00 ns, 10.7205 ms/op
WorkloadWarmup   6: 128 op, 1381336200.00 ns, 10.7917 ms/op
WorkloadWarmup   7: 128 op, 1586216500.00 ns, 12.3923 ms/op
WorkloadWarmup   8: 128 op, 1711141400.00 ns, 13.3683 ms/op
WorkloadWarmup   9: 128 op, 1944725400.00 ns, 15.1932 ms/op
WorkloadWarmup  10: 128 op, 2236092300.00 ns, 17.4695 ms/op
WorkloadWarmup  11: 128 op, 2328593500.00 ns, 18.1921 ms/op
WorkloadWarmup  12: 128 op, 2507007000.00 ns, 19.5860 ms/op
WorkloadWarmup  13: 128 op, 2642792200.00 ns, 20.6468 ms/op
WorkloadWarmup  14: 128 op, 3247228100.00 ns, 25.3690 ms/op
WorkloadWarmup  15: 128 op, 3470480000.00 ns, 27.1131 ms/op
WorkloadWarmup  16: 128 op, 3666365700.00 ns, 28.6435 ms/op
WorkloadWarmup  17: 128 op, 4108643800.00 ns, 32.0988 ms/op
WorkloadWarmup  18: 128 op, 4259414600.00 ns, 33.2767 ms/op
WorkloadWarmup  19: 128 op, 4884142700.00 ns, 38.1574 ms/op
WorkloadWarmup  20: 128 op, 5182543800.00 ns, 40.4886 ms/op
WorkloadWarmup  21: 128 op, 5474030100.00 ns, 42.7659 ms/op
WorkloadWarmup  22: 128 op, 5551239800.00 ns, 43.3691 ms/op
WorkloadWarmup  23: 128 op, 6060406400.00 ns, 47.3469 ms/op
WorkloadWarmup  24: 128 op, 6811727800.00 ns, 53.2166 ms/op
WorkloadWarmup  25: 128 op, 7859329800.00 ns, 61.4010 ms/op
WorkloadWarmup  26: 128 op, 9562745500.00 ns, 74.7089 ms/op
WorkloadWarmup  27: 128 op, 9492063500.00 ns, 74.1567 ms/op

I haven't tested it to destruction yet to see what happens...

martincostello commented 1 month ago

I had a go at testing it to destruction - interestingly, rather than reaching an OOM condition, it starts throwing exceptions after enough requests:


fail: Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware[1]
      An unhandled exception has occurred while executing the request.
      System.ArgumentException: An item with the same key has already been added. Key: TodoListViewModel7096
         at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
         at Microsoft.AspNetCore.OpenApi.OpenApiSchemaReferenceTransformer.TransformAsync(OpenApiDocument document, OpenApiDocumentTransformerContext context, CancellationToken cancellationToken)
         at Microsoft.AspNetCore.OpenApi.OpenApiDocumentService.ApplyTransformersAsync(OpenApiDocument document, CancellationToken cancellationToken)
         at Microsoft.AspNetCore.OpenApi.OpenApiDocumentService.GetOpenApiDocumentAsync(CancellationToken cancellationToken)
         at Microsoft.AspNetCore.Builder.OpenApiEndpointRouteBuilderExtensions.<>c__DisplayClass0_0.<<MapOpenApi>b__0>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.AspNetCore.Http.Generated.<GeneratedRouteBuilderExtensions_g>F56B68D2B55B5B7B373BA2E4796D897848BC0F04A969B1AF6260183E8B9E0BAF2__GeneratedRouteBuilderExtensionsCore.<>c__DisplayClass2_0.<<MapGet0>g__RequestHandler|5>d.MoveNext()
      --- End of stack trace from previous location ---
         at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
         at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddlewareImpl.Invoke(HttpContext context)
captainsafia commented 1 month ago

@martincostello Thanks for sharing these updates! I ultimately was able to attach a debugger to your sample app and identify the core issue: it's a variant of https://github.com/dotnet/aspnetcore/issues/56541 that happens because the copy-constructor used here doesn't perform a deep copy for deeply nested schemas (in this case a sub-schema of items within properties).

The more I noodle on this, the more I think we don't get a ton of value at the moment from the schema store being a singleton-per-document. Particularly because we expect it to be idempotent between requests most of the time (your schemas don't change between documents) and we're not exposing it as a public API.
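The shallow-copy hazard described above can be sketched in Python, with plain dicts standing in for `OpenApiSchema` instances (property names are illustrative): a shallow copy still shares the nested sub-schemas with the cached original, so mutating one through the copy pollutes the cache, while a deep copy does not.

```python
import copy

# Dicts standing in for OpenApiSchema; names are illustrative.
cached = {
    "type": "object",
    "properties": {
        "items": {"items": {"type": "string"}},  # deeply nested sub-schema
    },
}

# A shallow copy shares the nested dicts with the cached original...
shallow = dict(cached)
shallow["properties"]["items"]["items"]["type"] = "integer"
assert cached["properties"]["items"]["items"]["type"] == "integer"  # cache polluted!

# ...while a deep copy leaves the cached schema untouched.
cached["properties"]["items"]["items"]["type"] = "string"  # reset
deep = copy.deepcopy(cached)
deep["properties"]["items"]["items"]["type"] = "integer"
assert cached["properties"]["items"]["items"]["type"] == "string"
```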

martincostello commented 1 month ago

So is the fix here to just create the registry per request? I didn't quite follow the consequences of the second paragraph.

Once there's a fix for this in main I'll look at re-running my comparison benchmarks, as the number of things to transform getting larger on every request is skewing the numbers 😅

captainsafia commented 1 month ago

So is the fix here to just create the registry per request? I didn't quite follow the consequences of the second paragraph.

Ignore my second paragraph! 😅 The only thing we need to do here is implement deep copies when we map the in-memory schema store cache to the top-level components.schemas property in the document. I tested the changeset in https://github.com/dotnet/aspnetcore/pull/57223 on both martincostello/aspnetcore-openapi and martincostello/TodoApi and I am getting the right behavior here.
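That fix can be sketched as follows (again in Python, with dicts standing in for `OpenApiSchema` and an invented schema name): deep-copy each cached schema when projecting the store into components.schemas, so later transformer mutations can never leak back into the shared cache and generation stays stable across requests.

```python
import copy

def build_components(schema_store):
    # Deep-copy on the way out so transformers that mutate the document's
    # schemas cannot alter the shared, cached instances.
    return {name: copy.deepcopy(schema) for name, schema in schema_store.items()}

store = {
    "TodoListViewModel": {"type": "object", "properties": {"items": {"type": "array"}}},
}

doc1 = build_components(store)
doc1["TodoListViewModel"]["properties"]["items"]["x-example"] = ["a"]  # transformer mutation

doc2 = build_components(store)
# The second document matches the first pre-mutation: generation is stable.
assert "x-example" not in doc2["TodoListViewModel"]["properties"]["items"]
```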

martincostello commented 1 month ago

Once there's a fix for this in main I'll look at re-running my comparison benchmarks

https://github.com/dotnet/aspnetcore/issues/57211#issuecomment-2277348167