OData / WebApi

OData Web API: A server library built upon ODataLib and WebApi
https://docs.microsoft.com/odata
Other
854 stars 476 forks source link

InvalidOperationException thrown from CreateRequestContainer cause performance issue #2832

Open fannydengdeng opened 5 months ago

fannydengdeng commented 5 months ago

As brought up by #1616, CreateRequestContainer throws an InvalidOperationException when it's called multiple times. In the case of multiple routes being matched by the RouteConstraintMatcher for multiple ApiVersions, the exception here gets thrown if a higher version is requested, and it eventually gets swallowed. Even though it doesn't cause the request to fail, the problem is we have found it has caused an increased in lock contention and CPU usage in one of our high traffic services due to excess cost of the exceptions being thrown.

Assemblies affected

Microsoft.AspNetCore.OData 7.5.17

Reproduce steps

  1. Configure multiple routes using the ApiVersioning library, for example: https://github.com/dotnet/aspnet-api-versioning/blob/v4.1.0/samples/aspnetcore/ODataBasicSample/Configuration/PersonModelConfiguration.cs
  2. Set a breakpoint at the location where the InvalidOperationException is thrown
  3. Make a request to the service that requests the higher version
  4. Expect CreateRequestContainer to be invoked for each versioned route starting from the lowest version

Expected result

No InvalidOperationException is thrown at all.

Actual result

InvalidOperationException is thrown then swallowed.

Additional detail

Here are some stack details:

Exception stack

Microsoft.AspNetCore.Builder.RouterMiddleware!Invoke
System.Runtime.CompilerServices.AsyncMethodBuilderCore!Start
Microsoft.AspNetCore.Builder.RouterMiddleware+<Invoke>d__4!MoveNext
Microsoft.AspNetCore.Routing.RouteCollection!RouteAsync
System.Runtime.CompilerServices.AsyncMethodBuilderCore!Start
Microsoft.AspNetCore.Routing.RouteCollection+<RouteAsync>d__10!MoveNext
Microsoft.AspNetCore.Routing.RouteBase!RouteAsync
Microsoft.AspNetCore.Routing.RouteConstraintMatcher!Match
Microsoft.AspNet.OData.Routing.VersionedODataPathRouteConstraint!Match
Microsoft.AspNet.OData.Routing.ODataPathRouteConstraint!Match
Microsoft.AspNet.OData.Extensions.HttpRequestExtensions!CreateRequestContainer
coreclr.dll!IL_Throw
coreclr.dll!RaiseTheExceptionInternalOnly
KernelBase.dll!RaiseException
ntdll.dll!RtlRaiseException
ntdll.dll!RtlDispatchException
ntdll.dll!RtlpExecuteHandlerForException
coreclr.dll!ProcessCLRException
coreclr.dll!ExceptionTracker::ProcessOSExceptionNotification
coreclr.dll!ExceptionTracker::ProcessManagedCallFrame
coreclr.dll!`coreclr_initialize'::`13'::`dynamic atexit destructor for 'bundle''
coreclr.dll!FireEtwExceptionThrown_V1
coreclr.dll!McTemplateCoU0zzpqhh_EventWriteTransfer
coreclr.dll!McGenEventWrite_EventWriteTransfer
ntdll.dll!EtwEventWriteTransfer
ntdll.dll!EtwpEventWriteFull
ntdll.dll!ZwTraceEvent
System.InvalidOperationException

Lock contention from walking the stack, which could be due to the high rate of exceptions.

ntoskrnl.exe!RtlBackoff
ntoskrnl.exe!ExfAcquirePushLockSharedEx
ntoskrnl.exe!ExAcquirePushLockSharedEx
ntoskrnl.exe!RtlpLookupUserFunctionTable
ntoskrnl.exe!RtlpLookupFunctionEntryForStackWalks
ntoskrnl.exe!RtlpWalkFrameChain
ntoskrnl.exe!RtlWalkFrameChain
ntoskrnl.exe!EtwpTraceStackWalk
ntoskrnl.exe!EtwpStackWalkApc
ntoskrnl.exe!KiDeliverApc
ntoskrnl.exe!KiApcInterrupt
ntoskrnl.exe!KeYieldExecution
ntoskrnl.exe!NtYieldExecution
xuzhg commented 5 months ago

@https://github.com/commonsensesoftware

@commonsensesoftware

commonsensesoftware commented 5 months ago

@fannydengdeng some initial questions:

  1. Which version of API Versioning are you actually using? a. You cited 4.1, which is very old (in software age 😉)
  2. Is there some reason you're still using OData 7.x and .NET Core 3.1? a. .NET Core 3.1 is end of life and no longer supported. b. Asp.Versioning.OData 6.0 (the successor to Microsoft.AspNetCore.OData.Versioning) and OData 8.0.2 would be the furthest back I would go with support for .NET Core 3.1, but you can go up to 6.4

CreateRequestContainer has affinity to HttpRequest as can be seen in the definition of its extension method. API Versioning computes all of its metadata and EDMs for OData before any requests are invoked. API Versioning doesn't call CreateRequestContainer directly (that I can find or recall).

One possibility, but it's just a theory, is that the request pipeline is being re-entered. Since this is still OData 7.x, it is using the legacy IRouter implementation for the routing system. It is possible for the matching process to make more than one matching pass. This was a systemic problem I worked with the ASP.NET team on when they were designing Endpoint Routing (which requires OData 8.0+). This might also happen if you have fallback middleware. That shouldn't be used for APIs IMHO, but to each their own. FYI, if that is your scenario.

A possible workaround around would be to clear the OData feature. If you have retries or fallbacks, that can be done at the start of the pipeline. If it's from re-entering route matching, it may have to be done elsewhere in the pipeline. I'm not 100% sure where. Clearing the feature is simple enough.

app.Use((context, next) => {
    context.Features.Set(default(IODataFeature));
    return next(context);
});

@xuzhg I did some digging and I can't find anywhere that API Versioning calls CreateRequestContainer. Frankly, I wouldn't expect it to. There are several cases of calling GetODataRootContainer, but I don't see anything that calls through. GetODataRootContainer is idempotent.

xuzhg commented 5 months ago

@commonsensesoftware Thanks for quickly digging into. Very appreciate it.

fannydengdeng commented 5 months ago

Thank you for the quick investigation!

Which version of API Versioning are you actually using?

We are using API Versioning 4.1 and MVC for routing.

Is there some reason you're still using OData 7.x and .NET Core 3.1?

Our services are on .NET 6 and have not migrated from OData 7.x to 8.x yet. Hence the need to stay on 4.1.

Yes from debugging I was seeing the route matching being re-entered for a single request (not a retry), though I don't know why. The service that was taking the performance hit has put in a workaround to sort the EdmModels by most frequently used ApiVersion during route registration so it alleviates the pressure for now.

We have plans to move to use endpoint routing once https://github.com/dotnet/aspnet-api-versioning/issues/1012 is sorted out. Or go directly to OData 8.x.

commonsensesoftware commented 5 months ago

@fannydengdeng,

The common thing that can cause re-entry into the routing system is a variance of route constraints in the route templates between API versions. For example, if you had /orders/{id}?api-version=1.0 and /orders/{id:int}?api-version=2.0, you'll observe SelectBestCandidates being hit more than once. I don't recall the order off-hand (it's been several years), but you should see it happen, at least, twice. API Versioning had to bootstrap a special catch-all route to deal with that behavior or it could make the wrong decision too early.

Are you somehow connected to dotnet/aspnet-api-versioning#1012? If not, you shouldn't be waiting. I'm waiting on a repro before I even consider whether any work will be done. All versions < 6.0 are deprecated and I only service them for critical bugs - sometimes. 8.0 is the most current release. API Versioning follows the official .NET support policy. .NET 6.0 support will end in November of this year.

If you're targeting .NET 6, then you should be using Asp.Versioning.OData 6.4.0. ASP.NET Core has an affinity to the .NET runtime. Using libraries that target earlier versions of .NET (e.g. .NET 5.0 or .NET Core 3.1) are not guaranteed to be compatible; hence, the major version bump. API Versioning does not officially support that type of version mixing and matching. I have witnessed runtime failures such as the dreaded MissingMethodException from those that have attempted to do so. Proceed down this path with extreme caution!

If you have more questions or comments on dotnet/aspnet-api-versioning#1012, please add them there. If you have more questions or need help in your transition, feel free to start a discussion or open an issue in the API Versioning repo.

fannydengdeng commented 5 months ago

Interestingly, SelectBestCandidate was only hit once during my debugging session. Here are some additional observations:

Question: is it expected that route template is "{*odataPath}" for all the routes?

This is how we set up the routing

var models = new EdmModelBuilder().CreateEdmModels().Values; // collection of models

app.UseMvc(builder =>
{
    builder.ServiceProvider.GetRequiredService<ODataOptions>().UrlKeyDelimiter = Parentheses;
    builder.MapVersionedODataRoutes("odata", string.Empty, models);
});

As for the transition, yes I'm connected to https://github.com/dotnet/aspnet-api-versioning/issues/1012. We have not been able to prioritize upgrading to OData 8.x hence we can't yet use Asp.Versioning.OData 6.4.0 which depends on OData 8.x.

commonsensesoftware commented 5 months ago

Question: is it expected that route template is "{*odataPath}" for all the routes?

Yes. OData < 8.0 uses its own matching process and routing conventions. ASP.NET Core sees this as the catch-all route template {*odataPath}, which greedily consumes the rest of the path. OData will convert that into dynamic Endpoint instances.

If you're able to post the simplest repro for dotnet/aspnet-api-versioning#1012 or for this issue, I'm happy to investigate further. I've been unable to repro the scenario myself.

fannydengdeng commented 5 months ago

Thank you @commonsensesoftware. Here is a simple repro for this issue https://github.com/fannydengdeng/ODataApiVersioning/tree/main/ODataApiVersioning

commonsensesoftware commented 4 months ago

After digging into this more, I understand what is happening. Apologies for not reading the OP more carefully. The issue was outlined pretty clearly.

@xuzhg The issue is related to a First-Chance Exception. No exception is actually observed, but as @fannydengdeng points out, try...catch is not cheap and can have a performance impact. Her scenario seems to be significantly impacted by it.

I understand why CreateRequestContainer is not idempotent. Arguably, it shouldn't be it. It's a violation of POLA to invoke a method called CreateXXX that doesn't actually create anything. I also see that, where this is being used, you can't use GetRequestContainer yet because the route name hasn't been resolved. If you are going fix/patch this, then my recommendation would be to introduce a new, internal method that is idempotent for this edge case. Something like:

internal static IServiceProvider GetRequestContainer(this HttpRequest request, string routeName) =>
    request.ODataFeature().RequestContainer ?? request.CreateRequestContainer(routeName);

This would address the issue without adding anything new to the public surface area (e.g. it can ship in a patch).

Ultimately, this issue is only indirectly related to API Versioning. OData should be handling this scenario more gracefully. API Versioning does not observe or handle the exception (that I've found). This issue does not exist in OData 8.0+. It's your call to service the change and in which versions.