EikeSchwass opened 1 year ago
The naive approach would be to introduce a rate limiting middleware, but hard coding the number of concurrent requests that are allowed seems problematic. Is there a way to configure ASP.NET Core so that it starts to throttle/rate limit the number of requests as it approaches thread starvation?
This is the right direction. Yes, hardcoding a number is not great, but it's what ASP.NET did (and HTTP.sys and the layers beneath it). .NET 7 has better options for rate limiting: it goes beyond plain concurrency limits and can be applied per endpoint (https://devblogs.microsoft.com/dotnet/announcing-rate-limiting-for-dotnet/).
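For reference, a minimal sketch of the .NET 7 `Microsoft.AspNetCore.RateLimiting` API mentioned above, using a concurrency limiter applied per endpoint. The policy name, the limit values, and `RunBlockingQuery` are illustrative assumptions, not settings from this thread:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Illustrative numbers only; tune them against your own load profile.
builder.Services.AddRateLimiter(options =>
{
    options.AddConcurrencyLimiter(policyName: "blocking-io", limiterOptions =>
    {
        limiterOptions.PermitLimit = 15 * Environment.ProcessorCount;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 5000 * Environment.ProcessorCount;
    });
});

var app = builder.Build();
app.UseRateLimiter();

// Apply the policy only where it's needed, e.g. endpoints that hit
// the synchronous Oracle calls (RunBlockingQuery is hypothetical).
app.MapGet("/report", () => RunBlockingQuery())
   .RequireRateLimiting("blocking-io");

app.Run();
```

Unlike the global `ConcurrencyLimiter` middleware, this lets the rest of the app keep serving cheap async endpoints while only the blocking ones are throttled.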
If you want an idea of some of the existing numbers for .NET Framework:
Throttling incoming requests to blocking endpoints is definitely the way to go here.
We assume we would need something like this: https://referencesource.microsoft.com/#System.Web/RequestQueue.cs,8
That isn't being used by ASP.NET. It's an older, less efficient queue that was used prior to it moving to native code.
@davidfowl thanks for the quick response. What is a good estimate for the total number of concurrent requests? How did ASP.NET decide how many it let through?
What is a good estimate for the total number of concurrent requests?
There's no good number and it's hard to bake a number into the framework. Applications have a much easier time with it because they can optimize for a specific load profile. Doing it in the server or framework means we need to make assumptions about the load profile of any application.
How did ASP.NET decide how many it let through?
Load testing on some specific scenarios and some guesstimating.
Applications that pick a number usually find the breaking point of the application by driving load to it and then observing metrics. Once you figure out where it breaks then reduce the concurrency number until the performance is reasonable.
I'd recommend driving load and observing metrics with https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-counters.
There's a high-level tutorial here https://learn.microsoft.com/en-us/dotnet/core/diagnostics/event-counter-perf
Here is the list of well-known counters https://learn.microsoft.com/en-us/dotnet/core/diagnostics/available-counters
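As a concrete example, one way to watch the thread pool while reproducing the issue (this assumes the tool is installed via `dotnet tool install --global dotnet-counters`; replace `<pid>` with your process id):

```shell
# List candidate .NET processes, then attach to the one under load
# (for in-process IIS hosting this is the w3wp.exe process).
dotnet-counters ps

# Monitor runtime counters (thread pool thread count, queue length,
# lock contention) alongside the ASP.NET Core hosting counters.
dotnet-counters monitor --process-id <pid> \
    --counters System.Runtime,Microsoft.AspNetCore.Hosting
```

A steadily growing `ThreadPool Queue Length` together with a climbing `ThreadPool Thread Count` is the typical signature of the thread starvation described in this issue.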
If you want a more sophisticated load tool then consider https://github.com/dotnet/crank (it's possible to make it work with IIS as well but it's not documented right now). This tool can drive load and also collect counters. Start simple and see if you can look at the counters locally while reproducing the issue on IIS.
@davidfowl The ASP.NET version of our app must have the maximum concurrent requests set somewhere though, right? I assumed the number is baked into ASP.NET somewhere, and for starters I would simply like to copy that limit directly to our Core version. We didn't configure anything in that regard for the Framework version, and it still throttles appropriately somehow.
The ASP.NET Version of our app must have the maximum concurrent requests set somewhere though right?
That's what I specified in the last message:
- System.Web request queue: 5000 * number of CPUs (`5000 * Environment.ProcessorCount`)
- IIS: I believe there's also a concurrent request limit, `appConcurrentRequestLimit`, which is 5000 by default (I'm not sure if that's per CPU)
- HTTP.sys queue: 1000, not 5000 (I tweaked it)
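For completeness, the IIS-side `appConcurrentRequestLimit` mentioned above lives on the `serverRuntime` element in applicationHost.config. A sketch of the relevant fragment, shown with the default value (verify against your IIS version before relying on it):

```xml
<!-- applicationHost.config, inside the <system.webServer> section -->
<serverRuntime appConcurrentRequestLimit="5000" />
```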
@davidfowl ah sorry I misunderstood. I thought that referred to the maximum queue length and not the maximum concurrent calls. Thanks for clearing that up! This has helped tremendously! <3
I updated the issue with the relevant settings in case you want to do more research. Let me know how it turns out.
@davidfowl completely eliminated the problem, so now only fine tuning is left. Thanks again!
@EikeSchwass Can you share your middleware configuration here to help future developers 😄 ?
@davidfowl sure!
We used Microsoft.AspNetCore.ConcurrencyLimiter
In our Startup.cs:
```csharp
public void ConfigureServices(IServiceCollection services)
{
    // ...
    services.AddStackPolicy(options =>
    {
        options.RequestQueueLimit = 5000 * Environment.ProcessorCount;
        options.MaxConcurrentRequests = Configuration.MaxConcurrentRequests * Environment.ProcessorCount;
    });
    // ...
}

public void Configure(IApplicationBuilder app, IWebHostEnvironment env, IHostApplicationLifetime appLifetime)
{
    // ... (no other middlewares before this)
    app.UseConcurrencyLimiter();
    // ...
}
```
and our appsettings.json:
```json
{
  "..."
  "MaxConcurrentRequests": "15"
  "..."
}
```
However, this value will most likely change as we do more testing. Nevertheless, it did fix the issue in our TEST environment.
Notice that the value gets multiplied by Environment.ProcessorCount.
I want to turn this into guidance.
@BrennanConroy this is a great use of the new rate limiting APIs
Thanks for contacting us.
We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue during our next planning meeting(s).
If we later determine that the issue has no community involvement, or that it's a very rare and low-impact issue, we will close it, so that the team can focus on more important and high-impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.
Since you mentioned Oracle - if you are using MySQL with the Oracle driver, do yourself a favor and switch to the open source one. We also experienced random complete process lockups (that did not recover) using the Oracle driver, even when all calls were async. Not a single hang since switching to MySqlConnector, after months and months of uptime.
Unfortunately the poor quality of Oracle's drivers is a real problem for the .NET community, since so many people use MySQL and probably assume it's .NET's fault when the entire process just stops working. Oracle's connectors should be blacklisted from NuGet...
Tangential but very relevant to the genesis of this thread: Oracle.ManagedDataAccess version 23+ allegedly (I'm going by release notes here, not personal usage) has support for async. However, at the time of writing, the v23 drivers do not seem to be fully released, with 21.12.0 the most recent version without the -dev suffix.
We are facing a very similar issue and are about to test the approach suggested in this thread. We would have appreciated this problem being more prominent in the official documentation / migration guidance 👍
Is there an existing issue for this?
Describe the bug
We are migrating a large code base from .Net Framework 4.7.2 ASP.NET to .NET 6 ASP.NET Core (Hosted in IIS 10).
Unfortunately we noticed a regression in high-load scenarios. The ASP.NET version was able to recover from load peaks, while the ASP.NET Core version enters thread starvation and only recovers if load is reduced, and even then it takes minutes. Because Oracle does not provide a true async API, a large part of our code base runs synchronously.
Our current hypothesis for the difference is the missing default request queue that was present in ASP.NET (?). The naive approach would be to introduce a rate limiting middleware, but hard coding the number of concurrent requests that are allowed seems problematic. Is there a way to configure ASP.NET Core so that it starts to throttle/rate limit the number of requests as it approaches thread starvation?
We assume we would need something like this: https://referencesource.microsoft.com/#System.Web/RequestQueue.cs,8
Expected Behavior
ASP.NET Core should not allow itself to be overloaded with requests; instead it should buffer them in that case, similar to how ASP.NET + IIS did it.
Steps To Reproduce
Exceptions (if any)
No response
.NET Version
6.0.403
Anything else?
We gathered some information from https://developercommunity.visualstudio.com/t/on-net-core-timeout-in-large-concurrency/693778#T-N694931