dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.42k stars 10.01k forks source link

App first request performance with many Minimal APIs #46313

Open JamesNK opened 1 year ago

JamesNK commented 1 year ago

Is there an existing issue for this?

Describe the bug

Found while investigating https://github.com/dotnet/aspnetcore/issues/46154

The app below has 30,000 endpoints. The first request to / takes 0.5 seconds with 280 MB memory usage:

using System.Diagnostics;
using System.Globalization;
using System.Reflection;
using Microsoft.AspNetCore.Http.HttpResults;
using Microsoft.AspNetCore.Http.Metadata;
using Microsoft.AspNetCore.Mvc;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.Use(async (HttpContext context, Func<Task> next) =>
{
    Console.WriteLine("Start time");
    Stopwatch stopwatch = Stopwatch.StartNew();

    await next();

    stopwatch.Stop();
    Console.WriteLine(stopwatch.Elapsed.TotalSeconds);
});

app.UseRouting();

Task Plaintext(HttpContext context) => context.Response.WriteAsync("Hello, World!");
for (int i = 0; i < 30_000; i++)
{
    var url = "/plaintext/nested" + i;
    app.MapGet(url, Plaintext);
}

app.MapGet("/", (HttpContext context) =>
{
    return context.Response.WriteAsync("Hello world");
});

Console.WriteLine("Running app");
app.Run();

If I change the Plaintext endpoint to be a minimal API (aka use RequestDelegateFactory) like so:

- Task Plaintext(HttpContext context) => context.Response.WriteAsync("Hello, World!");
+ string Plaintext() => "Hello, World!";

With 30,000 minimal APIs, it now takes 32 seconds to get the first request. And memory usage is 1,065 MB.

Expected Behavior

I expect a fast startup time.

I think the problem is RequestDelegateFactory is building and compiling expressions for all minimal API endpoints when routing starts. Creating a minimal API's expression should be lazy and wait until an endpoint is first visited.

Steps To Reproduce

No response

Exceptions (if any)

No response

.NET Version

No response

Anything else?

No response

JamesNK commented 1 year ago

cc @davidfowl

davidfowl commented 1 year ago

Yea, this is known, and we could definitely do a better job here. Also the above code isn't super realistic, maybe our benchmark should have different API shapes with argument types to better represent something more realistic. The reason I suggest that is because I can see 2 optimizations:

  1. Making delegate compilation lazy
  2. Caching similar delegates to reduce repeated compilation (would work great in the above scenario).
DamianEdwards commented 1 year ago

Didn't we have a test set of routes/APIs based on all the public Azure or MS REST APIs at one point? I seem to recall we used it to help determine endpoint routing performance in 3.0.

JamesNK commented 1 year ago

I think making delegate compilation lazy is the easiest fix here.

var buildExpressionLazy = new Lazy<RequestDelegate>(Builder, LazyThreadSafetyMode.ExecutionAndPublication);
RequestDelegate d = (context) =>
{
    var realDelegate = buildExpressionLazy.Value;
    return realDelegate(context);
};
return d;

// + Build method which handles calling off to RequestDelegateFactory to generate compiled expression.

Caching similar delegates is useful but could come later.

davidfowl commented 1 year ago

I agree, it doesn't solve the peak memory problem, but it does make it pay as you go.

halter73 commented 1 year ago

Lazily calling RequestDelegateFactory.Create has the possible issue that this will surface some errors later that used to happen on the first request. It also could introduce subtle ordering issues for filter factories and complicated IEndpointConventionBuilder.Add callbacks if they rely on some sort of global state.

I don't find this super likely to cause issues though, and the fact that the errors were never truly at startup and instead during first request might make the impact of this less significant. More analyzers that highlight these errors at compilation time will also mitigate this.

Caching similar RequestDelegates has some potential. Scenarios like this where endpoints are added programmatically or during source gen seem common enough. A developer could call RequestDelegateFactory.Create() and cache themselves if they start noticing startup issues, but that'd be extremely unobvious.

captainsafia commented 1 year ago

Triage: We need to identify what in the startup codepath is causing this (compiling LINQ expressions, something else?)

We'll also want to test how the source generated-endpoints impact this behavior.

captainsafia commented 1 year ago

Triage: We need to identify what in the startup codepath is causing this (compiling LINQ expressions, something else?)

We'll also want to test how the source generated-endpoints impact this behavior.

Follow up with some numbers from a run I did:

Start time (with RequestDelegate)
0.27807

Start time (with route handler)
1.2106882

Start time (with route handler and generator)
0.7631818

Looks like there's something else producing overhead in the source generator scenario...

davidfowl commented 1 year ago

The dictionary lookups?

DamianEdwards commented 1 year ago

Aren't those per-request? Or are they when the endpoint is built?

davidfowl commented 1 year ago

It’s during startup (the calls the Map*)