dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License

Allow server Memory Pool to shrink #27394

Open mkArtakMSFT opened 3 years ago

mkArtakMSFT commented 3 years ago

Summary

Kestrel currently doesn't use the normal memory pool. It uses a byte array and keeps expanding it, without ever shrinking. This issue is about coming up with good logic for when and how the memory pool should shrink.

People with more context

@halter73, @shirhatti @davidfowl

Motivation and goals

Today the server implementations in ASP.NET Core (Kestrel, IIS, and HTTP.sys) do not use the ArrayPool; they use a custom pool called the SlabMemoryPool. The buffers are pinned because they are used for IO (mostly P/Invoke layers). We rarely pin user-provided buffers and can generally avoid fragmentation by pinning up front for the lifetime of the application (at least that was the idea).

This pool allocates 128K slabs of memory on the pinned object heap (POH) and slices each slab into 32 aligned 4K blocks. If there are no free blocks, a new 128K slab is allocated (32 more blocks). Before the POH existed, it relied on the 128K allocation being large enough to put the byte[] on the large object heap (LOH).
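To make the arithmetic above concrete, here is a minimal toy model of a grow-only slab pool. It is a sketch in Python, not the actual SlabMemoryPool code: the class name `ToySlabPool` and its methods are hypothetical, and pinning and alignment are not modelled. It only shows how a spike leaves slabs reserved after every block has been returned.

```python
SLAB_SIZE = 128 * 1024                      # bytes per slab, as described above
BLOCK_SIZE = 4 * 1024                       # bytes per block
BLOCKS_PER_SLAB = SLAB_SIZE // BLOCK_SIZE   # 32


class ToySlabPool:
    """Hypothetical model of a slab pool that grows on demand and never shrinks."""

    def __init__(self):
        self.slabs = []        # every slab ever allocated stays referenced
        self.free_blocks = []  # blocks currently available to rent

    def rent(self):
        # No free block: allocate a whole new slab (32 more blocks).
        if not self.free_blocks:
            self._allocate_slab()
        return self.free_blocks.pop()

    def give_back(self, block):
        # Returned blocks go back to the free list; the slab is never released.
        self.free_blocks.append(block)

    def _allocate_slab(self):
        slab = bytearray(SLAB_SIZE)
        self.slabs.append(slab)
        view = memoryview(slab)
        for i in range(BLOCKS_PER_SLAB):
            start = i * BLOCK_SIZE
            self.free_blocks.append(view[start:start + BLOCK_SIZE])

    def reserved_bytes(self):
        return len(self.slabs) * SLAB_SIZE


pool = ToySlabPool()
spike = [pool.rent() for _ in range(1000)]  # spike: 1000 blocks in use at once
for block in spike:
    pool.give_back(block)                   # spike is over, all blocks returned

print(len(pool.slabs), pool.reserved_bytes())  # prints: 32 4194304
```

In this model, 1000 concurrent 4K blocks need 32 slabs, and all 4 MiB stay reserved once the load is gone.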

Now for the big problem:

ASP.NET Core tries its best to avoid holding onto buffers from the pool for an extended period. It does this by delaying the allocation until there's data to be read from the underlying IO operation (where possible). This helps but doesn't solve the memory problem in a bunch of cases:

Traffic spikes result in allocating a bunch of memory in the MemoryPool that never gets removed. This is beginning to show up more in constrained container scenarios where memory is limited.

The goal is to reduce memory consumption when not at peak load.

Risks / unknowns

This is hard to get right and could become a configuration nightmare if we can't do enough automatically. It could also regress performance if we need to "collect memory" on the allocation path or any hot path.
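One possible shape for keeping "collect memory" work off the allocation path is sketched below. This is purely an illustration under stated assumptions, not a proposed design: `TrimmablePool`, `idle_seconds`, and `min_slabs` are hypothetical names, the model is Python rather than the real C# pool, and time is passed in explicitly so the behaviour is easy to follow. The idea is that `rent` only finds or grows, while a separate `trim` call (imagined as running on a background timer) drops slabs that are fully free and have been idle.

```python
BLOCKS_PER_SLAB = 32  # 128K slab / 4K block, per the description above


class Slab:
    def __init__(self, now):
        self.free = BLOCKS_PER_SLAB  # blocks in this slab not currently rented
        self.last_used = now         # last time a block was rented or returned


class TrimmablePool:
    """Hypothetical policy: release slabs that are fully free and idle."""

    def __init__(self, idle_seconds=60.0, min_slabs=1):
        self.slabs = []
        self.idle_seconds = idle_seconds  # assumed knob, not a real setting
        self.min_slabs = min_slabs        # assumed knob, not a real setting

    def rent(self, now):
        # Hot path: no trimming here, only find-or-grow.
        # (A linear scan keeps the sketch short; a real pool would use a queue.)
        slab = next((s for s in self.slabs if s.free > 0), None)
        if slab is None:
            slab = Slab(now)
            self.slabs.append(slab)
        slab.free -= 1
        slab.last_used = now
        return slab

    def give_back(self, slab, now):
        slab.free += 1
        slab.last_used = now

    def trim(self, now):
        """Intended to run from a background timer, never from rent()."""
        kept, released = [], 0
        for slab in self.slabs:
            idle = now - slab.last_used >= self.idle_seconds
            fully_free = slab.free == BLOCKS_PER_SLAB
            can_drop = len(self.slabs) - released > self.min_slabs
            if idle and fully_free and can_drop:
                released += 1  # dropping the reference lets the GC reclaim it
            else:
                kept.append(slab)
        self.slabs = kept
        return released


pool = TrimmablePool(idle_seconds=60.0)
rented = [pool.rent(now=0.0) for _ in range(1000)]  # spike at t = 0
for slab in rented:
    pool.give_back(slab, now=1.0)                   # spike over at t = 1

slabs_at_peak = len(pool.slabs)        # 32
released_early = pool.trim(now=30.0)   # 0, nothing has been idle for 60s yet
released_late = pool.trim(now=120.0)   # 31, one slab is kept as the minimum
```

Even this small sketch needs two tuning values, which hints at the configuration risk described above.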

ghost commented 3 years ago

Thanks for contacting us. We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

ghost commented 3 years ago

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to set better expectations regarding different types of issues, you can read our Triage Process.

davidfowl commented 1 year ago

We’ve survived this long without this because we keep making improvements to avoid allocating until there’s data. The most affected scenario might be large buffered requests/responses (not our default because System.Text.Json handles this well).

valentk777 commented 1 year ago

still reproducible when working with files (like pdf's) and getting or sending byte arrays

davidfowl commented 1 year ago

@valentk777 Can you share your scenario and the profile you're looking at?

halter73 commented 1 year ago

> still reproducible when working with files (like pdf's) and getting or sending byte arrays

@valentk777 This makes sense, but it should be mitigated by chunking up the writes/flushes when there's a large file or byte array. Are you seeing this with the built-in StaticFileMiddleware?
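As a rough illustration of what chunking up the writes and flushes could look like, here is a minimal language-neutral sketch in Python. It is not the StaticFileMiddleware or any ASP.NET Core API: `write_in_chunks` and `CHUNK_SIZE` are hypothetical, and an in-memory stream stands in for the response body. The point is only that the payload is handed to the output in bounded pieces, with a flush after each one, instead of in a single large write.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not a Kestrel setting


def write_in_chunks(payload, stream, chunk_size=CHUNK_SIZE):
    """Write payload to stream in fixed-size pieces, flushing after each one."""
    view = memoryview(payload)  # slicing a memoryview does not copy the data
    chunks = 0
    for start in range(0, len(view), chunk_size):
        stream.write(view[start:start + chunk_size])
        stream.flush()          # give the output a chance to drain this piece
        chunks += 1
    return chunks


pdf_bytes = bytes(1_000_000)    # stand-in for a large file held in memory
out = io.BytesIO()              # stand-in for the response body
chunks_written = write_in_chunks(pdf_bytes, out)

print(chunks_written, len(out.getvalue()))  # prints: 16 1000000
```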

amcasey commented 5 months ago

This hit appservice yesterday after a request spike and required some VM reboots to clean up. They're going to try to find more data on how prevalent it is.

davidfowl commented 5 months ago

VM reboot? Can just restart the app right? At a minimum can we add telemetry here? Right now this is a bit of a black box (using the POH as an approximation)

depler commented 4 months ago

Seems to be related: https://github.com/dotnet/aspnetcore/issues/55490