dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.Net.Http.CreditWaiter excessive memory consumption #50495

Closed antoniofreire closed 3 years ago

antoniofreire commented 3 years ago

Description

While profiling an application during a sudden and constant increase in memory usage, I discovered that a single instance of CreditWaiter was retaining 4 GB of memory.

![image](https://user-images.githubusercontent.com/2423234/113159670-2e139000-9213-11eb-9c50-917175af62e5.png)

![image](https://user-images.githubusercontent.com/2423234/113159809-497e9b00-9213-11eb-89f9-f7c7a392160c.png)

It appears that most of the consumption comes from the stacktrace strings retained by the CreditWaiter:

String duplicates inspection

Wasted: 1.513.826.492
Objects count: 415.203
Length: 1.812
String value:
   at System.Net.Http.CreditWaiter.<>c.<RegisterCancellation>b__7_0(Object s)
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
   at System.Net.Http.Http2Connection.Http2Stream.OnReset(Exception resetException, Nullable`1 resetStreamErrorCode, Boolean canRetry)
   at System.Net.Http.Http2Connection.ProcessRstStreamFrame(FrameHeader frameHeader)
   at System.Net.Http.Http2Connection.ProcessIncomingFramesAsync()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining)
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task`1.TrySetResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetExistingTaskResult(Task`1 task, TResult result)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.InvokeContinuation(Action`1 continuation, Object state, Boolean forceAsync, Boolean requiresExecutionContextFlow)
   at System.Net.Sockets.SocketAsyncEngine.System.Threading.IThreadPoolWorkItem.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
--- End of stack trace from previous location ---

Domination path for 'System.Net.Http.CreditWaiter', from inspection 'String duplicates' (the stack-trace string quoted above, truncated by the profiler):

Retains 1.41 GB in 415.2 K objects

System.Net.Http.CreditWaiter

Configuration

- Which version of .NET is the code running on? .NET 5.0.4
- What OS and version, and what distro if applicable? Docker on Linux, mcr.microsoft.com/dotnet/aspnet:5.0 base image
- What is the architecture (x64, x86, ARM, ARM64)? x64

Other information

- I think that this is a memory leak, since the memory usage grows indefinitely.

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details

Description: as above — while profiling an application during a sudden and constant increase in memory usage, a single instance of CreditWaiter was retaining 4 GB of memory, most of it in duplicated stack-trace strings (profiler output quoted above).

Configuration
- Which version of .NET is the code running on? .NET 5.0.4
- What OS and version, and what distro if applicable? Docker on Linux, mcr.microsoft.com/dotnet/aspnet:5.0 base image
- What is the architecture (x64, x86, ARM, ARM64)? x64

Other information
- I think that this is a memory leak, since the memory usage grows indefinitely.
Author: antoniofreire
Assignees: -
Labels: `area-System.Net.Http`, `untriaged`
Milestone: -
stephentoub commented 3 years ago

Thanks for the report. Could you share a repro?

geoffkizer commented 3 years ago

@stephentoub I'm wondering about CreditWaiter.ResetForAwait... should this be unregistering the cancellation token registration? Dispose calls UnregisterAndOwnCompletion, but ResetForAwait doesn't.

MihaZupan commented 3 years ago

Isn't ResetForAwait always called after TrySetResult, which in turn calls UnregisterAndOwnCompletion?
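For readers following the unregistration discussion, the public pattern in play looks like this: CancellationTokenSource.Register hands back a CancellationTokenRegistration, and the source keeps the callback (and any state it closes over) reachable until that registration is disposed. A minimal sketch of that pattern — not the actual CreditWaiter internals:

```csharp
using System;
using System.Threading;

class RegistrationDemo
{
    static void Main()
    {
        using var cts = new CancellationTokenSource();

        // Register a callback; the returned CancellationTokenRegistration
        // keeps the callback and its state object reachable from the
        // CancellationTokenSource until the registration is disposed.
        CancellationTokenRegistration reg = cts.Token.Register(
            static s => Console.WriteLine($"canceled: {s}"), "waiter-1");

        // If the wait completes normally, the registration must be
        // disposed; otherwise the source retains the callback's state
        // for as long as the source itself is alive.
        reg.Dispose();

        cts.Cancel(); // callback does not run: it was unregistered
    }
}
```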

stephentoub commented 3 years ago

I suspect this isn't actually a bug in CreditWaiter but rather a sign of something else going on in the app.

Looking at the upper-left corner of the screenshot, it doesn't show the full method name, but there are only two methods on Http2Connection that begin with "SendS": SendSettingsAckAsync and SendStreamDataAsync. It's unlikely to be the former, so I'll assume the latter. A back-of-the-napkin calculation from the compiled IL suggests the AsyncStateMachineBox for SendStreamDataAsync is ~168 bytes. The data shows 141.42 MB of these, which would mean almost 900 K instances of SendStreamDataAsync.

Similarly, it shows memory data for OperationCanceledException which suggests there are almost 800K of them.

Similarly, data for ExceptionDispatchInfo which suggests there are ~1.5M of them.

And, the string data shows ~1M strings all for the same stack trace.

I think what we're actually seeing here is around 1M concurrent HTTP/2 streams that were all waiting to send stream data and got canceled. The effect of getting canceled was that each of them captured a stack trace (inside CreditWaiter's cancellation token callback), resulting in gigabytes of string data for those stack traces. I can't explain why the tool attributes all of that to a single CreditWaiter, but it's likely a fluke of how the tool tracks GC references, since a waiter refers back to a stream, which refers back to a connection, which holds a collection of all of the streams associated with that connection.
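A simplified sketch of the mechanism described above: every canceled operation that throws and captures an OperationCanceledException carries its own stack-trace data, so N cancellations can retain N equal-content traces rather than one shared copy. This is illustrative only and does not reproduce the CreditWaiter internals:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices;

class StackTraceCostDemo
{
    static void Main()
    {
        var holders = new List<ExceptionDispatchInfo>();

        for (int i = 0; i < 3; i++)
        {
            try
            {
                throw new OperationCanceledException("stream reset");
            }
            catch (OperationCanceledException oce)
            {
                // ExceptionDispatchInfo.Capture snapshots the exception's
                // stack trace so it can be appended on rethrow; each
                // captured exception retains its own copy of the trace,
                // none of them shared with the others.
                holders.Add(ExceptionDispatchInfo.Capture(oce));
            }
        }

        Console.WriteLine(holders.Count); // 3 independently retained traces
    }
}
```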

MihaZupan commented 3 years ago

Could be because CreditWaiter forms a linked list of waiters.

geoffkizer commented 3 years ago

I suspect you are right @stephentoub.

There's also something like 700 MB in byte[], which is probably mostly buffers. That's a lot of buffers.

antoniofreire commented 3 years ago

Hi @stephentoub, sorry for the delay.

I've tried to come up with a repro, but so far I haven't been successful. I'll let you know as soon as I have one.

Let me give you a bit of context about this app: it receives change-notification events from a producer, translates the events to a formal contract, and notifies the subscribers of these changes over HTTP. It's basically a message broker.

I think the problem comes from the fact that, from time to time, there are sudden bursts of change notifications, to the point of producing hundreds of thousands of concurrent requests per minute.

I suspect that some subscribers aren't able to keep up with this level of concurrency, so a lot of connections are being made. Maybe I should limit the number of connections through the MaxConnectionsPerServer property, but I'm afraid that would result in a lot of contention.
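One way to bound the damage from such bursts is to throttle in-flight requests with a SemaphoreSlim and opt in to additional HTTP/2 connections on SocketsHttpHandler. A sketch under assumptions — the 256-request cap and the NotifyAsync shape are illustrative, not taken from the issue:

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class ThrottledClient
{
    // Cap in-flight requests so a notification burst cannot queue an
    // unbounded number of HTTP/2 streams behind one connection's
    // flow-control window. 256 is an arbitrary example value.
    private static readonly SemaphoreSlim Gate = new(256);

    private static readonly HttpClient Client = new(new SocketsHttpHandler
    {
        // Opt in to opening extra HTTP/2 connections once the first
        // one runs out of concurrent streams (available in .NET 5+).
        EnableMultipleHttp2Connections = true,
    });

    public static async Task NotifyAsync(Uri subscriber, HttpContent body,
                                         CancellationToken ct)
    {
        await Gate.WaitAsync(ct);
        try
        {
            using var response = await Client.PostAsync(subscriber, body, ct);
            response.EnsureSuccessStatusCode();
        }
        finally
        {
            Gate.Release();
        }
    }
}
```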

Maybe #44818 would be useful in this scenario.

karelz commented 3 years ago

Triage: it looks like we do not have much to make this actionable. As said above, there are ~1M outstanding requests/streams. Note that the HTTP/2 support you are using does not automatically create multiple connections (unless you opt in), and MaxConnectionsPerServer does not apply to it at all. If there is a repro, or more details demonstrating what is wrong where, we would be happy to help and fix it. Thanks!