MiniProfiler / dotnet

A simple but effective mini-profiler for ASP.NET (and Core) websites
https://miniprofiler.com/dotnet/
MIT License
2.91k stars 601 forks source link

Unobserved faulted tasks since upgrading to 4.0.138 #361

Closed dazbradbury closed 4 years ago

dazbradbury commented 5 years ago

Since upgrading from V3.2.0.157, to V4.0.138, we're seeing a large number of unobserved faulted tasks as such:

Unobserved faulted task: AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread.

InnerException:
    NullReferenceException: Object reference not set to an instance of an object.
       at System.Web.ThreadContext.AssociateWithCurrentThread(Boolean setImpersonationContext)
   at System.Web.HttpApplication.OnThreadEnterPrivate(Boolean setImpersonationContext)
   at System.Web.HttpApplication.System.Web.Util.ISyncContext.Enter()
   at System.Web.Util.SynchronizationHelper.SafeWrapCallback(Action action)
   at System.Threading.Tasks.Task.Execute()

AggregateExceptions:
{
    NullReferenceException: Object reference not set to an instance of an object.
       at System.Web.ThreadContext.AssociateWithCurrentThread(Boolean setImpersonationContext)
   at System.Web.HttpApplication.OnThreadEnterPrivate(Boolean setImpersonationContext)
   at System.Web.HttpApplication.System.Web.Util.ISyncContext.Enter()
   at System.Web.Util.SynchronizationHelper.SafeWrapCallback(Action action)
   at System.Threading.Tasks.Task.Execute()
}

This is in a multi-server environment, using SQL Server storage.

It seems like perhaps miniprofiler has some Tasks that are throwing uncaught exceptions?

Many people may not notice these, as you need to be hooked into TaskScheduler.UnobservedTaskException, but would be keen to understand the root cause so they can be cleaned up.

NickCraver commented 5 years ago

We haven't seen this anywhere else AFAIK, but due to the nature of async continuations, any calls anywhere that were previously synchronous through MiniProfiler layers and are now properly async as fair game for reporting back up the chain. For example any async connection is now allowed to be fully async, and complete after the request unlike previously where it wasn't really supported. In short, we fully allow you to shoot yourself in the foot with unawaited tasks (as any library should really), starting in 4.x.

Can you get any information on what is throwing? I'd suggest running an analyzer and looking for anything unawaited. For example, var result = myConnection.QueryAsync<int>("..."); won't result in a compiler warning, but isn't awaited at all and will cause this type of behavior. I've unfortunately seen this soooo many times.

dazbradbury commented 5 years ago

We're using Miniprofiler in the specified (and reasonably straight forward) manner, using blocks around timings, and a start / end on pre-request / end request. I don't believe we're using any Async methods directly, at all. For example:

StackExchange.Profiling.MiniProfiler.StartNew(); StackExchange.Profiling.MiniProfiler.Current?.Stop(); ... using (profiler.Step("Time this named section")) { ... }

As such, is your suggestion to pull in the miniprofiler source code and run an analyser on it directly? I could certainly take a look.

Otherwise, unless I'm missing something, it would seem the library itself is causing the unobserved faulted tasks, using the standard library methods.

NickCraver commented 5 years ago

It's a bit confusing (as always with async), but MiniProfiler can be the victim of any async unawaited code downstream (same as any other library). Or it may just be unrelated completely to the issue.

MiniProfiler explicitly calls .ConfigureAwait(false) internally to not care about the outer sync context, so as long as you're awaiting the result, you'll "snap back into the world" as it were and regaining context there. It's that process that's blowing up, because something has likely run past the lifetime of an HTTP request and is trying to complete...but its sync context (the request) is already gone. This is most often the result of non-awaited code.

We should have heard many, many reports of this if it's inside MiniProfiler, but this is all I've heard about. I just ran another batch of analyzers across the codebase to find anything non-no-contexted or unawaited, still nothing there. And MiniProfiler kicks off no tasks on its own (it simply has no scheduled or async operations to perform).

I think your best bet at finding the leak is a memory dump here, and seeing what's local/available on that context. This isn't uncommon by the way, I've seen it across a dozen libs and dozens of cases...and it's always a "ohhhhhhh *@#$%" moment when you find that unawaited async call.

dbolin commented 5 years ago

It's not necessarily because of not awaiting something within the request flow - I've seen this happen when the request is cancelled e.g. by the client, and so the request processing would end almost immediately after that, and any tasks in progress would attempt to resume in the no longer existent sync context.

I don't know if passing the cancellation token for the request into all the async calls would have avoided this... I would assume so, but didn't test that theory.

NickCraver commented 4 years ago

Closing this out due to inactivity and no more reports - it's very likely application specific and we'd need to do a lot more debugging to find the cause. I tried to lay out how this happens above.

On cancellation tokens: we don't want to inherently throw away the profilers for every request, so that's not a trivial change to make w.r.t. behavior. I think the current approach remains the best option for the vast majority of users.