dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Exception from CancellationTokenSource in Finalizer of FileSystemWatcher taking down whole application #31283

Open videokojot opened 4 years ago

videokojot commented 4 years ago

Problem: We have run into a problem we cannot solve. Our application is taken down by this exception (this is everything we were able to log from it in the UnhandledException event), and we do not know how to locate its source. This is from our logs:

Unhandled exception AggregateException: One or more errors occurred. (Object reference not set to an instance of an object.); InnerException: Object reference not set to an instance of an object. caught by CurrentDomain.UnhandledException. Stacktrace: at System.Threading.CancellationTokenSource.CallbackNode.<>c.<ExecuteCallback>b__10_0(Object s)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
   --- End of inner exception stack trace ---
   at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
   at System.IO.FileSystemWatcher.StopRaisingEvents()
   at System.IO.FileSystemWatcher.Dispose(Boolean disposing)
   at System.ComponentModel.Component.Finalize()

Details: Some (maybe) relevant information:

Also, if we run kubectl logs ${pod} --previous, we can see this:

...
    State:          Running
      Started:      Thu, 24 Oct 2019 10:28:21 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    134
      Started:      Thu, 24 Oct 2019 10:01:53 +0200
      Finished:     Thu, 24 Oct 2019 10:28:20 +0200
...

The result from kubectl logs $(kubectl get pods --selector=app=dataconnector -o=name) --previous is the same as the logged exception, except that there is an additional line: Aborted (core dumped)
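For context, the Exit Code 134 in the pod status is consistent with the Aborted (core dumped) line. By shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), and 134 - 128 = 6, which is SIGABRT on Linux. A minimal sketch of that arithmetic follows; the helper name signal_from_exit_code is made up for illustration and is not part of kubectl or any other tool.

```shell
#!/bin/sh
# Hypothetical helper: map an exit code to the number of the signal that
# terminated the process, using the 128 + N shell convention.
signal_from_exit_code() {
    code="$1"
    if [ "$code" -gt 128 ]; then
        echo $((code - 128))
    else
        # Not a signal-style exit code: print nothing and report failure.
        return 1
    fi
}

sig="$(signal_from_exit_code 134)"
echo "signal number: $sig"
# Ask the shell for the name of that signal number.
kill -l "$sig"
```

So the exit code only says that the process aborted; it does not narrow down the cause beyond what the logged exception already shows.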

What we tried (nothing helped):

I am not sure whether this belongs here. FileSystemWatcher.cs lives in this repository, but the problem could also be in coreclr (because of CancellationTokenSource.cs) or in aspnetcore...

Do you have any thoughts? This is currently blocking us from moving forward, as all of our microservices are dying unexpectedly...

stephentoub commented 4 years ago

we are using .NET Core 2.2

What exact version of the runtime? This looks very similar to https://github.com/dotnet/corefx/issues/33844, which was fixed in .NET Core 2.2.1.

videokojot commented 4 years ago

@stephentoub we are using

FROM mcr.microsoft.com/dotnet/core/sdk:2.2 AS build-restore for build and FROM mcr.microsoft.com/dotnet/core/aspnet:2.2 AS runtime-image for runtime

It seems that they are currently using 2.2.402 (SDK) and 2.2.7 (runtime image), according to their Dockerfiles: https://github.com/dotnet/dotnet-docker/blob/master/2.2/sdk/stretch/amd64/Dockerfile and https://github.com/dotnet/dotnet-docker/blob/master/2.2/aspnet/stretch-slim/amd64/Dockerfile

We ran docker prune only a few days ago, so it is not possible that an older version of 2.2 is still cached...
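One way to confirm the exact runtime version, rather than inferring it from the Dockerfile, is to ask the image itself. The sketch below assumes docker is available and that GNU sort (with -V) is installed. dotnet --list-runtimes prints one line per installed shared framework, such as "Microsoft.NETCore.App 2.2.7 [/path]". The helper names version_at_least and netcore_version are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Assumes GNU sort, whose -V flag orders version strings numerically.
version_at_least() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: read `dotnet --list-runtimes` output on stdin and
# print the highest installed Microsoft.NETCore.App version.
netcore_version() {
    awk '$1 == "Microsoft.NETCore.App" { print $2 }' | sort -V | tail -n 1
}

# Usage (needs docker and network access, so it is left commented out):
# ver="$(docker run --rm mcr.microsoft.com/dotnet/core/aspnet:2.2 dotnet --list-runtimes | netcore_version)"
# version_at_least "$ver" 2.2.1 && echo "runtime $ver is at or above 2.2.1"
```

The same dotnet --list-runtimes command can also be run inside the deployed container, which rules out a stale image on the cluster nodes.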

Thank you for looking into this.