dotnet / extensions

This repository contains a suite of libraries that provide facilities commonly needed when creating production-ready applications.
MIT License
2.59k stars 741 forks source link

Stopping Multiple HostedServices in GenericHost using Ctrl-C or docker stop on container #1842

Closed chrisckc closed 4 years ago

chrisckc commented 5 years ago

Not sure if this is an issue or by design but i have been experimenting with the GenericHostSample code from here:

https://github.com/aspnet/Extensions/tree/master/src/Hosting/samples/GenericHostSample

and found a couple of things which may need to be addressed.

I have created a repo based on the above code to test and demonstrate the issues:

https://github.com/chrisckc/DotNetCoreGenericHostSample

The Readme.md explains the issues, but i will point them out here:

First issue:

The Sample does not indicate how to properly cleanup after running a task when using MyServiceA which inherits from BackgroundService. Using the Sample code as-is with just MyServiceA added, pressing Ctrl-C results in any code after

await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);

not running due to the fact that cancelling a running Task raises an Exception, here is the console output from the repo i created above:

dotnet run -p DotNetCoreGenericHostSample.csproj 
Application started. Press Ctrl+C to shut down.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:20:46 : MyServiceA constructed.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:20:46 : MyServiceA background task is starting.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:20:46 : MyServiceA is doing background work.
Hosting environment: Production
Content root path: /Users/........../DotNetCoreGenericHostSample/bin/Debug/netcoreapp2.2/
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:20:51 : MyServiceA is doing background work.
^C
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:20:52 : MyServiceA is stopping.
LocalHost:DotNetCoreGenericHostSample username$ 

Notice that the "MyServiceA background task is stopping" message is not shown. The issue can be rectified by changing the Delay code to this so that the Exception is swallowed:

await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken).ContinueWith(task => { });

Resulting in the following console output, notice that the "MyServiceA background task is stopping" message is shown and clean-up completed.

dotnet run -p DotNetCoreGenericHostSample.csproj 
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:05 : MyServiceA constructed.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:05 : MyServiceA background task is starting.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:05 : MyServiceA is doing background work.
Application started. Press Ctrl+C to shut down.
Hosting environment: Production
Content root path: /Users/........./GitHubProjects/DotNetCoreGenericHostSample/bin/Debug/netcoreapp2.2/
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:10 : MyServiceA is doing background work.
^C
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:11 : MyServiceA is stopping.
info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:11 : MyServiceA background task is stopping.
LocalHost:DotNetCoreGenericHostSample username$ info: DotNetCoreGenericHostSample.MyServiceA[0]
      13/06/2019 21:37:12 : MyServiceA background task clean-up completed.

LocalHost:DotNetCoreGenericHostSample username$ 

Second Issue:

When the GenericHost is stopping multiple concurrent HostedServices due to Ctrl-C, the services are stopped sequentially, each stop operation seems to be blocked until the HostedService which is being stopped has finished stopping, which could include running any required clean-up code etc.

This does not appear to cause an issue as long as the Tasks being ran in each HostedService are setup to accept the cancellation token as per the sample, however the logs look rather odd as you can see that another service starts its background work cycle after one service has announced that it is stopping. If the long running Tasks in each HostedService accept cancellation tokens, they are immediately cancelled just after being started so everything stops quickly (as quick as the time taken for any clean-up code to run). Really they should not have been started in the first place because of the check in the while loop while (!stoppingToken.IsCancellationRequested) but because each Stop operation is triggered sequentially, the cancellation token in each HostedService Instance is only triggered after the previous HostedService has finished stopping.

I created 3 new Services, MyServiceX, MyServiceY, and MyServiceZ which are basically copies of MyServiceA in the repo, to test the above scenario you will need to uncomment the Delay line in each of those services which accepts the cancellation token and comment out the one which does not. Also go into ProgramHelloWorld and uncomment those 3 services instead of A.

Third issue:

If MyServiceX, MyServiceY, and MyServiceZ use a Delay which does not accept a cancellationToken, the GenericHost crashes shortly after pressing Ctrl-C. This only happens when more than 1 of the services is added in the DI config and so the GenericHost attempts to stop more than 1 service.

To test this use the original code in MyServiceX, MyServiceY, and MyServiceZ using the Delay which does not accept cancellation token.

Here is what the console output looks like:

dotnet run -p DotNetCoreGenericHostSample.csproj 
info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:09:50 : MyServiceX constructed.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:09:50 : MyServiceY constructed.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:09:50 : MyServiceZ constructed.
info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:09:50 : MyServiceX background task is starting.
info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:09:50 : MyServiceX is doing background work.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:09:50 : MyServiceY background task is starting.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:09:50 : MyServiceY is doing background work.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:09:50 : MyServiceZ background task is starting.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:09:50 : MyServiceZ is doing background work.
Application started. Press Ctrl+C to shut down.
Hosting environment: Production
Content root path: /Users/......./DotNetCoreGenericHostSample/bin/Debug/netcoreapp2.2/
info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:09:55 : MyServiceX is doing background work.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:09:55 : MyServiceY is doing background work.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:09:55 : MyServiceZ is doing background work.
^C
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:09:56 : MyServiceZ is stopping.
LocalHost:DotNetCoreGenericHostSample username$ info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:10:00 : MyServiceX is doing background work.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:10:00 : MyServiceY is doing background work.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:10:00 : MyServiceZ background task is stopping.
info: DotNetCoreGenericHostSample.MyServiceZ[0]
      13/06/2019 22:10:01 : MyServiceZ background task clean-up completed.
info: DotNetCoreGenericHostSample.MyServiceY[0]
      13/06/2019 22:10:01 : MyServiceY is stopping.
info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:10:01 : MyServiceX is stopping.

Unhandled Exception: System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at Microsoft.Extensions.Hosting.Internal.Host.StopAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.WaitForShutdownAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at DotNetCoreGenericHostSample.ProgramHelloWorld.Main(String[] args) in /Users/...../DotNetCoreGenericHostSample/ProgramHelloWorld.cs:line 42
   at DotNetCoreGenericHostSample.ProgramHelloWorld.<Main>(String[] args)

To make this worse, when running in docker and inspecting to container logs after stopping the container, the Exception is not there, the last entry is

info: DotNetCoreGenericHostSample.MyServiceX[0]
      13/06/2019 22:10:01 : MyServiceX is stopping.

This can be fixed with a try catch and console log as mentioned in the Readme.md, but only for the ProgramFullControl setup.

Expected behavior

For the sample to demonstrate usage closer to real world and show how to properly stop and cleanup multiple HostedServices

Also real work usage will often require a delayed start of the HostedServices if they access databases and everything is running in docker due to startup timing issues. I have added a simple solution for this as mentioned in the Readme.m

I intend to add a copy of the BackgroundService base class in my repo and have a play around to find out the cause of the multiple services issue, but time has other demands at the moment.

Possibly loosely related:

https://github.com/aspnet/Extensions/issues/813

https://github.com/aspnet/Extensions/issues/1245

OS: MacOS 10.14.4 SDK 2.2.106

chrisckc commented 5 years ago

According to the Docs: https://docs.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-2.2&tabs=visual-studio#ihostedservice-interface

The cancellation token has a default five second timeout to indicate that the shutdown process should no longer be graceful. When cancellation is requested on the token:

Any remaining background operations that the app is performing should be aborted. Any methods called in StopAsync should return promptly.

However, tasks aren't abandoned after cancellation is requested—the caller awaits all tasks to complete.

I am not sure exactly what this means, the last sentence seems to contradict the previous statements.

To extend the default five second shutdown timeout, set:

ShutdownTimeout when using Generic Host. For more information, see .NET Generic Host.

In dotnet 2.2 there is no option to specify a shutdown timeout on either StopAsync() or WaitForShutdownAsync(), despite the Docs being for 2.2

If all stop operations ran concurrently, a low shutdown timeout should be ok, but as they seem to be running sequentially, a much longer timeout would be required.

analogrelay commented 5 years ago

Not sure if this is an issue or by design but i have been experimenting with the GenericHostSample code from here:

/src/Hosting/samples/GenericHostSample@master

A brief warning that the samples in this repo are not generally designed as documentation. They are samples for the engineering team to explore issues. Our "official" samples are part of our documentation. Having said that, if they are confusing, we should consider how we can address that :).

Re the first issue: It's definitely the case that a cancellation token firing will trigger OperationCancelledException. The general way to write clean-up code such that it runs even in cancellation is to put that clean-up code in a finally block, or explicitly catch and handle the OperationCancelledException. This will also ensure the clean-up code runs even when an unexpected exception is thrown. I'm surprised that the code you .Register to the stoppingToken doesn't run though (you mention that the MyServiceA is stopping. message is not logged).

When the GenericHost is stopping multiple concurrent HostedServices due to Ctrl-C, the services are stopped sequentially, each stop operation seems to be blocked until the HostedService which is being stopped has finished stopping

Yes, this is by-design. As you noted, our docs indicate "any methods called in StopAsync should return promptly". The last sentence in those docs doesn't quite contradict the previous statement though it isn't the clearest. When the shutdown becomes disgraceful (i.e. the StopAsync token is triggered), we don't change our logic, we continue awaiting your StopAsync task. It's up to your implementation to shutdown more promptly in that case. Hence the way it says "Any methods called in StopAsync should return promptly" when the token is triggerd, but also "tasks aren't abandoned after cancellation is requested". We could stop awaiting your StopAsync task and move on to the next service when the token is triggered, but we don't do that. Shutting down more quickly is up to your service.

you can see that another service starts its background work cycle after one service has announced that it is stopping

Yep, this is also to be expected. BackgroundService signals the ExecuteAsync cancellation token and then awaits your ExecuteAsync task. Thus if you don't shutdown quickly, another background service (who's StopAsync hasn't run yet) may start up and run.

analogrelay commented 4 years ago

Closing inactive discussion thread.