swift-server / swift-service-lifecycle

Cleanly startup and shutdown server application, freeing resources in order before exiting.
https://swiftpackageindex.com/swift-server/swift-service-lifecycle/documentation/servicelifecycle
Apache License 2.0
396 stars 38 forks source link

Address some race conditions during the graceful shutdown process. #166

Closed tachyonics closed 8 months ago

tachyonics commented 11 months ago

Motivation:

Address and add tests for the following race conditions-

  1. If .signalSequenceFinished, .gracefulShutdownCaught or .gracefulShutdownFinished occur during graceful shutdown, gracefulShutdownIndex will be incremented without a service being shutdown causing the process to think the wrong service shutdown signal has been received next, failing shutdown incorrectly.
  2. If a signal is caught that isn't a cancellationSignal, the same thing will happen as (1).
  3. If a service throws out of order (the serviceGroup is waiting for ServiceB to gracefully shutdown but ServiceA throws an error), gracefulShutdownIndex will be incremented without waiting for the 'current' service to shutdown. This may incorrectly fail the shutdown process for some failureTerminationBehavior modes.
  4. The shutdownGracefully function cancels the group which may cause the cancellationCaught case in the _run loop to be triggered next. It is unneccesary to call cancelGroupAndSpawnTimeoutIfNeeded here in this case.

Modifications:

  1. Modified the task group result loop in shutdownGracefully to only increment the service iterator when the service being waited on has been confirmed as shut down.
  2. Track out-of-order service shutdowns using a mutable runningServices array.
  3. Add debug logs to signals that were previously silently consumed.
  4. Add a mechanism for injecting task group results for testing (if there is a better mechanism, let me know).
  5. Added unit tests to verify the race conditions addressed.

Result:

As verified by the added unit tests, the race conditions will be handled as expected.

FranzBusch commented 9 months ago

I landed both https://github.com/swift-server/swift-service-lifecycle/pull/171 and https://github.com/swift-server/swift-service-lifecycle/pull/170. Those two should fix all the issues that you have identified here from what I can see. I also brought over your unit tests that weren't injecting a stream since I want to avoid introducing test code into the implementation.

FranzBusch commented 8 months ago

@tachyonics closing this since I think all the problems that you listed here have been fixed on the latest release. Please feel free to reopen if you think they are not!