Open ramonsmits opened 1 year ago
When a shutdown is initiated after this, the shutdown times out after 30 seconds:
The message pump failed to stop with in the time allowed(30s)
2023-05-23 08:46:34.0837|14|Error|NServiceBus.CustomChecks.TimerBasedPeriodicCheck|'ServiceControl.CheckRavenDBIndexLag' implementation failed to run.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
2023-05-23 08:54:53.2517|7|Warn|ServiceControl.Audit.Auditing.AuditPersister|Bulk insertion failed
System.Threading.Tasks.TaskCanceledException: A task was canceled.
2023-05-23 08:54:53.3611|7|Info|ServiceControl.Audit.Auditing.AuditIngestion|Ingesting messages failed
System.Threading.Tasks.TaskCanceledException: A task was canceled.
2023-05-23 09:31:34.1249|11|Error|NServiceBus.CustomChecks.TimerBasedPeriodicCheck|'ServiceControl.Audit.Auditing.FailedAuditImportCustomCheck' implementation failed to run.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
2023-05-23 10:44:40.3732|10|Info|Microsoft.Hosting.Lifetime|Application is shutting down...
2023-05-23 10:44:40.3732|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Start/stop semaphore acquiring
2023-05-23 10:44:40.3732|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Start/stop semaphore acquired
2023-05-23 10:44:40.3732|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Infrastructure shut down commencing
2023-05-23 10:44:40.3732|16|Info|NServiceBus.Raw.RunningRawEndpointInstance|Stopping receiver.
2023-05-23 10:45:10.4110|16|Error|NServiceBus.Transport.Msmq.MessagePump|The message pump failed to stop with in the time allowed(30s)
2023-05-23 10:45:10.4110|16|Info|NServiceBus.Raw.RunningRawEndpointInstance|Receiver stopped.
2023-05-23 10:45:10.4110|16|Info|NServiceBus.Raw.StoppableRawEndpoint|Initiating shutdown.
2023-05-23 10:45:10.4110|16|Info|NServiceBus.Raw.StoppableRawEndpoint|Shutdown complete.
2023-05-23 10:45:10.4110|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Infrastructure shut down completed
2023-05-23 10:45:10.4110|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Start/stop semaphore releasing
2023-05-23 10:45:10.4110|16|Info|ServiceControl.Audit.Auditing.AuditIngestion|Shutting down. Start/stop semaphore released
The main instance uses a custom CustomCheckManager which was not handling OperationCanceledException - this was rectified in https://github.com/Particular/ServiceControl/pull/3602.
The audit instance uses NServiceBus.CustomChecks v3. In that version, the custom check runner does not handle OperationCanceledException, hence we're seeing occasional errors for some of the audit instance custom checks when the instance is shut down during their run. In v4 of NServiceBus.CustomChecks the problem is fixed, so when we upgrade SC to use NSB 8 the audit instance will be sorted as well.
@jpalac @tmasternak so this issue can still show up in the log files of SC? Can't we just add a try/catch in the current custom checks code itself then?
Otherwise the log can still contain these entries until we upgrade. The log shouldn't contain such entries, so it feels as if this was closed too fast, as we don't know when SC will be upgraded to v8.
@ramonsmits This is not going to happen for the main instance (this has been fixed). With the audit instance, you are correct that we would need to tweak each existing custom check to prevent this from happening.
"we don't know when SC will be upgraded to v8."
Our assumption was that we should do this when we provide support for containerization.
I'm reopening this so that it stays open until either OCEs (OperationCanceledException) are caught in the custom checks or all instances have been upgraded to Core v8.
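For illustration, catching OCEs around a check could look roughly like the sketch below. It uses only .NET base class library types. `CancellationAwareRunner`, `CheckOutcome`, `RunCheck`, and the `logError` callback are hypothetical names made up for this example; they are not part of NServiceBus.CustomChecks or ServiceControl, and this is not the code from the pull request mentioned above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical outcome type, for illustration only.
public enum CheckOutcome
{
    Completed,
    CanceledByShutdown,
    Failed
}

// Hypothetical helper, not part of NServiceBus.CustomChecks or ServiceControl.
public static class CancellationAwareRunner
{
    public static async Task<CheckOutcome> RunCheck(
        Func<CancellationToken, Task> check,
        Action<Exception> logError,
        CancellationToken shutdownToken)
    {
        try
        {
            await check(shutdownToken).ConfigureAwait(false);
            return CheckOutcome.Completed;
        }
        // TaskCanceledException derives from OperationCanceledException,
        // so this filter covers both exception types.
        catch (OperationCanceledException) when (shutdownToken.IsCancellationRequested)
        {
            // Expected while shutting down: nothing is logged as an error.
            return CheckOutcome.CanceledByShutdown;
        }
        catch (Exception ex)
        {
            // A genuine failure, or a cancellation that shutdown did not cause
            // (for example a timeout inside the check). This is still logged.
            logError(ex);
            return CheckOutcome.Failed;
        }
    }
}
```

The exception filter only suppresses the error when the shutdown token itself has been cancelled, so a `TaskCanceledException` with another cause still reaches the log.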
Describe the bug
Description
Many TaskCanceledException entries are logged, but it is unclear why. This may happen during shutdown, but in that case these log entries should NOT be written and the TaskCanceledException should be handled gracefully.
Expected behavior
Improved logging that shows WHY these tasks are canceled, including which CancellationTokenSource is responsible for the cancellation.
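As a rough sketch of what such logging could be based on, the example below links a shutdown token with a per-operation timeout and, when cancellation occurs, reports which of the two was signalled. It assumes those are the two relevant sources, which is not confirmed for ServiceControl. `CancellationDiagnostics` and `RunAndDescribe` are hypothetical names, and only .NET base class library calls are used.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class CancellationDiagnostics
{
    public static async Task<string> RunAndDescribe(
        Func<CancellationToken, Task> operation,
        TimeSpan timeout,
        CancellationToken shutdownToken)
    {
        // One source for the timeout, linked with the caller's shutdown token.
        using var timeoutCts = new CancellationTokenSource(timeout);
        using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
            shutdownToken, timeoutCts.Token);

        try
        {
            await operation(linkedCts.Token).ConfigureAwait(false);
            return "completed";
        }
        catch (OperationCanceledException)
        {
            // Check the individual sources rather than the linked token,
            // so the message can state the cause of the cancellation.
            if (shutdownToken.IsCancellationRequested)
            {
                return "canceled: shutdown requested";
            }

            if (timeoutCts.IsCancellationRequested)
            {
                return $"canceled: timed out after {timeout}";
            }

            // Neither source was signalled, e.g. a library-internal timeout.
            return "canceled: source unknown";
        }
    }
}
```

The returned description could be written to the log next to the exception, so a reader can tell a shutdown-related cancellation apart from a timeout.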
Actual behavior
Many TaskCanceledException entries are logged, but it is unclear why.
Versions
Unclear, but likely 4.30.0
Steps to reproduce
Unknown
Relevant log output
Additional Information
Workarounds
Possible solutions
Additional information