Open dazbradbury opened 2 years ago
@odinserj - How feasible do you think it would be to provide a way to obtain the running jobs list, whilst ignoring cancelled jobs?
Database and application state are always potentially unsynchronised, because we work in an environment where any next line of code may never execute due to an unexpected process shutdown. It's simply impossible to perfectly synchronise two different physical entities whose state can be queried independently at different points in time.
I see two options for maintaining a more or less synchronised list of processing jobs:
Both tasks can be created as an extension filter, like CaptureCultureAttribute, and since an ideal implementation with a perfectly synchronised list is not possible anyway, this shouldn't be a part of Hangfire.Core. The ProcessingJobs method currently works like (2) anyway, and eventually aborted jobs will be processed again.
Thanks for your detailed response - I was wondering if it's feasible for the following flow:
1) Process cancellation token is fired, and OperationCanceledException is captured by Hangfire for the particular job
2) As currently, this is logged with:
Cancellation token fired and handled by job. Worker stop requested while processing background job 'XYZ'. It will be re-queued.
3) A flag is set in the database to state this job was cancelled
4) When the job is re-queued, this flag is reset
If (3) doesn't happen, then we're in the same state as now, so no harm done. If (4) fails, then much like a job failure it can be re-attempted without any side-effects / harm done.
Now when the call is made to:
JobStorage.Current.GetMonitoringApi().ProcessingJobs(...);
It can also return the state of this flag (as set in the DB), which would determine if a job has been cancelled successfully or not. Does this seem feasible? Whilst there is no guarantee of synchronisation here, it does provide the additional information where possible.
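As a rough illustration of what a caller could do today without changes to Hangfire.Core, the sketch below reads the processing list and drops entries that carry a cancellation flag. It assumes the flag is stored as a job parameter named "CancelledOnShutdown" with the value "true"; that name and the ProcessingJobsQuery helper are hypothetical, and something else in the application has to write the flag. As noted in this thread, the result is still only a best-effort view.

```csharp
using System.Collections.Generic;
using System.Linq;
using Hangfire;
using Hangfire.Storage;
using Hangfire.Storage.Monitoring;

public static class ProcessingJobsQuery
{
    // Hypothetical parameter name: not part of Hangfire, must be written by application code.
    public const string CancelledFlag = "CancelledOnShutdown";

    // Returns entries from the processing list whose flag is not set to "true".
    public static List<KeyValuePair<string, ProcessingJobDto>> GetNotFlaggedAsCancelled(int from, int count)
    {
        var monitoring = JobStorage.Current.GetMonitoringApi();

        using (var connection = JobStorage.Current.GetConnection())
        {
            // ToList() forces evaluation while the connection is still open.
            return monitoring.ProcessingJobs(from, count)
                .Where(entry => connection.GetJobParameter(entry.Key, CancelledFlag) != "true")
                .ToList();
        }
    }
}
```

This issues one extra storage read per processing job, which is probably acceptable for an alerting check but not for a hot path.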
Maybe, but it's totally unclear where to write this flag: the ProcessingJobs index doesn't hold any data, and can't in the current implementation and APIs. With the JobParameters table it will be hard to tell which execution the flag relates to. So I don't see a natural, general solution for this.
For a particular application and use case this can be done with a server filter that intercepts the OnPerformed phase, checks whether there's an OperationCanceledException and whether the context.Stopping token is activated, and records this flag somewhere, maybe even in the JobParameters table. There might still be cases where a job is reported as cancelled but is actually running now.
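A minimal sketch of such a filter might look like the following. The class name RecordCancellationAttribute and the parameter name "CancelledOnShutdown" are hypothetical. The shutdown check here uses context.CancellationToken.ShutdownToken, which is an assumption about how to express the "stopping token" mentioned above, and the exception is assumed to reach OnPerformed unwrapped. The write may also fail if storage is unavailable during shutdown, in which case the job simply stays unflagged.

```csharp
using System;
using Hangfire.Common;
using Hangfire.Server;

public sealed class RecordCancellationAttribute : JobFilterAttribute, IServerFilter
{
    // Hypothetical parameter name: not part of Hangfire.
    private const string CancelledFlag = "CancelledOnShutdown";

    public void OnPerforming(PerformingContext context)
    {
        // A new execution is starting (e.g. after a re-queue), so reset the flag.
        context.Connection.SetJobParameter(context.BackgroundJob.Id, CancelledFlag, "false");
    }

    public void OnPerformed(PerformedContext context)
    {
        // Treat the job as cancelled only if it threw OperationCanceledException
        // while the server's shutdown token was signalled.
        var cancelled = context.Exception is OperationCanceledException
            && context.CancellationToken.ShutdownToken.IsCancellationRequested;

        if (cancelled)
        {
            context.Connection.SetJobParameter(context.BackgroundJob.Id, CancelledFlag, "true");
        }
    }
}
```

The filter could be registered globally with GlobalJobFilters.Filters.Add(new RecordCancellationAttribute()), or applied as an attribute to individual job methods. Because the parameter is per job rather than per execution, it can still be stale in the window between a re-queue and the next OnPerforming call.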
Following https://github.com/HangfireIO/Hangfire/issues/2026, it became clear that the ProcessingJobs call includes jobs where the cancellation token has fired and OperationCanceledException has been thrown. In other words, it includes jobs that aren't actually running.
In order to be able to work out, and alert about, any jobs where the cancellation tokens didn't fire, it would be extremely useful if there was a way to obtain only actually running jobs, or filter out jobs that have thrown the OperationCanceledException from the ProcessingJobs list.
This would allow, for example, the ability to ignore certain jobs if they are stopped non-gracefully whilst alerting / flagging other jobs.
There is a current workaround: when a cancellation token is fired, Hangfire will log:
Worker stop requested while processing background job 'XXX'. It will be re-queued.
However, this means comparing the running list to the log messages, so it doesn't allow for conditional alerting / logic depending on the true state of the jobs during shutdown/cancellation events. Without being able to see the true running list, the choice is to alert every time the Hangfire server doesn't shut down gracefully, or never alert about it. It's not possible to ignore ungraceful shutdowns on particular jobs, for example.