This PR fixes the issue of dead edge jobs staying in the edge_job table. If a worker died or was not able to update the state of a job, the task stayed in the table forever. To fix this, the job's last_update is checked against the SCHEDULER_ZOMBIE_TASK_THRESHOLD time to detect zombie tasks, and the state of such a job is set to REMOVED. A job in state REMOVED is deleted after the job_fail_purge time has elapsed.
Details about changes
Detect orphaned tasks after SCHEDULER_ZOMBIE_TASK_THRESHOLD
Add REMOVED job state
Remove jobs in state REMOVED after job_fail_purge time
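
The changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `EdgeJob` dataclass, the `mark_zombies` and `purge_removed` helpers, the set of active states, and the two timedelta values are all hypothetical stand-ins. It also assumes that marking a job REMOVED refreshes its last_update, so that the purge timer starts at that moment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed values for illustration; the real ones come from configuration.
SCHEDULER_ZOMBIE_TASK_THRESHOLD = timedelta(seconds=300)
JOB_FAIL_PURGE = timedelta(hours=8)

# Hypothetical set of states in which a worker is expected to report progress.
ACTIVE_STATES = {"queued", "running"}
REMOVED = "removed"


@dataclass
class EdgeJob:
    """Hypothetical stand-in for a row in the edge_job table."""
    job_id: str
    state: str
    last_update: datetime


def mark_zombies(jobs, now):
    """Set active jobs with a stale last_update to REMOVED."""
    for job in jobs:
        stale = now - job.last_update > SCHEDULER_ZOMBIE_TASK_THRESHOLD
        if job.state in ACTIVE_STATES and stale:
            job.state = REMOVED
            # Assumption: the purge timer starts when the job is marked.
            job.last_update = now


def purge_removed(jobs, now):
    """Return the jobs that remain after deleting expired REMOVED jobs."""
    return [
        job for job in jobs
        if not (job.state == REMOVED and now - job.last_update > JOB_FAIL_PURGE)
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    jobs = [
        EdgeJob("fresh", "running", now - timedelta(seconds=10)),
        EdgeJob("stale", "running", now - timedelta(hours=1)),
    ]
    mark_zombies(jobs, now)
    print([(job.job_id, job.state) for job in jobs])
```

Keeping detection and purging as two separate steps means a REMOVED job stays visible in the table for the purge interval before it is deleted.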