tobymao / saq

Simple Async Queues
https://saq-py.readthedocs.io/en/latest/
MIT License
532 stars 37 forks

job stuck in active state if saq process got killed #127

Closed tiejunhu closed 2 months ago

tiejunhu commented 2 months ago

When the saq process doesn't exit cleanly (for example, when it is killed), its currently active jobs get stuck in the active state and are never retried after saq restarts.

I believe the heartbeat property is not designed for this scenario: when a job's heartbeat times out, the sweep aborts the job. In this case, though, the job should be retried.

I suggest that each job record its worker ID. If the sweep finds that the worker is no longer available, the job should be retried.
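
To make the suggestion concrete, here is a minimal in-memory sketch of the idea. It does not use saq's API or reflect its actual implementation; `ToyQueue`, `Job.worker_id`, `worker_ping`, `worker_ttl`, and the `now` arguments are all hypothetical names chosen for illustration. It assumes each worker periodically reports that it is alive, and that the sweep requeues any active job whose recorded worker has stopped reporting.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    key: str
    status: str = "queued"           # "queued" or "active"
    worker_id: Optional[str] = None  # hypothetical field: the worker that picked up the job
    attempts: int = 0


class ToyQueue:
    """In-memory stand-in for a queue backend (an illustration, not saq itself)."""

    def __init__(self, worker_ttl: float = 10.0) -> None:
        self.worker_ttl = worker_ttl            # seconds a worker may stay silent
        self.jobs: Dict[str, Job] = {}
        self.last_seen: Dict[str, float] = {}   # worker_id -> time of last ping

    def worker_ping(self, worker_id: str, now: float) -> None:
        # A live worker calls this periodically to announce itself.
        self.last_seen[worker_id] = now

    def worker_alive(self, worker_id: Optional[str], now: float) -> bool:
        seen = self.last_seen.get(worker_id)
        return seen is not None and now - seen <= self.worker_ttl

    def enqueue(self, key: str) -> None:
        self.jobs[key] = Job(key)

    def dequeue(self, worker_id: str) -> Optional[Job]:
        # Hand the first queued job to the worker and record the worker's ID on it.
        for job in self.jobs.values():
            if job.status == "queued":
                job.status = "active"
                job.worker_id = worker_id
                job.attempts += 1
                return job
        return None

    def sweep(self, now: float) -> List[str]:
        # Requeue active jobs whose worker is gone, instead of aborting them.
        retried = []
        for job in self.jobs.values():
            if job.status == "active" and not self.worker_alive(job.worker_id, now):
                job.status = "queued"
                job.worker_id = None
                retried.append(job.key)
        return retried
```

In this sketch, a job picked up by worker `w1` stays active while `w1` keeps pinging. Once `w1` has been silent for longer than `worker_ttl`, the next `sweep` puts the job back in the queue so another worker can take it. The existing heartbeat timeout could still be used separately to abort jobs that hang on a live worker.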