Closed joe-at-startupmedia closed 2 years ago
Yup, I can see that would be useful. I'll have more of a think about this, as I'd like to improve on monitoring/tracking of jobs (including being able to e.g. find all timed out jobs), which would help here.
I've got notes somewhere on a few solutions I was considering (separate queues based on final status as you suggested was one of them, maintaining an index somewhere to allow O(1) retrieval was another), I'll see if I can dig those out sometime.
This is how I solved it. Note that my branch is not merge-worthy due to my prefacing the key names with the namespace. https://github.com/joe-at-startupmedia/ocypod/commit/ed9ca25a8f1b07e8451f88029a2064992922c59e
I've noticed however that in addition to simply adding the job id to the timedout
list, several other scenarios need accounted for:
timed_out
to be expired from the system? The point here is that they've basically failed and I'd personally like to keep a persistent record of what failed without automatically purging based on the value specified for expiry_check_internal
. This way I can later manually requeue the job if need be (as explained in #23)timedout
list. Called by both check_job_expiry
and when deleting a job.
https://github.com/joe-at-startupmedia/ocypod/blob/bf9775b7e3c862e8c9446b7fa1c0036fc8660cb6/src/application/job.rs#L556timedout
list.
https://github.com/joe-at-startupmedia/ocypod/blob/bf9775b7e3c862e8c9446b7fa1c0036fc8660cb6/src/application/job.rs#L174timedout
list. Called by setting the status to cancelled
. https://ocypod.readthedocs.io/en/latest/api/#patch-jobjob_id
https://github.com/joe-at-startupmedia/ocypod/blob/bf9775b7e3c862e8c9446b7fa1c0036fc8660cb6/src/application/job.rs#L199server_info
function? Called by the GET/ info
request. The problem here is that the timed_out job id would present in both the new timedout
list as well as the ended
so it's unnecessary.
https://github.com/joe-at-startupmedia/ocypod/blob/bf9775b7e3c862e8c9446b7fa1c0036fc8660cb6/src/application/manager.rs#L82job_ids
function? Called by the GET /queue/{queue_name}/job_ids
request. The problem here is that the timed_out job id would present in both the new timedout
list as well as the ended
list resulting in duplicates.
https://github.com/joe-at-startupmedia/ocypod/blob/bf9775b7e3c862e8c9446b7fa1c0036fc8660cb6/src/application/queue.rs#L192Other concerns:
timed_out
which is a job status. Further TIMEDOUT
is two words and thus deserves an underscore separation.ended
and timedout
. The problem with this is that it will result in duplicates being present upon calling job_ids or info as explained in items 5 and 6 above.Example of duplicates:
curl localhost:8023/queue/[queue_name]/job_ids
{"running":[],"completed":[],"cancelled":[],"timed_out":[3,3],"queued":[],"failed":[]}
Hello Dave, great tool here. Here is my first issue. Jobs which have timed out, and no longer retried, are moved to the
ended
list. There is no remaining index (redis key) to retrieve jobs which have a) timed out and b) have zero retries left without querying each job with a timed_out status. Ideally it would be nice to achieve this in constant time not order of (n). The solution here would be to add another redis list calledtimed_out
which would allow you to retrieve the jobs which have timed_out. There is thefailed
list but they do not end up there after the retries have been exhausted. Thoughts?