post_job never called for periodic jobs?

holtgrewe commented 3 years ago

Hi, I might be mistaken but for periodic jobs, the post_job handler appears to be never called?

trel commented 3 years ago

Does the pre_job handler run before each firing of the periodic? If so, then yes, agreed that the post_job should run after each firing as well.

holtgrewe commented 3 years ago

Yes, pre_job is called.

trel commented 3 years ago

we've confirmed a lack of firing.

discussing adding a new pair of well-named pre/post hooks - so, in total, there become a set of four:

pre_outer (once per job/task)
pre_inner (once per iteration, when periodic)
post_inner (once per iteration, when periodic)
post_outer (once per job/task)

the names are placeholders... input welcome.

candidates include:

pre/post_job (which one does this best suit, inner or outer?)
pre/post_iteration (most clear, but... ew?)
pre/post_periodic (not terribly clear whether this is inner or outer, or only applies to periodic jobs)
pre/post_task (also... a celery word)

or... these two, as a pair, convey they're related... but distinct and clear?

pre/post_job
pre/post_job_iteration

holtgrewe commented 3 years ago

I think "iteration" is not that clear as there are many iterations in the ingest process. I would suggest "cycle" as this appears to be a term related to periodic in Physics.

My 2 cents.

trel commented 3 years ago

ah, cycle could work...

pre/post_job
pre/post_job_cycle

d-w-moore commented 3 years ago

Ah, nice. it beats the names I was using, which were "pre_all" and "post_all" : )

d-w-moore commented 3 years ago

On second thought, that naming convention would rename existing callbacks ... so I'm thinking instead that we keep these as the "inner" calls to be called before and after each repetition (whether periodic or not):

pre_job
post_job

and add "outer" calls (if these are deemed desirable or even necessary):

setup_job
teardown_job

These two extra callbacks might, for example, convey the job_name in case the user had requirements for (de-)initialization based on that. @holtgrewe , any thoughts?

d-w-moore commented 3 years ago

All -- on more consideration, I'm going toward a simpler approach, one that uses the celery pre/post task mechanism, to re-implement the pre and post jobs. It may tie us harder to the choice of celery as a task queue, but I think it's the cleanest way right now to ensure proper operation. It will also make it easier to detect the failure and success of individual sync_dir invocations' success and failure, if that is a concern.

To revise my last comment above, since the setup and teardown are not a thing yet and havea not been explicitly called for , I'm going to leave those unimplemented for now.

trel commented 3 years ago

okay, go for it - and we'll see how that looks.

d-w-moore commented 3 years ago

Some clarifications are in order here - per team discussion and chats this morning:

_pre/post_job_ are called around each periodic call to the entire recursive scan. (This means for a non-periodic job, they each only get called once). Just to be explicit, they will not be called for each subdirectory encountered in the scan, only for the toplevel directory.
there will be no explicit "one-time" setup/teardown callback unless need for such is expressed from the community.

rmoreas commented 1 year ago

pre/post_job are called around each periodic call to the entire recursive scan

This seems not to be the case for version 0.4.0. Is this already implemented?

alanking commented 1 year ago

@d-w-moore can correct me if I'm wrong, but I believe that the comment is stating a requirement which has not yet been implemented, but will be in a future release. At present, my understanding is that post_job is never called, hence the issue.

Hope that helps!

d-w-moore commented 1 year ago

@alanking, right that was the plan, but post_job still does not fire.

rmoreas commented 1 year ago

Possible because done is always false if the job is periodic at line 81 of sync_task.py?

d-w-moore commented 1 year ago

Possible because done is always false if the job is periodic at line 81 of sync_task.py?

@rmoreas - Could be , will do some poking around. On a slightly different angle, I think we may have discussed getting rid of the periodic thing as it seems to introduce some chaos. Perhaps define a method that calculates an appropriate time for the next execution, based on the finishing time of the current task.

rmoreas commented 1 year ago

Our preferred platform to deploy this capability would be on Kubernetes. In that case a K8s CronJob may be a good alternatief to fire single pass jobs. These could be launched synchronously too so that the K8s CronJob controller can avoid launching the same job while another is still running, and do proper handling of missed deadlines. What do you think?

d-w-moore commented 1 year ago

In my mind, cron jobs ( even in Linux / Unix ) are a better facility for such things, being singly dedicated to that purpose. Kubernetes CronJob, being a more modern implementation than the Unix one, might have an even more robust approach. However, I have little familiarity or experience with K8s, so I'd be curious to see how you imagine the interface might work between K8s and the Automated Ingest.

irods / irods_capability_automated_ingest

post_job never called for periodic jobs? #139