Closed baxpr closed 2 years ago
btw I have not tested
Do jobs always enter the pending phase long enough for this to work? And if so, are we limiting the number of new jobs that launch each time dax runs? Is that what we want?
Ah, I don't know. Probably not.
The goal would be to limit the number of pending jobs to ~50 or whatever avoids too large of a hit when accre launches them all at once.
I'll add back a limit for total jobs as well, I think
... in which case, what do you think about a 2 sec sleep between individual job launches as well, in case accre is launching them immediately?
So overall, dax will launch as many jobs as possible each hour subject to a limit on pending jobs, a limit on total jobs, and a delay between individual launches.
All three of these settings can be exposed in the instance redcap.
Could throttle by pending uploads as well, for that matter - no new launches until e.g. pending uploads are <2000
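A rough sketch of how these throttles could fit together in one launch loop. This is not the dax implementation: `LaunchLimits`, `launch_throttled`, and the counter/submit callables are hypothetical names, and the default values are just the ballpark figures mentioned in this thread.

```python
import time
from dataclasses import dataclass


@dataclass
class LaunchLimits:
    # Hypothetical setting names, not the actual dax or REDCap field names.
    max_pending: int = 50             # cap on jobs waiting in the cluster queue
    max_total: int = 500              # cap on pending + running jobs
    max_pending_uploads: int = 2000   # no launches at or above this many uploads
    launch_delay: float = 2.0         # seconds to wait between submissions


def launch_throttled(tasks, limits, count_pending, count_total,
                     count_uploads, submit, sleep=time.sleep):
    """Submit tasks one at a time until any threshold is reached.

    The count_* arguments are callables that return current counts, so
    every check sees the queue state after the previous submission.
    Returns the list of tasks that were actually submitted.
    """
    launched = []
    for task in tasks:
        if count_uploads() >= limits.max_pending_uploads:
            break
        if count_pending() >= limits.max_pending:
            break
        if count_total() >= limits.max_total:
            break
        if launched:
            # Space out submissions in case the scheduler starts jobs immediately.
            sleep(limits.launch_delay)
        submit(task)
        launched.append(task)
    return launched
```

Passing the counters as callables keeps the sketch independent of how the queue is actually queried, and lets the thresholds be tested with a fake queue.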
Testing full dax manager
run on ROGERSTEST (rogersbp@hickory)
Requires new fields main_queuelimit_pending, main_limit_pendinguploads in the instance redcap. We need to document the instances panel and provide an initial data dictionary for it, similar to the project settings info in docs/dax_manager.rst
... have not implemented a delay yet
Tested ok for a full build/launch/upload cycle on a single project, a few assessors. Next, test thresholds
With thresholds set to 1, only 1 job got launched. Would be helpful to report why launching stopped in the log.
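One possible way to report why launching stopped, sketched under assumptions: `stop_reason` and the logger name are hypothetical and not part of dax. The idea is to name the first threshold that was reached and log it once.

```python
import logging

logger = logging.getLogger("launch_sketch")  # hypothetical logger name


def stop_reason(pending, total, uploads, max_pending, max_total, max_uploads):
    """Return a message naming the first threshold reached, or None."""
    if uploads >= max_uploads:
        return f"pending uploads {uploads} >= limit {max_uploads}"
    if pending >= max_pending:
        return f"pending jobs {pending} >= limit {max_pending}"
    if total >= max_total:
        return f"total jobs {total} >= limit {max_total}"
    return None


# Example matching the test above: thresholds of 1 with one job already queued.
reason = stop_reason(pending=1, total=1, uploads=0,
                     max_pending=1, max_total=1, max_uploads=2000)
if reason:
    logger.info("launching stopped: %s", reason)
```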
Launch delay is working. @bud42 this is ready for another look. Not sure how it will interact with #369 though
No. I meant to do that via Template, hang on
Looks great! Let's merge this prior to #369. Hopefully, git will work its magic!
@bud42 see if that should do it?
@bud42 can you look at this?
Instead of limiting launch based on total number of accre jobs, this will limit based on the number of pending accre jobs. This might help avoid hitting xnat with too many accre job starts at once.
It would also remove the limit on total running jobs, unless we add that in additionally. Not sure what effect that would have, but I think ACCRE's own scheduling will keep that under control just by keeping the pending list full.
If we install this, we would also need to drop the queue_limit setting in the instance dashboard to more like 20-50 instead of 200-500.