Closed: kendonB closed this issue 1 year ago
Apparently, there is a way to restrict the maximum number of jobs running at a time. It will probably be a SLURM environment variable. You might look at `?future.options`.
This is why `drake` uses the `jobs` argument to set the maximum number of simultaneous jobs. Unfortunately, it does not apply to `future_lapply()`.
Internally, `batchtools::submitJobs()` is used. It takes an argument `sleep`. Its help says:

> If not provided (NULL), tries to read the value (number/function) from the configuration file (stored in reg$sleep) or defaults to a function with exponential backoff between 5 and 120 seconds.
I'm not sure what "exponential backoff between 5 and 120 seconds" really means. @mllg, does this mean that the sleep time grows exponentially from a minimum of 5 seconds to a maximum of 120 seconds between jobs?
Now, `future.batchtools` does not support specifying this `sleep` argument (so it uses the default). I've added FR #14 for this.
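For reference, here is a minimal sketch (using a throwaway batchtools registry, purely for illustration) of what passing `sleep` to `submitJobs()` directly looks like; `future.batchtools` itself would need FR #14 to expose something equivalent:

```r
library(batchtools)

# Throwaway registry just for illustration; future.batchtools creates and
# manages its own registries internally.
reg <- makeRegistry(file.dir = NA)
batchMap(function(x) x^2, x = 1:10)

# 'sleep' may be a constant (in seconds) or a function of the attempt number,
# used to back off when the scheduler reports a temporary submission error.
submitJobs(sleep = function(attempt) min(5 * attempt, 60))
```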
@wlandau-lilly, I have to think more about whether `future_lapply()` should have a `future.max.futures.at.any.time`-ish argument or whether that should/could be controlled elsewhere. I haven't thought about it much before, so I don't have a good sense right now. (Related to https://github.com/HenrikBengtsson/future/issues/159 and possibly also to https://github.com/HenrikBengtsson/future/issues/172.)
`future_lapply()` will distribute the N tasks to all K workers it knows of. For workers on an HPC scheduler, the default is K = +Inf. Because of this, it will distribute N tasks to N workers, that is, one task per worker, which is equivalent to one task per submitted job. In other words, if N is very large, `future_lapply()` may hit the scheduler too hard when using `plan(batchtools_slurm)`.
If you look at `?batchtools_slurm`, you'll see the argument `workers`, which defaults to `workers = Inf`. (I do notice it is poorly documented/described.) If you use `plan(batchtools_slurm, workers = 200)`, then `future_lapply()` will resolve all tasks using K = 200 jobs. This means that each job will do single-core processing of N/K tasks.
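Roughly, the difference between the two setups looks like this (the template file name is a placeholder, and note that `future_lapply()` nowadays lives in the future.apply package):

```r
library(future.batchtools)
library(future.apply)  # future_lapply(); it originally shipped with 'future'

# Default: workers = +Inf, so each of the N elements becomes its own
# submitted job -- potentially thousands of jobs at once.
plan(batchtools_slurm, template = "slurm.tmpl")

# Capped: K = 200 jobs; each job processes roughly N/K elements sequentially.
plan(batchtools_slurm, template = "slurm.tmpl", workers = 200)

y <- future_lapply(1:5000, function(x) sqrt(x))
```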
Comment: The main rationale for the `workers` argument for the `batchtools_nnn` backends is that even if you could submit N single-task jobs, the overhead of launching each job is so high that the total overhead of launching jobs will significantly dominate the overall processing time.
Original comment: To update, I have been happily using `workers = N` to work around this problem. The highest I've tried is `workers = 500` and it worked fine.

Updated comment: The original version of this comment was plain wrong; the error just hadn't shown up yet. `workers = 500` fails, `workers = 300` fails, and `workers = 200` works fine. Even when requesting more than 200, a bunch of jobs do start, and since drake is in charge, those resources aren't wasted.
@HenrikBengtsson, from drake's point of view, this so-called "workaround" is actually an ideal solution in its own right. Here, imports and targets are parallelized with different numbers of workers, which is the right approach for distributed parallelism:
```r
library(drake)
library(future.batchtools)
future::plan(batchtools_local, workers = 8)
# 4 jobs for imports, 8 jobs for targets:
make(my_plan, parallelism = "future_lapply", jobs = 4)
```
I will recommend this approach in the documentation shortly.
> Apparently, there is a way to restrict the maximum number of jobs running at a time. It will probably be a SLURM environment variable. You might look at `?future.options`.
Yes. It was buried in the configuration, but in the next version you can also control it by setting the resource `max.concurrent.jobs`.
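A hedged sketch of the resource form described above (assuming a batchtools version that recognizes it, and a registry with jobs already defined); the same limit can reportedly also be set in the batchtools configuration file:

```r
library(batchtools)

# Assumes a registry with mapped jobs already exists (see ?makeRegistry, ?batchMap).
# 'max.concurrent.jobs' caps how many jobs are queued/running at any one time;
# further submissions wait until earlier jobs finish.
submitJobs(resources = list(max.concurrent.jobs = 200))
```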
> I'm not sure what "exponential backoff between 5 and 120 seconds" really means. @mllg, does this mean that the sleep time grows exponentially from a minimum of 5 seconds to a maximum of 120 seconds between jobs?
Exactly. The sleep time for iteration `i` is calculated as `5 + 115 * pexp(i - 1, rate = 0.01)`.
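To make the shape of that default concrete, the first few values and the limit work out to:

```r
backoff <- function(i) 5 + 115 * pexp(i - 1, rate = 0.01)
round(backoff(1:5), 1)
#> [1] 5.0 6.1 7.3 8.4 9.5
backoff(Inf)
#> [1] 120
```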
But note that I recently discovered a bug such that there was no sleeping at all :disappointed: This is fixed in the devel version, which I plan to release this week.
There is currently no support for controlling the submission rate. I could, however, use the reported error message and treat the error as a temporary error, which would then automatically trigger the sleep mechanism of `submitJobs()` described above.
This problem appears to be solved with the latest version of batchtools. Feel free to close.
Related to this issue: I've changed the default number of workers on HPC schedulers from `+Inf` to `100` in the next release (commit 1a547d99). The default can be set via an option or an environment variable.
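A sketch of overriding that new default; the option and environment-variable names below are my guess at the spelling, so verify them against the future.batchtools documentation/NEWS:

```r
# Assumed option name -- check ?future.batchtools before relying on it.
options(future.batchtools.workers = 200)

# Or, before R starts (assumed name):
#   export R_FUTURE_BATCHTOOLS_WORKERS=200

library(future.batchtools)
plan(batchtools_slurm)  # picks up the configured default number of workers
```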
My SLURM system got upset when submitting a large number of jobs:

Perhaps one could solve this with an interface to the `sleep` option in `batchtools::submitJobs()`?