mb706 closed this issue 4 months ago
Thanks for reporting this. It is related to a new built-in protection added in parallelly 1.37.0 (2024-02-14):
makeClusterPSOCK(nworkers) gained protection against setting up too many localhost workers relative to number of available CPU cores. If nworkers / availableCores() is greater than 1.0 (100%), then a warning is produced. If greater than 3.0 (300%), an error is produced. These limits can be configured by R option parallelly.maxWorkers.localhost. These checks are skipped if nworkers inherits from AsIs, e.g. makeClusterPSOCK(I(16)). The current 3.0 (300%) limit is likely to be decreased in a future release. A few packages fail R CMD check --as-cran with this validation enabled. For example, one package uses 8 parallel workers in its examples, while R CMD check --as-cran only allows for two. To give such packages time to be fixed, the CRAN-enforced limits are ignored for now.
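To make the thresholds above concrete, here is a minimal base-R sketch of the rule as described in the release note. The helper classify_localhost_workers() and its arguments are hypothetical names made up for illustration; this is not parallelly's actual implementation, and the default limits are simply the 1.0 and 3.0 ratios quoted above.

```r
# Hypothetical helper, NOT part of parallelly: mimics the documented rule only.
# nworkers: requested number of localhost workers
# ncores:   number of available cores (what availableCores() would report)
# limits:   assumed c(warn, error) ratios, taken from the release note
classify_localhost_workers <- function(nworkers, ncores,
                                       limits = c(warn = 1.0, error = 3.0)) {
  # Values wrapped in I() inherit from "AsIs", so the checks are skipped
  if (inherits(nworkers, "AsIs")) {
    return("skipped")
  }
  ratio <- nworkers / ncores
  if (ratio > limits[["error"]]) {
    "error"
  } else if (ratio > limits[["warn"]]) {
    "warning"
  } else {
    "ok"
  }
}

classify_localhost_workers(4, ncores = 4)     # ratio 1.0, within limits
classify_localhost_workers(8, ncores = 4)     # ratio 2.0, above the warning limit
classify_localhost_workers(4, ncores = 1)     # ratio 4.0, above the error limit
classify_localhost_workers(I(4), ncores = 1)  # AsIs, checks skipped
```

The third call may correspond to the nested case reported in this issue, assuming an inner worker sees only one available core (mc.cores is 1 on worker processes): more than 3 inner workers then exceeds the 300% limit.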
I've thought about how best to handle this in future for nested setups like yours, and I think it's best if users declare that they really++ want to do nested parallelization by using the As-Is I(.) specification, i.e.
plan(list(tweak(multisession, workers = 4), tweak(multisession, workers = I(4))))
I've updated https://future.futureverse.org/articles/future-3-topologies.html to use this.
@HenrikBengtsson - Apologies, I just found my way to this issue after being very confused by the example in the "A Comprehensive Overview" vignette. I'm going to guess I'm not the only one who tried to adapt the example shown there. I appreciate that you have added more complete documentation in the topologies vignette, but I'm wondering whether it would be worth adding a note or additional explanation to the overview vignette to make this clearer?
Add/tracked in #745
When trying to run nested parallelization using multisession where the inner level has more than 3 workers, the parallelly package complains about overcommitting CPUs:

Am I missing something here? An example very much like this is in the overview vignette (ctrl-f tweak). Maybe the mc.cores option should be set to (the inner value of) nbrOfWorkers() instead of 1 on worker processes.

Since I am using future indirectly through another package, it was not an option for me to just overwrite mc.cores myself. My workaround for now is to set the parallelly environment variable that overrides the checks; this has to be done before setting the plan().