futureverse / future.apply

:rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures
https://future.apply.futureverse.org

two minor questions #80

Closed waynelapierre closed 3 years ago

waynelapierre commented 3 years ago

I have two minor questions regarding this great package. First, should it be plan(multicore) or plan("multicore") if I want to use all my computer's cores? Second, what is the difference between future_mapply and future_Map? Thanks.

HenrikBengtsson commented 3 years ago

... should it be plan(multicore) or plan("multicore") ...

Both do the same thing, so it doesn't matter, but I recommend the first. The latter form is useful when you don't attach the future package. For example, you can do future::plan("multicore"). Without quotes, you would have to do future::plan(future::multicore), or attach the package, e.g. library(future); plan(multicore).
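
For reference, the three forms mentioned in this reply can be written out side by side. This is a minimal sketch that assumes the future package is installed; the final reset to plan(sequential) is an addition for tidiness and is not part of the original reply.

```r
# Form 1: attach the package, then pass the strategy as a function.
library(future)
plan(multicore)

# Form 2: no attach needed; pass the strategy by name as a string.
future::plan("multicore")

# Form 3: no attach needed; pass the namespaced strategy function.
future::plan(future::multicore)

# Reset to the default sequential strategy (added here for tidiness).
plan(sequential)
```

Note that multicore relies on forked processing, so on platforms where forking is not supported it is not expected to run in parallel.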

... what is the difference between future_mapply and future_Map?

Since I don't have the time to explain this, I'll simply say it's the same as the difference between mapply() and Map() in base R. So, look at the help for those two, and other documentation sources that discuss them, and then the same applies to the future_:ized versions. Hope that helps.
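
To make the base R difference concrete, here is a small sketch using only base R: mapply() simplifies its result by default, while Map() always returns a list (it is mapply() with SIMPLIFY = FALSE). The variable names are illustrative, and per the reply above the future_ versions are expected to mirror this behavior.

```r
x <- 1:3
y <- 4:6
add <- function(a, b) a + b

# mapply() simplifies the result to an atomic vector when it can.
res_mapply <- mapply(add, x, y)

# Map() never simplifies; the result is always a list.
res_map <- Map(add, x, y)

# Map(f, ...) is equivalent to mapply(f, ..., SIMPLIFY = FALSE).
same <- identical(res_map, mapply(add, x, y, SIMPLIFY = FALSE))

print(res_mapply)
str(res_map)
```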

waynelapierre commented 3 years ago

thanks for the clarification! Just another question, if fun already uses multiple cores, will using future_lapply(x, fun) be a problem?

HenrikBengtsson commented 3 years ago

Good question. No, there's built-in protection against this. See Section 'Nested Futures and Evaluation Topologies' in https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html.

HenrikBengtsson commented 3 years ago

I should clarify the latter; ... as long as fun() respects common R options and environment variables that control the number of cores to use, e.g. mc.cores. If fun() uses a hard-coded number of workers as in mclapply(..., mc.cores = 5), instead of the default mc.cores = getOption("mc.cores", 2), there's nothing we can do about it. The worst-case scenario is when fun() uses mclapply(..., mc.cores = detectCores()). If they use mclapply(..., mc.cores = availableCores()), we're good.
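
The cases in this comment can be compared with a rough sketch. The helper functions below are hypothetical stand-ins for how fun() might choose its worker count; they are not part of any package. Setting mc.cores to 1 by hand is only an assumed way to simulate a parent process that limits its workers.

```r
# Hypothetical helpers: each mimics one way fun() might pick its worker count.
cores_from_option <- function() getOption("mc.cores", 2L)  # respects the mc.cores option
cores_hard_coded  <- function() 5L                         # ignores every setting
cores_detected    <- function() parallel::detectCores()    # all cores on the machine
cores_available   <- function() future::availableCores()   # cores this R process may use

# Simulate a parent process that restricts its workers to one core.
old <- options(mc.cores = 1L)
from_option <- cores_from_option()  # follows the restriction
hard_coded  <- cores_hard_coded()   # unchanged by the restriction
options(old)                        # restore the previous setting

print(c(from_option = from_option, hard_coded = hard_coded))
```

Only the option-based and availableCores() variants can be steered from the outside; the hard-coded and detectCores() variants use the same number of workers no matter what the calling process asks for.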

waynelapierre commented 3 years ago

ok, thanks for the clarification!

waynelapierre commented 3 years ago

What if fun uses multiple cores via RcppParallel or OpenMP?

HenrikBengtsson commented 3 years ago

What if fun uses multiple cores via RcppParallel or OpenMP?

I don't know; it depends on what settings the package coded with RcppParallel respects. It would be nice to know if they've got a standard. If you could inquire with them, that would be great.

For OpenMP, same. However, there is also a beta feature in future that currently applies only to the multicore backend, and only if the option future.fork.multithreading.enable is set to FALSE. It attempts to force single-threaded processing via the RhpcBLASctl package. See ?future::future.options and this issue tracker for some discussion around this.
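
A minimal sketch of how that option might be set, assuming the future package is installed and, for the OpenMP part, that RhpcBLASctl is available. The feature is described above as beta, so treat this as illustrative rather than guaranteed behavior.

```r
# Ask future to disable multi-threading inside forked workers (beta feature).
options(future.fork.multithreading.enable = FALSE)

library(future)

# The option is only described as applying to the forked (multicore) backend.
plan(multicore)

# Check that the option is in effect for this session.
multithreading_enabled <- getOption("future.fork.multithreading.enable")
print(multithreading_enabled)

# Reset to the default sequential strategy (added here for tidiness).
plan(sequential)
```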