topepo opened 3 years ago
Hey, Max, glad to see you here. I was writing about forking and then I decided to perform a benchmark to enrich the vignette. I was expecting to corroborate your findings but I ended up with counter-intuitive results.
tl;dr: pure forking or pure threading wasn't the best: 2 threads with 4 workers was the fastest setup.
see here https://curso-r.github.io/treesnip/articles/threading-forking-benchmark.html
Do you think that it is worth it to consider these combinations? Or is it better to stick with the simple rule of thumb (tune -> forking; fit -> thread)?
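For concreteness, a mixed setup like the one in the benchmark could be sketched like this (worker/thread counts, the doParallel backend, and xgboost's nthread engine argument are illustrative choices here, not taken from the vignette):

```r
library(doParallel)
library(parsnip)
library(tune)

# 4 workers handle the resampling loop in tune_grid()...
registerDoParallel(cores = 4)

# ...while each individual model fit uses 2 threads internally,
# for 4 x 2 = 8 busy threads in total.
spec <- boost_tree(trees = 500) %>%
  set_engine("xgboost", nthread = 2) %>%
  set_mode("regression")
```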
That's really interesting! TBH I'm surprised that a combination like that works at all. Can you make a plot with the speed-up (sequential time / parallel time) on the x-axis?
I might run some of these too locally this weekend.
@topepo I'm running more benchmarks here and I think I spotted a potential issue you might want to check yourself to confirm: when I set vfold_cv(v = 3), only 3 workers were used, even with tune_grid() set to fit lots of different models. When I set vfold_cv(v = 8), I saw all 8 of my cores at 100%. My hypothesis is that tune_grid() is forking only over the folds loop.
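A minimal way to reproduce what I'm describing might look like this (a sketch with made-up data and grid sizes; watch a CPU monitor while it runs and compare the two resample objects):

```r
library(tidymodels)

# 8 workers available, but parallelism appears to follow the folds
doParallel::registerDoParallel(cores = 8)

folds_3 <- vfold_cv(mtcars, v = 3)  # only ~3 cores stay busy
folds_8 <- vfold_cv(mtcars, v = 8)  # all 8 cores stay busy

spec <- boost_tree(trees = tune()) %>%
  set_engine("xgboost") %>%
  set_mode("regression")

# Many grid points per fold, yet utilization tracks v, not v * grid
res <- tune_grid(spec, mpg ~ ., resamples = folds_3, grid = 20)
```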
Hi,
I'm using doFuture/doRNG parallel processing for my tidymodels workflows (for tuning) with other engines (apparently I need to load doFuture before using doRNG, but I'm still trying to confirm that):
library(doFuture)
registerDoFuture()
plan(multisession)
doRNG::registerDoRNG()
It fails when using treesnip with catboost. I get an error: Error in pkg_list[[1]]: subscript out of bounds.
This is because catboost and treesnip are not loaded on the workers (I can't fork because of RStudio; there is a consensus that you shouldn't fork from RStudio).
It works when I "register" the dependencies manually (see https://github.com/tidymodels/tune/issues/205):
set_dependency("boost_tree", eng = "catboost", "catboost")
set_dependency("boost_tree", eng = "catboost", "treesnip")
It could be useful to document that somewhere, or maybe there is a place where the set_dependency() calls could be included.
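Putting it all together, the sequence I ended up with looks roughly like this (a sketch: the set_dependency() calls run in the main session before the workers are started, so tune can load the packages on each worker):

```r
library(doFuture)
library(treesnip)

# Teach parsnip that this engine needs these packages on the workers
parsnip::set_dependency("boost_tree", eng = "catboost", "catboost")
parsnip::set_dependency("boost_tree", eng = "catboost", "treesnip")

# Then register the parallel backend as before
registerDoFuture()
plan(multisession)
doRNG::registerDoRNG()

# ... tune_grid() now attaches catboost/treesnip on each worker
```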
I would suggest that, when using tune, the standard foreach parallelism be used, and that the model-specific threading methods be used if just parsnip is being used to fit. Generally, parallelizing the resamples is faster than parallelizing the individual models (see the xgboost example). We always try to parallelize the longest-running loop.
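That rule of thumb could be sketched as follows (xgboost's nthread engine argument is used for illustration; other engines expose their own threading controls):

```r
library(parsnip)

# Tuning: parallelize the resamples with foreach, keep fits single-threaded
doParallel::registerDoParallel(cores = 8)
spec_tune <- boost_tree(trees = tune()) %>%
  set_engine("xgboost", nthread = 1) %>%
  set_mode("regression")

# Final fit: no resampling loop left, so let the engine use the threads
spec_fit <- boost_tree(trees = 500) %>%
  set_engine("xgboost", nthread = 8) %>%
  set_mode("regression")
```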