HenrikBengtsson / future

:rocket: R package: future: Unified Parallel and Distributed Processing in R for Everyone
https://future.futureverse.org

Optimizing future for a noob #464

Closed dlmatera closed 3 years ago

dlmatera commented 3 years ago

I am trying to integrate 40 GB of data with the Seurat package and have started using the future package (as recommended by the Seurat developers for large datasets). I have a few questions:

  1. I am currently using a cluster with up to 30 cores per node. I understand that using 30 cores isn't 30x faster (and it is also much more expensive). Given the likely diminishing returns, is there a way to find the optimal number of cores to use?
  2. I am using RStudio, so I can only use multisession and not multicore (forking). Is that a significant disadvantage? Should I expect faster results with multicore?
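
For reference, one rough way to approach question 1 is to time the same workload under a few different worker counts and see where the speedup flattens out. The sketch below is only an illustration: `slow_task()` and `time_with_workers()` are hypothetical helpers, and the sleep is a stand-in for a real Seurat step, so the numbers say nothing about actual integration runtimes. It uses only `plan()`, `future()`, `value()`, `availableCores()` and `supportsMulticore()` from the future package. The timings may also include some worker startup and data-transfer overhead, which is part of what makes the returns diminish.

```r
library(future)

# Hypothetical stand-in for one unit of the real workload.
slow_task <- function(i) {
  Sys.sleep(0.1)
  i^2
}

# Hypothetical helper: elapsed wall-clock seconds needed to create and
# resolve n_tasks futures on n_workers multisession workers.
time_with_workers <- function(n_workers, n_tasks = 4L) {
  old_plan <- plan(multisession, workers = n_workers)
  on.exit(plan(old_plan), add = TRUE)  # restore the previous plan afterwards
  system.time({
    fs <- lapply(seq_len(n_tasks), function(i) future(slow_task(i)))
    vapply(fs, value, numeric(1))
  })[["elapsed"]]
}

# Try a few worker counts, never more than the cores available here.
candidates <- unique(pmin(c(1L, 2L, 4L), as.integer(availableCores())))

timings <- data.frame(
  workers = candidates,
  elapsed = vapply(candidates, time_with_workers, numeric(1))
)
# Speedup relative to the smallest worker count tried.
timings$speedup <- timings$elapsed[1] / timings$elapsed
print(timings)

# Relevant to question 2: reports whether forked (multicore) processing
# is supported in the current R session.
print(supportsMulticore())
```

On a real cluster the candidate vector could be widened (for example up to 30) and `slow_task()` swapped for a small but representative slice of the actual analysis. The worker count beyond which `speedup` stops improving noticeably is a reasonable place to stop.
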
HenrikBengtsson commented 3 years ago

Hi. Could you please re-open this as a 'Discussion' at https://github.com/HenrikBengtsson/future/discussions? Not sure it'll work out, but I'm hoping to get to a point where 'Issues' is used only for bugs and feature requests. Thanks.

HenrikBengtsson commented 3 years ago

Closing since this is now in https://github.com/HenrikBengtsson/future/discussions/466. Thanks for moving it there.