eriqande / rubias

identifying and reducing bias in hierarchical GSI

Parallel processing of 100% simulations #27

Open bensutherland opened 4 years ago

bensutherland commented 4 years ago

Hello Eric and developers, Would it be possible to develop parallel processing of assess_reference_loo() ? This would greatly speed up 100% simulations when dealing with large reference baselines. Perhaps there is a reason why this is not possible, but I thought I would check. Many thanks and all the best, Ben

eriqande commented 4 years ago

Hi Ben,

The natural solution would be to swap out a couple of lapply()s with some mclapply()s within that function. On Macs and Linux that would allow forking to compute different iterations in parallel. Windows does not support forking, apparently, so no speedup would be seen on Windows. Are you running this on a Windows system, or do you have a Linux box you are running them on?
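Roughly, the swap I have in mind looks like this (a generic sketch, not the actual rubias internals):

library(parallel)

# serial version: the work for each element happens one after another
res_serial <- lapply(1:8, function(i) sqrt(i))

# forked version: on Mac/Linux the elements are farmed out to mc.cores
# child processes; on Windows forking is unavailable, so mc.cores must
# stay at 1 and there is no speedup
res_parallel <- mclapply(1:8, function(i) sqrt(i), mc.cores = 4)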

bensutherland commented 4 years ago

Thank you Eric. Yes indeed we are using Linux as well, so this would certainly help us out, especially with some of our larger baselines that we want to run. My last run was around 35 hrs, so it would be super to get a speedup by running in parallel with mclapply() as you describe.

eriqande commented 4 years ago

Wow, Ben! 35 hours is intense. Could you send me the code you are using to do that run? I just want to see how many different scenarios are being explored and what types of options are being used, so I know where to tailor the parallelization.

eric


bensutherland commented 4 years ago

Hi Eric, Thanks again for your help. I can send the full code and baseline file if that would help, but for simplicity, I strongly believe this is the step that takes up most of the compute time:

all_collection_results <- assess_reference_loo(
  reference = rubias_base,
  gen_start_col = which(colnames(rubias_base) == "indiv") + 1,
  reps = 100,
  mixsize = 200,
  alpha_collection = all_collection_scenario
)
# note: all_collection_scenario (passed as alpha_collection) is a named list requesting a 1.0 proportion for each collection in the baseline
# note: rubias_base is a rubias baseline file with 425 populations and 31,574 individuals
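For reference, a scenario list like that can be built from the collection names in the baseline along these lines (an illustrative sketch, not necessarily our exact code; the per-scenario format accepted by alpha_collection should be checked against the rubias documentation):

library(tibble)

# sketch: one "100%" scenario per collection, each requesting that the whole
# simulated mixture come from a single collection (ppn = 1.0)
colls <- unique(rubias_base$collection)
all_collection_scenario <- lapply(colls, function(coll) tibble(collection = coll, ppn = 1.0))
names(all_collection_scenario) <- colls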

...after running that assess_reference_loo() call, the standard output to the R console is:

++++ Starting in on repunit_scenario 1 with collection scenario Tatchun_R ++++
Doing LOO simulations rep 1 of 100
Doing LOO simulations rep 2 of 100
Doing LOO simulations rep 3 of 100
Doing LOO simulations rep 4 of 100

...followed by a repunit scenario for each of the 425 collections, and this is what takes the compute time. Does that indicate the part of the code that could be parallelized? Please let me know if you would prefer the full code and input file, and I will send them your way.

Thanks very much! Ben

eriqande commented 4 years ago

Thanks, Ben. That is super helpful.

I did a few tests, and it is better to parallelize it over the different "100% scenarios" than to parallelize over the iterations of any particular scenario.
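Conceptually, that change amounts to something like this (a simplified sketch, not the actual package code; run_one_scenario() is a stand-in for the internal per-scenario simulation, not a real rubias function):

library(parallel)

# each 100% scenario is an independent unit of work, so the scenarios are
# farmed out to mc.cores forked processes
results <- mclapply(
  all_collection_scenario,
  function(scen) run_one_scenario(reference = rubias_base, scenario = scen, reps = 100),
  mc.cores = 20
)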

On a Linux node with 20 cores, I saw a 13.6X speedup. That is not bad. It would bring your 35-hour run down to about 2.6 hours.

I've developed this in a separate branch. You can get it with:

devtools::install_github("eriqande/rubias", ref = "mclapply-assess-reference-loo")

Although I guess that is more appropriately done like this:

remotes::install_github("eriqande/rubias", ref = "mclapply-assess-reference-loo")

The remote-repository parts of devtools have since been spun off into the remotes package.

I'm attaching an R notebook (as a PDF file) with an example. Pretty much, if you want to use 20 cores on your computer, you use something like:

assess_reference_loo(
  reference = chinook,
  gen_start_col = 5,
  reps = 80,
  mixsize = 50,
  alpha_collection = hundy_coll_list,
  mc.cores = 20  
)

The only change is the addition of mc.cores = 20 in there.
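If you are not sure how many cores to ask for, something like this is a reasonable way to pick a value:

# use all detected cores but one, leaving a core free for the rest of the system
n_cores <- max(1, parallel::detectCores() - 1)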

test-mclapply-on-cluster.pdf

eriqande commented 4 years ago

Hey Ben, I just put a new branch up; see the previous comment. It should reduce your run time a fair bit.

eric


bensutherland commented 4 years ago

Hi Eric, This works great and has drastically reduced our run times. Thank you for your efforts on this! All the best, Ben

eriqande commented 4 years ago

Just dropping in here to say that I need to pull mclapply-assess-reference-loo into master at some point.