statdivlab / corncob

Count Regression for Correlated Observations with the Beta-binomial

parallel processing #114

Open Midnighter opened 3 years ago

Midnighter commented 3 years ago

It seems to me that running differentialTest is an embarrassingly parallel problem. For my current project, a run is too short to justify starting it in the background and coming back for the results later, but too long for interactive work to be pleasant.

As far as I can tell, it is a single for loop that fits all the models. What do you think about parallelizing that? For example, foreach could be a quick replacement.

bryandmartin commented 3 years ago

Another great idea, thanks @Midnighter.

Here's a question for you as someone interested in this feature. Would you prefer:

Basically, I'm wondering which of these designs you think would be more intuitive. If anyone else happens to see this before I implement, feel free to offer your opinion as well.

Midnighter commented 3 years ago

I definitely prefer another parameter on the existing function. I realize this will require some internal restructuring (maybe an opportunity to refactor some of the code).

If you do use foreach to implement this, another option is to let the user register the backend and have differentialTest decide how to run based on that, similar to the following pseudocode.

library(foreach)
library(doParallel)

# user-defined
registerDoParallel(3)

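# within differentialTest
# && short-circuits, so the worker count is only checked once a backend is registered
if (getDoParRegistered() && getDoParWorkers() > 1) {
  # a backend with several workers is registered: run the models in parallel
  foreach(...) %dopar% {
  }
} else {
  # no backend, or a single worker: run the models sequentially
  foreach(...) %do% {
  }
}
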
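
The pseudocode above could be made runnable roughly as follows. This is only a sketch: `fit_all` is a hypothetical stand-in for the loop inside differentialTest, and the loop body is a placeholder rather than a real corncob model fit. It assumes the foreach package is installed. Picking the operator once keeps the loop body from being written twice.

```r
library(foreach)

# Hypothetical helper, NOT part of corncob: runs the loop in parallel only
# if the user has registered a backend with more than one worker.
fit_all <- function(n) {
  use_parallel <- getDoParRegistered() && getDoParWorkers() > 1
  # Select the foreach operator once instead of duplicating the loop body.
  `%op%` <- if (use_parallel) `%dopar%` else `%do%`
  foreach(i = seq_len(n), .combine = c) %op% {
    i^2  # placeholder for fitting the i-th model
  }
}

# No backend is registered in this script, so the loop runs sequentially.
results <- fit_all(5)
```

A user who wants parallel execution would call something like `doParallel::registerDoParallel(3)` before `fit_all()`; nothing else in the calling code changes.
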
bryandmartin commented 3 years ago

Good stuff, thanks a ton! I'll implement it that way.

Midnighter commented 3 years ago

If you were inclined to rework your code to process data with dplyr and purrr rather than for loops, there is also furrr, which offers parallel implementations of the purrr mapping functions. There are many examples out there, especially for model fitting, but I guess it would mean an almost complete rewrite of corncob.
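
To give a rough idea of what the furrr style looks like, here is a minimal sketch. It does not use corncob at all: the built-in `mtcars` data and `lm` are stand-ins for the real data and model, and the worker count of 2 is an arbitrary assumption. It assumes the future and furrr packages are installed.

```r
library(future)
library(furrr)

# Start two background R sessions; the number of workers is an arbitrary choice.
plan(multisession, workers = 2)

# Fit one model per group in parallel. mtcars and lm are stand-ins only.
groups <- split(mtcars, mtcars$cyl)
fits <- future_map(groups, function(d) lm(mpg ~ wt, data = d))

# Return to sequential evaluation and shut the workers down.
plan(sequential)
```

Replacing `future_map` with `purrr::map` gives the sequential equivalent, which is what makes this style easy to switch between the two modes.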

cdiener commented 3 years ago

One gotcha we encountered: with a parallel BLAS backend (such as MKL or OpenBLAS), much of the linear algebra already runs in parallel. Spawning many processes on top of that oversubscribes the CPU and often chokes, so you need to set "OMP_NUM_THREADS" or an equivalent to limit the threads per process for this to be efficient. Also, mclapply might be enough here and requires no additional dependencies.
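
A minimal sketch of that combination, using only the parallel package that ships with R. `solve_one` is a hypothetical placeholder task, not corncob code, and the core count of 2 is arbitrary. Whether setting the variable from inside a running session takes effect depends on the BLAS in use, so exporting it in the shell before starting R is the safer assumption.

```r
library(parallel)

# Ask for one BLAS/OpenMP thread per process. Assumption: the BLAS honours
# OMP_NUM_THREADS; some libraries only read it when they are first loaded.
Sys.setenv(OMP_NUM_THREADS = "1")

# mclapply relies on forking, which Windows lacks, so use one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical placeholder task: solve a small linear system per index.
solve_one <- function(i) {
  a <- diag(3) * i
  sum(solve(a, rep(1, 3)))
}

results <- mclapply(seq_len(4), solve_one, mc.cores = n_cores)
```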