numbats / numbathack18

Repository for the NUMBAT local hackathon 2018

Parallel programming with a resizable thread pool #10

Open rdpeng opened 6 years ago

rdpeng commented 6 years ago

This is something I've been trying to get going for a long time, so naturally I expect to finish it in two days! The idea is to build a package that allows you to do parallel computations using a dynamically resizable thread pool. Essentially, it's mclapply() but with a different backend.

Typically, with parallel backends you have a fixed set of resources (processors) and you take the number of jobs and divide them more-or-less equally amongst the resources. If the jobs are all very similar and take the same amount of time, this approach works well. If there is heterogeneity in the jobs, then you might want to have a queue of jobs and then send them off to the processors as jobs finish (this is what mc.preschedule = FALSE does in mclapply()).
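To make the difference concrete, here is a rough sketch in Python (not R) that only simulates the two scheduling strategies and computes the total wall-clock time for each. It assumes the up-front split deals jobs out to workers in round-robin order, and the function names and job durations are made up for illustration; it does not call mclapply() or run anything in parallel.

```python
import heapq

def prescheduled_makespan(durations, n_workers):
    """Static split: job i is assigned up front to worker i mod n_workers."""
    totals = [0.0] * n_workers
    for i, d in enumerate(durations):
        totals[i % n_workers] += d
    return max(totals)

def queued_makespan(durations, n_workers):
    """Dynamic queue: each job goes to whichever worker becomes idle first."""
    free_at = [0.0] * n_workers  # min-heap of times at which workers go idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest idle worker takes the job
        heapq.heappush(free_at, start + d)  # and is busy until start + d
    return max(free_at)

# Hypothetical job durations (arbitrary time units).
uniform = [1, 1, 1, 1, 1, 1]
mixed = [10, 1, 10, 1, 1, 1]

uniform_static = prescheduled_makespan(uniform, 2)
uniform_queued = queued_makespan(uniform, 2)
mixed_static = prescheduled_makespan(mixed, 2)
mixed_queued = queued_makespan(mixed, 2)
```

With the uniform jobs both strategies finish at the same time. With the mixed jobs the static split happens to put both long jobs on the same worker, while the queue lets the second worker pick up the other long job as soon as it is free.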

I'd like to build a system for parallel jobs that have two properties:

  1. the jobs are highly heterogeneous in how long they take
  2. the jobs are slow (take a long time to individually complete)

In addition, I'd like it to work in a cluster environment where many different physical computers may be part of one system.

The advantage of this approach is that you could reallocate resources on the fly if the overall process is taking too long or is finishing faster than expected. So if a process starts with 2 processors, but later you decide you want to dedicate 8 processors to it, you could simply add 6 more processors to the pool while the job is running, and the backend would send jobs to those processors as well.
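As a minimal sketch of that idea, here is what a pool that can grow mid-run might look like, written in Python with threads on a single machine. Everything here (the `ResizablePool` class, `add_workers()`, `slow_square()`) is a hypothetical name invented for illustration and is not the API of any existing package. It only covers adding workers, not removing them, and it does not address the cluster case.

```python
import queue
import threading
import time

class ResizablePool:
    """Workers pull jobs from a shared queue; more workers can join mid-run."""

    def __init__(self, func, jobs, n_workers):
        jobs = list(jobs)
        self.func = func
        self.results = [None] * len(jobs)  # one slot per job, in input order
        self.jobs = queue.Queue()
        for index, job in enumerate(jobs):
            self.jobs.put((index, job))
        self.threads = []
        self.lock = threading.Lock()       # protects self.threads
        self.add_workers(n_workers)

    def _work(self):
        # Each worker keeps taking the next job until the queue is empty.
        while True:
            try:
                index, job = self.jobs.get_nowait()
            except queue.Empty:
                return
            self.results[index] = self.func(job)

    def add_workers(self, n):
        """Grow the pool by n workers; safe to call while jobs are running."""
        with self.lock:
            for _ in range(n):
                t = threading.Thread(target=self._work)
                t.start()
                self.threads.append(t)

    def join(self):
        """Wait for every worker, including ones added while waiting."""
        while True:
            with self.lock:
                alive = [t for t in self.threads if t.is_alive()]
            if not alive:
                return self.results
            for t in alive:
                t.join()

def slow_square(x):
    time.sleep(0.01)  # stand-in for a slow job
    return x * x

pool = ResizablePool(slow_square, range(40), n_workers=2)  # start with 2
pool.add_workers(6)                                        # later grow to 8
results = pool.join()
```

Because the workers share one queue, the new workers need no special handling: they simply start pulling whatever jobs are left. If a job raises an error in this sketch, that worker stops and the job's result slot stays `None`.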

In support of this I've written the queue package, which provides a data structure for communicating between processes, but I haven't gotten much further than that.

rdpeng commented 6 years ago

Here is the repo where I’ve begun some work: https://github.com/rdpeng/rwmparallel