josdejong / workerpool

Offload tasks to a pool of workers on node.js and in the browser
Apache License 2.0

Implement parallel Map & Reduce #40

Open aysyal31 opened 6 years ago

aysyal31 commented 6 years ago

Hi team,

We are using workerpool to execute certain functions over an array of items.

In the current implementation, the full array is passed to every worker for processing. Ideally we would like map & reduce functionality where the array is distributed among the workers in chunks and each worker returns its processed chunk when it finishes.

Any direction on how to achieve this would be great. We are open to contributing the code if required.

josdejong commented 6 years ago

Parallel execution of operations on arrays would be great. This is even one of the bullets in the roadmap:

https://github.com/josdejong/workerpool#roadmap

Help implementing this would be very welcome!

I haven't thought about how something like this can be implemented efficiently. Do you have any ideas already?

I guess we have to create a sort of "ArrayTask" from which workers can pick a single item+task (without removing the task itself from the queue) and report the results back; only when all items are processed do we run the callback of the ArrayTask with the results and remove the ArrayTask from the queue.
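
Until something like that exists inside the task queue, the chunked distribution asked for above can be approximated with the current public API. A rough sketch (chunkedMap and mapChunk are illustrative names, not part of workerpool; the worker-side function must be self-contained because workerpool serializes it as a string, and the squaring is just placeholder per-item work):

var workerpool = require('workerpool');

// Worker-side function: serialized to a string by workerpool, so it must be
// self-contained (no closures). Squares each number as placeholder work.
function mapChunk(chunk) {
    return chunk.map(function (x) { return x * x; });
}

// Illustrative helper (not part of workerpool): split the array into chunks,
// hand each chunk to a worker, then concatenate the partial results in order.
function chunkedMap(pool, items, chunkSize) {
    var chunks = [];
    for (var i = 0; i < items.length; i += chunkSize) {
        chunks.push(items.slice(i, i + chunkSize));
    }
    var promises = chunks.map(function (chunk) {
        return pool.exec(mapChunk, [chunk]);
    });
    return Promise.all(promises).then(function (parts) {
        return [].concat.apply([], parts); // the "reduce" step: flatten back to one array
    });
}

var pool = workerpool.pool();
chunkedMap(pool, [1, 2, 3, 4, 5, 6, 7, 8], 3).then(function (result) {
    console.log(result); // [1, 4, 9, 16, 25, 36, 49, 64]
    return pool.terminate();
});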

joeyrobert commented 5 years ago

I wrote this snippet recently leveraging workerpool; maybe it could be of use:

var workerpool = require('workerpool');

// Runs func on each item in a worker and resolves with all results.
function mapParallel(func, items) {
    var pool = workerpool.pool();
    var promises = items.map(item => pool.exec(func, [item]));
    return Promise.all(promises);
}
josdejong commented 5 years ago

Thanks for sharing, Joey. I see that you create a new pool inside the function. You probably want to destroy it after all promises have resolved to free up resources, or instead create a single shared workerpool instance.
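
For example, a minimal sketch of the first option, which terminates the pool once all promises have settled (error handling kept deliberately simple; func still has to be a self-contained, serializable function):

var workerpool = require('workerpool');

// Variant of the snippet above: create the pool, run all items, and
// terminate the pool once every promise has settled (success or failure).
function mapParallel(func, items) {
    var pool = workerpool.pool();
    var promises = items.map(function (item) {
        return pool.exec(func, [item]);
    });
    return Promise.all(promises).then(
        function (results) { pool.terminate(); return results; },
        function (err) { pool.terminate(); throw err; }
    );
}

The alternative is to keep one pool at module level and reuse it across calls, which avoids spawning new workers on every invocation.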