josdejong / workerpool

Offload tasks to a pool of workers on node.js and in the browser
Apache License 2.0
2.04k stars 148 forks

Note on usage with Promise.all() #443

Open TimUnderhay opened 4 months ago

TimUnderhay commented 4 months ago

Hello M. de Jong,

Though this is not a request for support per se, I thought it might be helpful to pass along my experience, in case you think it is worth adding a note to the documentation.

I am using workerpool on Node.js for some particularly CPU-intensive tasks over millions of data points. At first, calling exec() or proxy() seemed considerably slower than running the tasks on the main thread. When I debugged, I saw that my calls to exec() or proxy() were being invoked within the worker thread at a rate of only around 10 per second (eyeballing it), which is far too slow.

It then occurred to me that I was calling these from within a Promise.all(). E.g.

Promise.all(
  data.map( // data is an array with millions of data points
    (someVar) => pool.exec('someCommand', [someVar]) // exec() takes its parameters as an array
  )
)

Once I switched to a standard for-loop, the rate of function invocation within the worker thread rose to about where I'd expect it to be (8,000-10,000 per second), roughly three orders of magnitude faster than with Promise.all().
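One way to read the for-loop result is that fewer calls were queued at once. The sketch below is a guess at that idea, not a confirmed explanation: it keeps at most `limit` tasks in flight instead of queuing millions up front. `mapWithLimit` is a hypothetical helper and is not part of workerpool.

```javascript
// Run an async task over many items with at most `limit` calls in flight.
// `mapWithLimit` is a hypothetical helper, not part of workerpool.
async function mapWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to start

  // Each runner keeps pulling the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i], i);
    }
  }

  // Start up to `limit` runners and wait for all of them to drain the list.
  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Usage, assuming `pool` is a workerpool pool whose worker exposes 'someCommand':
// const results = await mapWithLimit(data, 8, (value) => pool.exec('someCommand', [value]));
```

Results come back in input order, and the limit (8 here, an arbitrary choice) would normally be tuned to the number of workers.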

The moral of the story is that invoking exec() or proxy() that many times in quick succession drastically hurts performance, for reasons not entirely clear to me. It works much better with standard loops. In my case, it would probably be better still to pass data to the worker in batches rather than making many single-value invocations.
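The batching idea could look roughly like the sketch below. It assumes the worker exposes a method that accepts an array of data points and returns an array of results; `chunk`, `execInBatches`, and the method name `someCommandBatch` are hypothetical names used only for illustration.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Issue one exec() call per batch instead of one per data point.
// Assumes the worker method takes an array and returns an array of results.
async function execInBatches(pool, method, data, batchSize) {
  const perBatch = await Promise.all(
    chunk(data, batchSize).map((batch) => pool.exec(method, [batch]))
  );
  return perBatch.flat(); // merge per-batch results back into one array
}

// Usage, assuming `pool` is a workerpool pool and the worker registers a
// hypothetical batch method, e.g.
//   workerpool.worker({ someCommandBatch: (batch) => batch.map(someCommand) });
// const results = await execInBatches(pool, 'someCommandBatch', data, 10000);
```

With a batch size of 10,000, a million data points turn into 100 exec() calls, so the per-call messaging overhead is paid far less often. The best batch size depends on the workload and would need measuring.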

Thank you for the library!

josdejong commented 4 months ago

Thanks for sharing.

I'm not sure why using Promise.all would slow down execution 🤔. Please let me know if anyone is interested in diving into this.