GoogleChromeLabs / comlink

Comlink makes WebWorkers enjoyable.
Apache License 2.0
11.3k stars, 386 forks

Worker Pool for similar tasks #657

Open QuisMagni opened 7 months ago

QuisMagni commented 7 months ago

I would like to use a worker pool to delegate tasks to multiple workers with the same interface.

Without Comlink: When a new task arrives, the pool selects a worker that is currently idle and assigns the task to it. While a worker is processing a task, I mark it as "busy"; when the work is done, I mark it as "idle" again. This works fine, but I have to write my own message protocol, which ends up as long switch-case statements.

With Comlink: The idea is to request a "free" worker from the worker pool (using a Promise), execute an async function, and process the result in the form of a return value. I can mark the requested worker as "busy", but I have trouble marking it as "idle" after the value has been returned. Unfortunately, I have not found a way to determine whether, or how many, "messages" are currently being processed by a worker, that is, whether a function is currently being executed via Comlink.
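One possible way to track this on the calling side, sketched under assumptions: the helper name `withInFlightCounter` and its `call`/`state` shape are hypothetical and not part of Comlink. It relies only on the fact that calls through a wrapped worker return promises, so a counter can be incremented before the call and decremented when the promise settles.

```javascript
// Hypothetical helper, not a Comlink API: wraps any object whose methods
// return promises (for example a proxy obtained from Comlink.wrap) and
// counts how many calls are currently pending.
function withInFlightCounter(target) {
  const state = { inFlight: 0 };

  const call = async (method, ...args) => {
    state.inFlight += 1; // the worker is "busy" from here on
    try {
      return await target[method](...args);
    } finally {
      state.inFlight -= 1; // runs on success and on error
    }
  };

  return { call, state };
}
```

With this, a worker could be treated as idle whenever `state.inFlight === 0`. The counter only sees calls that go through `call`, so it says nothing about work the worker starts on its own.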

Do you have any idea on how I could solve the problem?

QuisMagni commented 6 months ago

Just in case somebody else is interested: I ended up using the following concept.

WP - Worker-Pool (n comlink worker instances)

  1. Request side: ask the WP for a worker instance that is not busy and get back a promise for that worker.
  2. WP: marks a non-busy worker as busy and resolves the promise with it.
  3. Request side: do whatever you want with the worker.
  4. Request side: when your work is done, call a release method on the WP to remove the "busy" flag from the worker.
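The four steps above could be sketched roughly as follows. This is a minimal illustration, not the code actually used: the class name `WorkerPool` and the methods `acquire`, `release`, and `run` are hypothetical, and the workers are treated as opaque values (in practice they would be Comlink proxies).

```javascript
// Minimal acquire/release pool. "workers" can be any values, for example
// proxies created with Comlink.wrap(new Worker(...)).
class WorkerPool {
  constructor(workers) {
    this.idle = [...workers]; // workers that are not busy
    this.waiting = [];        // resolve callbacks of pending acquire() calls
  }

  // Steps 1 and 2: returns a promise for a worker and marks it busy
  // by removing it from the idle list.
  acquire() {
    const worker = this.idle.pop();
    if (worker !== undefined) return Promise.resolve(worker);
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  // Step 4: hand the worker to the next waiter, or mark it idle again.
  release(worker) {
    const next = this.waiting.shift();
    if (next) next(worker);
    else this.idle.push(worker);
  }

  // Convenience wrapper for step 3 that cannot forget the release.
  async run(task) {
    const worker = await this.acquire();
    try {
      return await task(worker);
    } finally {
      this.release(worker);
    }
  }
}
```

The `run` wrapper is an addition to the described concept: it puts the release in a `finally` block so a thrown error does not leave a worker marked busy forever.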

This is a somewhat cumbersome approach, but as long as the worker is released back to the pool at the end, it works.

If you need heavy parallel processing and have a bunch of tasks, you can easily use an n-sized worker pool and make sure every available thread is doing as much work as possible (instead of using a distribution algorithm like round robin).
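To illustrate the difference from round robin, here is a small sketch in which each worker pulls the next task as soon as it finishes the previous one, so a fast worker naturally takes on more tasks. The function name `mapWithWorkers` and its signature are hypothetical; `fn(worker, item)` stands in for whatever call is made on the Comlink proxy.

```javascript
// Each worker runs its own loop and pulls the next unprocessed item.
// Results keep the order of the input items.
async function mapWithWorkers(workers, items, fn) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has taken yet

  const lanes = workers.map(async (worker) => {
    while (next < items.length) {
      const index = next++; // claim an item (safe: JS runs this synchronously)
      results[index] = await fn(worker, items[index]);
    }
  });

  await Promise.all(lanes);
  return results;
}
```

With round robin, a slow worker would receive the same number of tasks as a fast one; here it simply claims fewer of them.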

Raolibec commented 3 months ago

I have developed a worker pool (WP) functionality. This WP is relatively simple because my requirements do not involve coordinating multiple WPs.

The main functional design of this WP is as follows:

  1. Using web workers through comlink.
  2. There is only one worker.js file, which can execute as much business logic as possible. This keeps the overall code structure intact while building on comlink.
  3. There is only one asynchronous execution method in worker.js, which is exposed via comlink.expose.
  4. Simulate a synchronous worker to prevent the WP from becoming unusable if the browser does not support workers. In the case of a fatal error, it can fall back to synchronous execution.
  5. Maintain idle and busy arrays and a task queue. Change the worker status via promises. After a worker has been idle for 1.5 seconds, check its status again and automatically close and release resources if it is still idle.
  6. The maximum number of threads can be controlled by modifying a number.
  7. The worker pool is further encapsulated to minimize the negative impact of web workers on the complexity of the business logic, exposing only the parallel_process method externally.

The WP has the following shortcomings:

  1. This WP creates many promises to handle asynchronous operations and intermediate steps. Even your business logic must forcibly return a promise object. If a promise chain is not handled correctly somewhere (e.g., forgetting to return a promise or incomplete error handling), it may cause a thread to get stuck or the main business logic to get stuck. Although this issue can be mitigated by forcibly rejecting with a timeout, I have not implemented this.

  2. In the check_task_queue method, it checks the task queue and the idle worker pool and assigns tasks to idle workers. However, if a worker processes tasks very quickly (or very slowly), it may lead to unbalanced task assignment, especially when the task arrival rate is uneven. I am currently trying to resolve this issue.
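The timeout mitigation mentioned in the first shortcoming could be sketched like this. The helper name `withTimeout` is hypothetical; it only races the task's promise against a timer and does not stop the worker itself, so a pool using it would still have to decide whether to terminate a worker whose task timed out.

```javascript
// Rejects if "promise" has not settled within "ms" milliseconds.
function withTimeout(promise, ms, label = 'task') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```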

I have not yet thought too deeply about other issues. I hope this is useful to you.

WorkerPool.zip

Code in main.js:

import parallel_process from '../worker'

const abc = async function(param1,param2,...){ ..... return ......}
const result = await parallel_process(abc,[param1,param2,...])