andywer / threads.js

🧵 Make web workers & worker threads as simple as a function call.
https://threads.js.org/
MIT License

Is using a worker method better for performance? #454

rigwild closed this issue 5 months ago

rigwild commented 1 year ago

Hi, thanks for this lib, I like it a lot and use it in lots of projects when performance is needed :)

I wanted to ask for some clarification: does calling a worker-exposed method make execution faster for CPU-intensive tasks? Or is it all executed in the worker either way, so the performance is identical?

For example, let's say the multiplier function below does a lot of math internally. Is calling it as a worker method more performant?

Is:

// workers/multiplier.js
import { expose } from "threads"

const multiplier = (a, b) => new Promise(resolve => setTimeout(() => resolve(a * b), 100))

expose({ multiplier })

// index.js
import { spawn, Pool, Worker } from "threads"

const pool = Pool(() => spawn(new Worker("./workers/multiplier")))

for (let i = 0; i < 5464; i++) {
  pool.queue(async ({ multiplier }) => {
    const multiplied = await multiplier(2, 3)
    console.log(`2 * 3 = ${multiplied}`)
  })
}

await pool.completed()
await pool.terminate()

Faster than:

// workers/empty.js
import { expose } from "threads"
expose({})

// index.js
import { spawn, Pool, Worker } from "threads"

const pool = Pool(() => spawn(new Worker("./workers/empty")))
const multiplier = (a, b) => new Promise(resolve => setTimeout(() => resolve(a * b), 100))

for (let i = 0; i < 5464; i++) {
  pool.queue(async () => {
    const multiplied = await multiplier(2, 3)
    console.log(`2 * 3 = ${multiplied}`)
  })
}

await pool.completed()
await pool.terminate()

If the performance is the same, I would suggest allowing the Worker constructor to be called without a path, like:

const pool = Pool(() => spawn(new Worker()))

Thanks!

Maximvdw commented 1 year ago

The worker is a separate "thread" that executes JavaScript and exchanges messages with the main thread. That communication has overhead, so for small tasks, or tasks that exchange a lot of data with the worker, you will get worse performance. For the task you describe you can get better performance, because it's a mathematical operation and I assume the data passed is limited to numbers. The path is needed because the worker executes that file.

In general, you should read up on web workers and think of threads.js as a wrapper around them that removes the need to write your own communication layer.
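
To make the difference concrete, here is a minimal sketch using Node's built-in worker_threads module rather than threads.js, so it runs without dependencies. The helper names (multiplyInWorker, multiplyLocally, demo) are made up for illustration and are not part of any library. It shows that a function defined and called on the main thread runs there, while only work sent to the worker as a message runs on the worker thread. As far as I know, the same applies to the second snippet in the question: the callback passed to pool.queue runs on the thread that queued it, so a multiplier defined in index.js would not be offloaded.

```javascript
const { Worker, threadId } = require("worker_threads")

// Worker body, passed as a string so the sketch fits in one file.
// It multiplies the two numbers it receives and reports which thread did it.
const workerSource = `
  const { parentPort, threadId } = require("worker_threads")
  parentPort.on("message", ({ a, b }) => {
    parentPort.postMessage({ result: a * b, ranOn: threadId })
  })
`

// Hypothetical helper: sends one job to a fresh worker and resolves with its reply.
function multiplyInWorker(a, b) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true })
    worker.once("message", reply => {
      resolve(reply)
      worker.terminate()
    })
    worker.once("error", reject)
    worker.postMessage({ a, b })
  })
}

// Same computation, but defined and called on the calling thread.
function multiplyLocally(a, b) {
  return { result: a * b, ranOn: threadId }
}

// Runs both variants and returns their results side by side.
async function demo() {
  const local = multiplyLocally(2, 3)
  const remote = await multiplyInWorker(2, 3)
  return { local, remote }
}
```

Calling demo().then(console.log) should show the same result from both variants but different ranOn thread ids. Only the remote variant leaves the calling thread free while it computes, and it pays for that with one message in each direction.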