denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License
98.12k stars 5.4k forks

Slow message passing between main and worker (4x slower than Node) #11561

Open mpizenberg opened 3 years ago

mpizenberg commented 3 years ago

For a basic ping-pong, sending and receiving 100_000 messages in succession, Node takes about 1 s and Deno takes about 4 s for me. I was not expecting that much of a difference. Here is basically the content of deno_worker.mjs:

// deno_worker.mjs
self.onmessage = (msg) => {
  self.postMessage(msg.data);
};

And here is the content of deno_main.mjs:

// deno_main.mjs
let msgCount = 100000;
console.log("Sending", msgCount, " messages ...");

let worker = new Worker(new URL("./deno_worker.mjs", import.meta.url).href, {
  type: "module",
});

worker.onmessage = (msg) => {
  if (msg.data >= msgCount) {
    worker.terminate();
    Deno.exit();
  } else {
    worker.postMessage(msg.data + 1);
  }
};

worker.postMessage(0);

You can find the exact files I'm using in that gist. I'm running the deno code and the node code with respectively:

time deno run --allow-read deno_main.mjs
time node node_main.mjs
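The node_main.mjs counterpart isn't shown in the thread; a rough single-file equivalent (a hypothetical sketch using Node's worker_threads, with the echo worker inlined via eval instead of a separate file) might look like:

```javascript
// Hypothetical node_main.mjs counterpart, collapsed into one file.
import { Worker } from "node:worker_threads";

// The echo worker, inlined so no separate worker file is needed.
const worker = new Worker(
  `const { parentPort } = require("node:worker_threads");
   parentPort.on("message", (n) => parentPort.postMessage(n));`,
  { eval: true },
);

const msgCount = 100_000;
console.log("Sending", msgCount, "messages ...");

const start = Date.now();
const elapsed = await new Promise((resolve) => {
  worker.on("message", (n) => {
    if (n >= msgCount) {
      resolve(Date.now() - start);
      worker.terminate();
    } else {
      worker.postMessage(n + 1);
    }
  });
  worker.postMessage(0);
});

console.log(`${msgCount} round trips in ${elapsed} ms`);
```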

My system:

  • archlinux, kernel 5.13.4
  • node v14.15.5
  • deno 1.12.1

nhrones commented 3 years ago

I've observed the same comparing Deno to Chrome: a 25 ms round trip in Deno vs 2 ms in Chrome.

////////////////// Chrome Worker Test \\\\\\\\\\\\

///////////// worker.js \\\\\\\\\

self.onmessage = (params) => {
    let { id, start, workerText } = params.data
    workerText = `ts: ${(Date.now() - start).toString().padStart(3,"0")} ms  in worker${id}`
    self.postMessage({ id, start, workerText })
    self.close()
}

///////////// index.html \\\\\\\\\

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <title>Fetch</title>
</head>

<body>
    <h1>test</h1>
    <pre id=log>logger...</pre>
</body>
<script>
    let logger = null
    const init = (el) => logger = el
    const log = (that, ui = true) => {
        if (ui) logger.textContent += that + `\n`
        else console.log(that)
    }
    let start = 0
    let workerText = ''
    const test = () => {
        start = Date.now()
        log('\n************* start of main ***************')
        for (let idx = 1; idx < 11; idx++) {
            log(`${(Date.now() - start).toString().padStart(3, "0")} ms, created worker${idx}`)
            const worker = new Worker("./worker.js");
            worker.addEventListener('message', message => {
                workerText += message.data.workerText
                log(`${(Date.now() - start).toString().padStart(3, "0")} ms, received <<"${message.data.workerText}" from worker${message.data.id}>>`)
            })            
            worker.postMessage({ id: idx, start: start, workerText: workerText });

            queueMicrotask(() => {
                log(`${(Date.now() - start).toString().padStart(3, "0")} ms, executing queued microtask#${idx}`)
            });
            log(`${(Date.now() - start).toString().padStart(3, "0")} ms, queued microtask#${idx}`)
        }
        log(`workerText: ${workerText}`)
        log('************** end of main ***************')
    }

    window.onload = () => {
        init(document.getElementById("log"))
        test()
    }
</script>

</html>

////////////// result \\\\\\\\\\\

************* start of main ***************
000 ms, created worker1
000 ms, queued microtask#1
000 ms, created worker2
000 ms, queued microtask#2
000 ms, created worker3
001 ms, queued microtask#3
001 ms, created worker4
001 ms, queued microtask#4
001 ms, created worker5
001 ms, queued microtask#5
001 ms, created worker6
001 ms, queued microtask#6
001 ms, created worker7
001 ms, queued microtask#7
001 ms, created worker8
002 ms, queued microtask#8
002 ms, created worker9
002 ms, queued microtask#9
002 ms, created worker10
003 ms, queued microtask#10
************** end of main ***************
003 ms, executing queued microtask#1
003 ms, executing queued microtask#2
003 ms, executing queued microtask#3
003 ms, executing queued microtask#4
004 ms, executing queued microtask#5
004 ms, executing queued microtask#6
004 ms, executing queued microtask#7
006 ms, executing queued microtask#8
006 ms, executing queued microtask#9
006 ms, executing queued microtask#10
040 ms, received <<"ts: 040 ms in worker1" from worker1>>
056 ms, received <<"ts: 054 ms in worker2" from worker2>>
063 ms, received <<"ts: 061 ms in worker3" from worker3>>
072 ms, received <<"ts: 071 ms in worker4" from worker4>>
082 ms, received <<"ts: 082 ms in worker6" from worker6>>
101 ms, received <<"ts: 099 ms in worker5" from worker5>>
103 ms, received <<"ts: 099 ms in worker7" from worker7>>
112 ms, received <<"ts: 111 ms in worker8" from worker8>>
118 ms, received <<"ts: 118 ms in worker9" from worker9>>
123 ms, received <<"ts: 122 ms in worker10" from worker10>>

////////////////// DENO Worker test \\\\\\\\\\\\\\

//////////////// worker.js \\\\\\\\\\\

self.onmessage = (params) => {
    let { id, text, start } = params.data
    console.log(`    ${Date.now()-start} in worker${id} relaying, ${JSON.stringify(text)}`)
    text = 'text2'
    self.postMessage({id: id, text: text})
    self.close()
}

//////////////// main.js \\\\\\\\\\\

let start = 0
start = Date.now()
console.log('\n************* start of main ***************')
for (let idx = 1; idx < 10; idx++) {   
    let worker = new Worker(new URL("./worker.js", import.meta.url).href, { type: "module" })
    worker.addEventListener('message', message => {
        let text = message.data.text
        let id = message.data.id
        console.log(`${Date.now() - start} received "${text}" from worker${id}`)
    })
    console.log(`${Date.now() - start} created worker${idx}`)
    worker.postMessage({ id: idx, start: start, text: 'text1'});
    queueMicrotask(() => {
        console.log(`${Date.now() - start} executing queued microtask#${idx}`)
    });
}
console.log('************** end of main ***************')

////////////////// deno result \\\\\\\\\\\\

************* start of main ***************
32 created worker1
    37 in worker1 relaying, "text1"
60 created worker2
    66 in worker2 relaying, "text1"
89 created worker3
    97 in worker3 relaying, "text1"
126 created worker4
    132 in worker4 relaying, "text1"
155 created worker5
    161 in worker5 relaying, "text1"
184 created worker6
    191 in worker6 relaying, "text1"
216 created worker7
    223 in worker7 relaying, "text1"
249 created worker8
    256 in worker8 relaying, "text1"
281 created worker9
************** end of main ***************
282 executing queued microtask#1
283 executing queued microtask#2
283 executing queued microtask#3
284 executing queued microtask#4
285 executing queued microtask#5
286 executing queued microtask#6
287 executing queued microtask#7
288 executing queued microtask#8
290 executing queued microtask#9
    292 in worker9 relaying, "text1"
294 received "text2" from worker1
295 received "text2" from worker2
296 received "text2" from worker3
296 received "text2" from worker4
297 received "text2" from worker5
299 received "text2" from worker6
300 received "text2" from worker7
302 received "text2" from worker8
304 received "text2" from worker9
F3n67u commented 3 years ago

Can reproduce on my computer.

% time deno run --allow-read deno_main.mjs
Sending 100000  messages ...
deno run --allow-read deno_main.mjs  5.79s user 1.08s system 139% cpu 4.913 total
% time node node_main.mjs
Sending 100000  messages ...
node node_main.mjs  1.46s user 1.13s system 112% cpu 2.311 total

System Info: macOS 11.2.3, deno 1.12.2, node v15.5.1

F3n67u commented 3 years ago

> For a basic ping-pong like sending and receiving in succession 100_000 messages, Node is taking 1s and Deno is taking 4s for me. [...] I'm running the deno code and the node code with respectively:
>
>     time deno run run --allow-read deno_main.mjs
>     time node node_main.mjs
>
> My system:
>
>   • archlinux, kernel 5.13.4
>   • node v14.15.5
>   • deno 1.12.1

BTW, there is a redundant run: time deno run run --allow-read deno_main.mjs -> time deno run --allow-read deno_main.mjs

mpizenberg commented 3 years ago

> BTW, there is a redundant run: time deno run run --allow-read deno_main.mjs -> time deno run --allow-read deno_main.mjs

Whoops, thanks @F3n67u, I fixed the comment.

bartlomieju commented 3 years ago

@andreubotella any ideas what might be causing such slow downs?

lucacasonato commented 3 years ago

@bartlomieju We do at minimum 4 v8-js hops for a transfer:

I think the slowdown is a combination of the byte array copy in op_postmessage, and Deno.core.serialize and Deno.core.deserialize being separate calls.

I will run some benchmarks and a flame graph on this once #11572 lands.
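As an aside, one way to isolate the per-message serialization cost without spawning a worker at all is to ping-pong over a MessageChannel, which exercises the same structured-clone serialize/deserialize path. A hypothetical sketch, assuming a runtime with a global MessageChannel (Deno, Node >= 15, browsers):

```javascript
// Hypothetical micro-benchmark: round-trip small messages over a
// MessageChannel — same structured-clone path as worker postMessage,
// but with no worker thread involved.
const { port1, port2 } = new MessageChannel();
const rounds = 10_000;

const start = Date.now();
const elapsed = await new Promise((resolve) => {
  port2.onmessage = (e) => port2.postMessage(e.data); // echo side
  port1.onmessage = (e) => {
    if (e.data >= rounds) {
      resolve(Date.now() - start);
    } else {
      port1.postMessage(e.data + 1);
    }
  };
  port1.postMessage(0);
});

port1.close();
port2.close();
console.log(`${rounds} round trips in ${elapsed} ms`);
```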

mpizenberg commented 3 years ago

Hi @lucacasonato, did you have any time to look at this again after #11572?

lucacasonato commented 3 years ago

I have not managed to take a look at this yet. It is still on my back-burner though.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

mpizenberg commented 3 years ago

Hey stale bot ^^. I don't think this is actually stale, but rather taking some time. I'll let @lucacasonato chime in regarding the state of this.

ancoron commented 1 month ago

Just out of curiosity: does it make a difference if I'm sending a Uint8Array message vs. an object?
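For context on that question: with the web Workers API, a plain postMessage structured-clones the Uint8Array's bytes, while passing its buffer in the transfer list hands it over without a copy and detaches it in the sender. A minimal sketch, assuming a runtime with a global MessageChannel (Deno, Node >= 15, browsers):

```javascript
// Sketch: structured clone copies bytes; a transfer list detaches instead.
const { port1, port2 } = new MessageChannel();

const cloned = new Uint8Array(1024);
port1.postMessage(cloned);                // structured clone: bytes copied
console.log(cloned.byteLength);           // 1024 — sender's copy untouched

const moved = new Uint8Array(1024);
port1.postMessage(moved, [moved.buffer]); // transfer: zero-copy hand-off
console.log(moved.byteLength);            // 0 — detached on the sender side

port1.close();
port2.close();
```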

mpizenberg commented 1 month ago

Hey, to be honest I just abandoned trying to look into or fix this, considering the lack of interest in my issue from the Deno project. So I won't look into this further.

ancoron commented 1 month ago

Hehe, OTOH the internal copy done by op_postmessage can be (ab)used to avoid having to copy in JS client code when passing larger messages.

@lucacasonato Are there any items on the roadmap in the foreseeable future to optimize op_postmessage?

I am currently evaluating some options to implement faster logging, borrowing ideas from pino and using a dedicated worker to handle the actual output. This trick really seems to speed things up a lot compared to the simple log buffer I was using before.
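A minimal sketch of that pattern (hypothetical names, using Node's worker_threads with an inline eval worker as the output sink — not pino's actual implementation):

```javascript
// Hypothetical pino-style logging sketch: the hot path only does a
// postMessage; the dedicated worker handles formatting and actual I/O.
import { Worker } from "node:worker_threads";

const sink = new Worker(
  `const { parentPort } = require("node:worker_threads");
   parentPort.on("message", (rec) => {
     if (rec === null) { parentPort.close(); return; } // shutdown signal
     process.stdout.write(JSON.stringify(rec) + "\\n");
   });`,
  { eval: true },
);

// Cheap on the caller side: no formatting, no I/O, just a message.
const log = (level, msg) => sink.postMessage({ t: Date.now(), level, msg });

log("info", "hello from the main thread");
log("warn", "all output happens in the worker");

sink.postMessage(null); // ask the sink to shut down
const exitCode = await new Promise((resolve) => sink.on("exit", resolve));
```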