Efficient Atomics.microwait() needs a wasm opcode?

juj commented 6 months ago

Reading the linked Intel manual, it says

It refers to latency, though I understand this effectively means that the CPU core will microsleep for about 140 clock cycles.

The documentation also has this

That states that if the sleep should exceed something on the order of thousands of cycles, then an OS-level primitive would be needed.
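
To make the spin-then-block idea concrete, here is a minimal TypeScript sketch. It is only an illustration under stated assumptions: `Atomics.microwait` is the API proposed in this repository, so it is feature-detected and assumed to be callable with no arguments, and `cpuRelax`, `waitWhileEqual`, `spinLimit` and `timeoutMs` are hypothetical names. Going by the ~140 cycle figure above, a budget of about 8 iterations stays near the thousand-cycle mark before handing over to `Atomics.wait`.

```typescript
// Assumption: Atomics.microwait is only a proposal and may be absent from the
// running engine, so it is feature-detected and called with no arguments.
// The fallback is an empty function, i.e. a plain busy spin.
const proposed = (Atomics as any).microwait;
const cpuRelax: () => void =
  typeof proposed === "function" ? () => proposed.call(Atomics) : () => {};

type WaitResult = "spun" | "ok" | "not-equal" | "timed-out";

// Hypothetical helper: spin briefly while cell[index] === unwanted, then block.
function waitWhileEqual(
  cell: Int32Array,
  index: number,
  unwanted: number,
  spinLimit: number,
  timeoutMs: number
): WaitResult {
  for (let i = 0; i < spinLimit; i++) {
    if (Atomics.load(cell, index) !== unwanted) return "spun";
    cpuRelax();
  }
  // Spin budget exhausted: hand over to the OS-level wait.
  return Atomics.wait(cell, index, unwanted, timeoutMs);
}

const flag = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(flag, 0, 1);
console.log(waitWhileEqual(flag, 0, 0, 8, 10)); // prints "spun"
```

Note that `Atomics.wait` is not allowed on a browser main thread, so the blocking path only applies to workers or to runtimes such as Node.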

This raises the thought that there would be a want for a symmetric Wasm opcode for Atomics.microwait(), one that translates directly to that instruction. If Wasm code naively called out to JS to invoke Atomics.microwait(), I'd expect the language-crossing marshalling to easily cost on the order of tens of thousands of instructions or more (with all the JS typed object semantics, parameter passing, security checks, JIT profiling, etc. taking place).
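
For illustration, the naive call-out path could look roughly like the TypeScript sketch below. The hand-assembled module is hypothetical: it imports a function `env.microwait` and exports a function `once` that does nothing but call that import, so every call to `once` crosses the Wasm-to-JS boundary exactly one time. The counter only shows that the crossing happens; it does not measure its cost. `Atomics.microwait` itself is again feature-detected, since it is only a proposal.

```typescript
// Hypothetical minimal module, assembled by hand:
//   (import "env" "microwait" (func))
//   (func (export "once") call 0)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type 0: () -> ()
  0x02, 0x11, 0x01, 0x03, 0x65, 0x6e, 0x76,       // import module "env"
  0x09, 0x6d, 0x69, 0x63, 0x72, 0x6f, 0x77, 0x61, 0x69, 0x74, // field "microwait"
  0x00, 0x00,                                     // kind func, type 0
  0x03, 0x02, 0x01, 0x00,                         // one local function, type 0
  0x07, 0x08, 0x01, 0x04, 0x6f, 0x6e, 0x63, 0x65, 0x00, 0x01, // export "once" = func 1
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b, // body: call 0; end
]);

// Cast through any so the sketch does not depend on DOM typings being present.
const wasmApi = (globalThis as any).WebAssembly;

let crossings = 0;
const instance = new wasmApi.Instance(new wasmApi.Module(bytes), {
  env: {
    // The JS side of the boundary. Assumption: the proposed function takes
    // no required arguments; it is skipped when the engine lacks it.
    microwait: () => {
      crossings++;
      const fn = (Atomics as any).microwait;
      if (typeof fn === "function") fn.call(Atomics);
    },
  },
});

const once = instance.exports.once as () => void;
for (let i = 0; i < 3; i++) once(); // one Wasm-to-JS crossing per call
console.log(crossings);
```

A dedicated opcode would let the engine emit the pause instruction inline instead of going through an import like this one.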

Would adding a Wasm opcode live under a separate spec, or would it be part of this one?

CC @dschuff @tlively to put this thought on your radar (if it hasn't been already).

tlively commented 6 months ago

> Would adding a Wasm opcode live under a separate spec, or would it be part of this one?

We would need a companion proposal in the Wasm CG to add a microwait instruction, but I expect that to be relatively straightforward. All of the interesting discussion can happen here.

syg commented 6 months ago

Relatedly, it's also reasonable to expect that an Atomics.microwait() call inlined into an optimized loop will get completely different codegen than an unoptimized call.

tlively commented 4 months ago

The WebAssembly shared-everything threads proposal now includes a pause instruction covering this use case. https://github.com/WebAssembly/shared-everything-threads/pull/54

tc39 / proposal-atomics-microwait