colepoirier opened this issue 2 years ago
For the ECS, there are two relevant bits:
Another thing to be aware of is that WebAssembly memory objects that are shared cannot be resized. They must declare an "initial" and "maximum" size.
There's good discussion of some of the quirks that introduces in this thread: https://github.com/WebAssembly/design/issues/1397
Thanks for the heads up! Oh boy is that ever a can of worms; I think I’ll defer the wasm memory stuff, keeping this as only a wasm multithreading MVP, and hopefully let someone else come up with a strategy for dealing with that. I will definitely add it to #4279, as this is a pretty big thing to keep in mind and investigate as we work on bevy’s Web UX story.
Since SharedArrayBuffer requires some cors headers, I made a replacement for basic-http-server that allows setting headers. Might find it useful here for testing and running the examples when work on this resumes. https://crates.io/crates/http-serve-folder
Example with the headers needed for SharedArrayBuffer:
cargo install http-serve-folder
http-serve-folder --header "Cross-Origin-Opener-Policy: same-origin" --header "Cross-Origin-Embedder-Policy: require-corp" wasm/
This is really nice, thanks for sharing it here!
What's the state on this? Has there been progress since it was put on hold half a year ago?
No real progress. No one is too motivated to do anything, since the memory model for SharedArrayBuffer is going to make it very hard to work with the ECS.
Reading through the discussion from Unity devs linked above, it seems that the issue is mainly a blocker for mobile devices.
Also hopeful: the issue had some progress only 2 weeks ago, with what, from my limited understanding, seems to be some kind of wasm equivalent of free(). This means it should be possible to shrink memory again?
From how I am reading it, there really isn't much blocking multithreaded wasm for desktop?
I spent the last few hours and wrote some very sloppy code that shows a few key areas Bevy needs to change to get Wasm multithreading support: https://github.com/kettle11/bevy/commit/c8c2eb51a872f18ede40bbef7055d9b45b29acb6
The code spawns web workers instead of threads and appears to almost work. Sometimes it will run for a few moments with multiple threads and successfully not crash! These issues prevent it from fully working:
cpal would also need to be updated, because the above commit crashes on some cpal code. I commented out the generated JS that was crashing while testing.
The next issue, which I'm not sure how to resolve, is that Bevy crashes pretty quickly due to async-executor attempting to lock the main thread. That's forbidden in the browser. Something in the design of Bevy or async-executor would need to be changed.
I'm sure there are other issues once those two are resolved. That said, in my opinion there's no hard technical barrier preventing Bevy from being multi-threaded on web.
Snippets of the above code were taken from this blog-post: https://www.tweag.io/blog/2022-11-24-wasm-threads-and-messages/
The second issue that you found could be worked around by running Bevy on a web worker itself, using OffscreenCanvas for the rendering and using postMessage to forward keyboard and mouse events from the main event loop.
Would adding that overhead to input be noticeable?
There is a good write-up about the performance of postMessage here: https://surma.dev/things/is-postmessage-slow/
It depends on the size of the payload, but it seems anything up to 10 KB would take less than a millisecond.
I assume that less than a millisecond is acceptable latency for input, but yeah, it's another addition to latency. Input latency today is horrible compared to the days before USB.
So percentage- and usage-wise I doubt it will be very noticeable at all :)
I tried to put the entire Bevy app in a web worker, to try to solve the issue with async-executor. But then I ran into issues due to what I guess is winit trying to do things with document that aren't allowed from within a web worker.
So maybe putting only async-executor into a web worker could be a feasible next solution.
Also, since some server headers are required for web worker functionality, as in @kettle11's run script, and I could not get his dev server to work (some kind of dependency issue), I rolled my own bloated dev server based on Rocket. It can be found here: https://github.com/TotalKrill/devserv
Rebased @kettle11's work to see if anything had changed with all the changes in bevy main. It doesn't crash, but it doesn't do anything after initializing all the workers either. Could not see anything in the logs...
Heres the rebased branch if anyone is curious: https://github.com/TotalKrill/bevy/tree/wasm_multithread
I agree with https://github.com/bevyengine/bevy/issues/4078#issuecomment-1472634600. I don't know whether resizing shared memory was truly a problem in 2021, but it certainly isn't now. Shared memory has an initial size, a maximum size, and is growable up to the maximum. There are zero functional limitations compared to non-shared memory, and the API is almost identical except that shared memory must have a maximum size set. Any single-threaded Bevy/wasm app that currently exists already has a maximum size. Wasm-bindgen sets one unconditionally, with a default maximum of 1GiB and an absolute max of 4GiB for wasm32 of course. (I detail how to change the maximum below.) The only practical difference is when (not if) you see an error when you try to set the maximum very high in a 32-bit browser. More details below if you want to learn more.
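As an illustration (not from the thread itself): with current Rust tooling the shared-memory maximum can be raised through a wasm-ld linker argument; the 2 GiB value below is only an example.

```
# Sketch: raise the wasm memory maximum via wasm-ld's --max-memory flag
# (value in bytes; 2 GiB shown here as an example, 4 GiB is the wasm32 ceiling).
RUSTFLAGS='-C target-feature=+atomics,+bulk-memory -C link-arg=--max-memory=2147483648' \
  cargo build --target wasm32-unknown-unknown --release
```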
The only thing missing from the story in 2021 would have been browser support. Safari implemented shared memory in late 2021, and since then all major browsers support it (with the HTTP headers, of course).
The things to be concerned about remain spinning up workers at all, avoiding locking the main thread, the other things listed in wasm-bindgen's section on the caveats, and any other DOM objects or Web APIs that can't be used from a web worker (like HtmlCanvasElement directly instead of via OffscreenCanvas). Regarding canvases specifically, winit 0.29 beta supports the main thread + web workers scenario (https://github.com/rust-windowing/winit/pull/2778 and https://github.com/rust-windowing/winit/pull/2834) and Bevy will be able to take advantage of that if it can ship an OffscreenCanvas to a renderer web worker.
In summary, Bevy multithreading on wasm is probably much closer than it has seemed.
Wgpu/Bevy's renderer isn't thread-safe on wasm (wgpu just used to lie about it before wgpu 0.17, because wasm threading wasn't really a thing; it's thread-safe on native, just not on wasm).
If you want to test whether your multithreading actually works with the renderer on wasm, you need to remove the "fragile-send-sync-non-atomic-wasm" feature here https://github.com/bevyengine/bevy/blob/de8a6007b7df5bd961511cd321344157fb4b531f/crates/bevy_render/Cargo.toml#L63, fix all the errors (without breaking threading on native backends), then see if it works.
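A minimal sketch of what removing that feature looks like in bevy_render's Cargo.toml (the surrounding lines are illustrative, not the actual file contents):

```toml
# Sketch only: the real dependency block in bevy_render differs.
[dependencies.wgpu]
version = "0.17"
# Removing this feature makes wgpu types honestly !Send/!Sync on wasm,
# surfacing every place Bevy currently assumes the renderer is thread-safe:
# features = ["fragile-send-sync-non-atomic-wasm"]
features = []
```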
It sounds like it might be possible to run the renderer in a web worker and have it work (as long as you pin it to that web worker and don't reference its resources from other threads)?
edit: Tracking issue for renderer https://github.com/bevyengine/bevy/issues/9304
With https://github.com/bevyengine/bevy/pull/12205 a first step towards getting Bevy multithreaded on web has been merged. Now it is possible to build a Bevy project with multithreading enabled, even if Bevy internals are not yet multithreaded.
A short guide for how to try it out yourself:
```sh
set -e
RUSTFLAGS='-C target-feature=+atomics,+bulk-memory' \
  cargo build --example breakout --target wasm32-unknown-unknown -Z build-std=std,panic_abort --release
wasm-bindgen --out-name wasm_example \
  --out-dir examples/wasm/target \
  --target web target/wasm32-unknown-unknown/release/examples/breakout.wasm
```
devserver specifically is not required; any server that can set these CORS headers can be used:
```sh
devserver --header Cross-Origin-Opener-Policy='same-origin' --header Cross-Origin-Embedder-Policy='require-corp' --path examples/wasm
```
Now, to run some work on another thread, you can use a crate like wasm_thread.
```rust
use std::time::Duration;
use wasm_thread as thread;

thread::spawn(|| {
    for i in 1..3 {
        log::info!("hi number {} from the spawned thread {:?}!", i, thread::current().id());
        thread::sleep(Duration::from_millis(1));
    }
});
```
Important: The browser forbids blocking on the main thread, so take care to never call any code on the main thread that will block / wait on another thread. If you absolutely need a workaround you can busy-loop instead, as Rust's memory allocator itself does.
You can also use crates like rayon to automatically parallelize iterators. For that, look to the wasm-bindgen-rayon crate and then use rayon like normal.
Note that the blocker on getting async-executor to properly initialize on multithreaded wasm should be resolved with https://github.com/smol-rs/async-executor/pull/108.
Any progress on that?
UPDATE: This is on hold while TaskPool and Scope are reworked.

Motivation

Currently, Bevy can only run single-threaded on WebAssembly. Bevy's architecture was carefully designed to enable maximal parallelism so that it can utilize all cores available on a system. As of about six months ago, the stable versions of all browsers have shipped the web platform features needed to accomplish this (SharedArrayBuffer and related CORS security functionality). I think now is a good time to attempt to make Bevy run natively in the browser like it does on the desktop: fully multithreaded.

There are three distinct tasks that will enable the accomplishment of this goal:

Create versions of task_pool::{Scope, TaskPool, TaskPoolBuilder} that run on wasm, called wasm_task_pool::{Scope, TaskPool, TaskPoolBuilder}, and use those instead of the single_threaded_task_pool::{Scope, TaskPool, TaskPoolBuilder} (TODO: create issue and link here)

NOTE: as outlined below in the "Insights provided by developers who have tried to make things that run multithreaded on wasm" section, if we need to do something where we cannot use wasm-bindgen, we will need to manually set the stack pointer in our code, because this is one of the things wasm-bindgen does. @kettle11 has put this functionality into a tiny crate: https://github.com/kettle11/wasm_set_stack_pointer
Background on SharedArrayBuffer

There is a good reason that Bevy, and many of the existing projects that run on wasm, only run single-threaded. Shortly after the initial introduction of the SharedArrayBuffer web API - which would allow true unix-like pthread-style multithreading using wasm in the browser - the Spectre exploit was discovered. Because SharedArrayBuffer is a wrapper around shared memory, it was a particularly large vector for Spectre-style exploitation. In order to maintain their strong sandboxing security model, browsers decided to disable the feature while a proper solution was developed. Unfortunately, this eliminated the functionality necessary for true multithreading on wasm. What existed in the interim was a much slower emulation of threads using WebWorker message passing.
Thankfully, as of about six months ago all browsers have re-enabled a redesigned and secure version of SharedArrayBuffer. According to the article "Using WebAssembly threads from C, C++ and Rust" on the Chrome developers blog (https://web.dev/webassembly-threads/), true pthread-style multithreading is now possible on wasm in all browsers, with the small caveat that users may need to write a small specialized JavaScript stub to get it working exactly in the manner they need. Given that it has been stable for this long, and that some Chrome developers have even published a GitHub repository with an implementation of this for rayon using wasm-bindgen, I think now is a good time to investigate how to make Bevy run natively in the browser like it does on the desktop, and try implementing this to see if it will actually work.
Insights provided by developers who have tried to make things that run multithreaded on wasm
@kettle11 provided some good insights into quirks and solutions to multithreaded wasm on discord here on 19 November 2021:
""" In the past I got AudioWorklet based audio working with multithreaded Rust on web. It's certainly possible.
When working with wasm-bindgen it requires some messy code because wasm-bindgen uses the Javascript API TextDecoder which isn't supported on AudioWorklet threads. The way I got around that is by not using wasm-bindgen on the AudioWorklet thread, but that requires a few hacks:
Scanning the Wasm module imports and importing stub functions that do nothing for every Wasm-bindgen import. This is OK because the audio thread can be made to be pretty simple and avoid doing direct wasm-bindgen calls.
Allocating a stack and thread local storage for the worker. wasm-bindgen's entry-point does this normally, but wasm-bindgen's entry point also calls the main function which we don't want for the AudioWorklet thread. So we need to use our own entry point and manually set up the stack / thread local storage.
I opened a wasm-bindgen issue about the TextDecoder thing about a year ago: https://github.com/rustwasm/wasm-bindgen/issues/2367
Also wasm-bindgen solves the "how do we set the stack pointer?" issue by preprocessing the Wasm binary and inserting the stack allocation code, but I found a way to do it without that which I put together into a tiny crate: https://github.com/kettle11/wasm_set_stack_pointer """
Resources