wasm2c roadmap ideas - GitHub Issues

keithw commented 2 years ago

Now that wasm2c has almost caught up to the current Wasm spec, maybe it's a good time to brainstorm about the roadmap from here and see what everyone else thinks is useful/worth prioritizing. Here are some possible items and thoughts to get the discussion going:

[ ] A WASI implementation for Unix-ish hosts (PR #2002) -- will be awesome to have this in
[ ] Some sort of continuous performance regression testing. Basically an operationalized/ongoing version of the benchmarks at https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html
[ ] Increased safety when modules link with other modules. Right now it's pretty easy to create a segfault (or worse) by having one module import something by name and another module export something with the same name but an incompatible type, because wasm2c currently makes "optimistic" extern declarations for everything that's imported. We could improve this by making the wasm2c-generated code actually include the header from the imported-from module (so the C compiler will enforce type correctness), instead of making an optimistic declaration in its own header (#1908), or by reintroducing some sort of type mangling.
[x] SIMD support (PR #2119). This is the last remaining piece to get full conformance with WebAssembly 2.0, and it would be kind of cool to say we conform to the whole spec before it's finalized. I think I've done most of the scaffolding work, but still need to implement all those instructions, ideally in a way that spits out generated code that's both (a) inlined to SSE2 intrinsics when available, but also (b) with backup implementations in pure C for everywhere else, and then (c) ideally we'd test both backends in the CI.
[x] Get Mozilla using the upstream wasm2c as part of the Firefox build process. Would probably be great to have a production "customer"; I know they had wanted bulk memory support (which we now have) but I think they probably also depend on a bunch of features in the UCSD/rlbox fork. We could work with Mozilla/UCSD to get them transitioned over to the main branch.
[x] wasm2c "one .c file per function" mode (PR #2146). It takes a huge amount of time and memory for gcc/clang to compile the output of wasm2c for a big program (especially with optimization) because it's just one gigantic C file. But wasm2c's output is so regularly structured that it would be trivial to split it up into a single .c file per function (each including the same .h file as currently). This is probably much more parallelism than is even in the original program. And if the function names remain stable, then with a memoized/hashing build system, it would be possible to change or insert just one function in a gigantic program and 99% of the work of compiling the wasm2c output could be memoized. This would be super-cool. (You might worry about losing opportunities to inline, for which I think the best answers would be (a) we should do the continuous performance monitoring mentioned above, (b) you don't have to use this option, (c) LTO, or (d) hopefully the good inlining opportunities were already taken upstream by the optimizing compiler that produced the .wasm file in the first place.)
[x] Speeding up wasm2c itself on large programs (PR #2171 for a big chunk of this). One approach might be to move away from using BinaryReaderIR (which manifests the entire program in RAM) and create a custom BinaryReaderWasm2C that could process the file in one pass in a streaming manner, just like we have BinaryReaderInterp already. This may be a bit risky because (a) now we have to make sure we hook into the validator everywhere we need to or else badness, and (b) I don't know how much the performance improvement would be, but I think w2c2 suggests this might be a fruitful route.
[ ] Better fuzzing. I don't think we're fuzzing wasm2c right now at all. It would be nice to have a fuzz target in OSS-Fuzz, ideally one that not only checks for safety violations but also tries to find disagreements between wasm2c's output (when compiled and run) and the WABT interpreter.
[ ] Selectable behavior on a per-memory basis about whether OOB is hardware-checked or software-checked. For the main memory of a long-running program, it's much faster to use mprotect and the signal handler to detect OOB on a memory. But for memories that are attached transiently to some random region of memory (to give zero-copy access to a binary blob) and then detached, it's a lot of overhead to have to set up these 8 GiB mmapped/mprotected regions for everything you might want to ever point to. It would be nice to be able to tell CWriter which memories in a module should have explicit (software) OOB checks on load/store and which should rely on the MMU and signal handler. This could be done by... (a) adding a field to the Memory structure in the IR (most convenient, but kind of icky since it's really a wasm2c-specific annotation), or (b) some sort of condition that depends on the debug name of the memory, or (c) maybe something in the WriteCOptions that allows the caller to indicate its preference on a per-memory basis. We're using "a" in our private branch, but if this is of more general interest, happy to find consensus on the best approach in general.
[x] (Added 11/8): Ability to "de-init" an individual module and remove its func types from the runtime, leaving other modules intact? (Effectively done a different way in #2120.)
[ ] (Added 4/5) Add a method to "reinstantiate" ("reset"?) a module instance without having to free and then instantiate a new one.
[ ] Start work on wasm2rust. Not totally serious, but it would be cool if this existed.
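
To make the "explicit (software) OOB checks" option in the per-memory item above concrete, here is a minimal Python model of what such a check does on each load. It is only a sketch of the semantics, not wasm2c's generated C; `Trap` and `checked_load_u32` are names invented for this illustration.

```python
class Trap(Exception):
    """Stands in for a WebAssembly out-of-bounds trap."""


def checked_load_u32(memory: bytearray, addr: int) -> int:
    """Little-endian 32-bit load with an explicit (software) bounds check."""
    if addr < 0 or addr + 4 > len(memory):
        raise Trap(f"out-of-bounds load at {addr}")
    return int.from_bytes(memory[addr:addr + 4], "little")


# A 6-byte buffer standing in for a transiently attached memory.
blob = bytearray(b"\x01\x00\x00\x00\xff\xff")
print(checked_load_u32(blob, 0))   # in bounds
try:
    checked_load_u32(blob, 4)      # bytes 4..7 would run past the end
except Trap as e:
    print("trap:", e)
```

The check costs a compare on every access, but it needs no large reserved address range, which is the trade-off the item describes for short-lived memories.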

sbc100 commented 2 years ago

Thanks Keith! Those all sound like reasonable things (except maybe the last one :)

Oh, I didn't know about w2c2. Link for reference: https://github.com/turbolent/w2c2

Regarding splitting output into multiple C files. I agree that could be a good idea, but I think one file per function might be a little too much. I imagine the compilation time of each source file is very much linear in the number of lines (wasm2c output is fairly uncomplicated). Perhaps we could have some kind of splitting threshold such as: start a new file after N lines?
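
A rough sketch of the "start a new file after N lines" threshold, in Python for brevity. The function names and line counts are made up, and `split_by_line_threshold` is a hypothetical helper, not part of wasm2c.

```python
def split_by_line_threshold(functions, max_lines):
    """Greedily pack (name, line_count) pairs into files.

    A new file is started once the current one has reached max_lines,
    so a file can overshoot the threshold by at most one function.
    """
    files, current, current_lines = [], [], 0
    for name, line_count in functions:
        if current and current_lines >= max_lines:
            files.append(current)
            current, current_lines = [], 0
        current.append(name)
        current_lines += line_count
    if current:
        files.append(current)
    return files


# Hypothetical function names and line counts, for illustration only.
funcs = [("f0", 40), ("f1", 70), ("f2", 10), ("f3", 500), ("f4", 5)]
print(split_by_line_threshold(funcs, max_lines=100))
```

Note that with this greedy scheme, inserting one function near the start can shift every later file boundary.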

kripken commented 2 years ago

Great ideas!

Regarding fuzzing, I fuzzed wasm2c a while ago using the binaryen fuzzer:

https://github.com/WebAssembly/binaryen/blob/5449744d79ec996c7334681ac1b85e5461194dc8/scripts/fuzz_opt.py#L714-L756

The general idea is that the fuzzer emits random valid wasm files, plus instructions for how to run them in various modes. The linked code runs wasm2c with the proper shim (emitted by --emit-wasm2c-wrapper) and gets output that it can then compare to running the wasm in other ways (like in a wasm VM normally). Then it just diffs the output and sees whether any return values or logging are not identical.

This found a few bugs back then (edit: all of which have long been fixed), but I haven't kept it up to date recently. That would be great to do though.
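
The "diff the output" step can be sketched as below. This is an illustration only: the log format is invented and is not what the binaryen fuzzer actually prints, and `first_disagreement` is a hypothetical helper.

```python
def first_disagreement(reference_log, candidate_log):
    """Return (line_number, reference_line, candidate_line) for the first
    differing line, or None if the two logs are identical."""
    ref = reference_log.splitlines()
    cand = candidate_log.splitlines()
    for i in range(max(len(ref), len(cand))):
        r = ref[i] if i < len(ref) else None
        c = cand[i] if i < len(cand) else None
        if r != c:
            return (i + 1, r, c)
    return None


# Made-up logs standing in for the output of the two execution modes.
vm_log = "[call add] => 7\n[call div] => trap\n"
wasm2c_log = "[call add] => 7\n[call div] => 0\n"
print(first_disagreement(vm_log, wasm2c_log))
```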

sbc100 commented 2 years ago

There are also existing fuzzers for parts of wabt at least (e.g. the parsers and validation). See https://github.com/google/oss-fuzz/tree/master/projects/wabt

keithw commented 2 years ago

> Regarding splitting output into multiple C files. I agree that could be a good idea, but I think one file per function might be a little too much. I imagine the compilation time of each source file is very much linear in the number of lines (wasm2c output is fairly uncomplicated). Perhaps we could have some kind of splitting threshold such as: start a new file after N lines?

Agreed this would be better. I'm trying to think of a good way to do the partition that lets most files stay unchanged when only some functions are inserted/removed/modified. (To allow a memoized build to use its cache for 99% of the files.) You wouldn't want the act of adding one function to end up repacking all the .c files and therefore needing to recompile every one...
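
One possible way to get that stability, offered purely as an illustration and not something proposed in this thread, is to pick each function's file from a stable hash of its name. In the sketch below `bucket_for` and `pack_by_hash` are hypothetical helpers and the `w2c_`-style names are made up.

```python
import hashlib


def bucket_for(name, n_buckets):
    """Map a function name to a bucket index using a stable hash of the name."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def pack_by_hash(names, n_buckets):
    """Place every name in the bucket chosen by its own hash."""
    buckets = [[] for _ in range(n_buckets)]
    for name in sorted(names):
        buckets[bucket_for(name, n_buckets)].append(name)
    return buckets


names = [f"w2c_f{i:03d}" for i in range(40)]
before = pack_by_hash(names, 8)
after = pack_by_hash(names + ["w2c_new_function"], 8)
changed = [i for i in range(8) if before[i] != after[i]]
print(changed)  # exactly one bucket index
```

Adding or removing a function touches only its own bucket. The cost is that bucket sizes are uneven, and changing the number of buckets reshuffles everything.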

sbc100 commented 2 years ago

> > Regarding splitting output into multiple C files. I agree that could be a good idea, but I think one file per function might be a little too much. I imagine the compilation time of each source file is very much linear in the number of lines (wasm2c output is fairly uncomplicated). Perhaps we could have some kind of splitting threshold such as: start a new file after N lines?
>
> Agreed this would be better. I'm trying to think of a good way to do the partition that lets most files stay unchanged when only some functions are inserted/removed/modified. (To allow a memoized build to use its cache for 99% of the files.) You wouldn't want the act of adding one function to end up repacking all the .c files and therefore needing to recompile every one...

Would a good-enough solution be to just pack them alphabetically into N buckets (ignoring size)?

Then if you change any one function only that one file would change; adding or removing a function would affect N / 2 files on average. One downside is that several large functions could end up in the same bucket... but it seems like a reasonable first step. We could make it an option, with N == -1 meaning one function per file, so folks could experiment.
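
A small sketch of the alphabetical packing, to show how an insertion ripples through the buckets. The function names are invented, and `pack_alphabetically` / `changed_buckets` are hypothetical helpers; equal-count slicing is just one assumed way to form the N buckets.

```python
def pack_alphabetically(names, n_buckets):
    """Sort names and cut the sorted list into n_buckets near-equal slices."""
    ordered = sorted(names)
    base, extra = divmod(len(ordered), n_buckets)
    buckets, start = [], 0
    for i in range(n_buckets):
        size = base + (1 if i < extra else 0)
        buckets.append(ordered[start:start + size])
        start += size
    return buckets


def changed_buckets(before, after):
    """Count bucket positions whose contents differ between two packings."""
    return sum(1 for b, a in zip(before, after) if b != a)


names = [f"w2c_f{i:03d}" for i in range(40)]
before = pack_alphabetically(names, 8)
# Insert one new function that sorts into the middle of the list.
after = pack_alphabetically(names + ["w2c_f019a"], 8)
print(changed_buckets(before, after))
```

In this run four of the eight buckets change, in line with the N / 2 estimate for an insertion near the middle.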

deian commented 1 year ago

+1 This is a great list!

> Start work on wasm2rust. Not totally serious, but it would be cool if this existed.

It exists (and we used it for some researchy things)! https://github.com/secure-foundations/rWasm

kripken commented 1 year ago

There is also wasm-to-rust though it seems inactive now.

I've been thinking that a higher-level target might also make sense as wasm itself goes in that direction, specifically regarding GC. Wasm to Go/Kotlin/C# etc. could use native objects in the host GC which could have several benefits.

sbc100 commented 1 year ago

> There is also wasm-to-rust though it seems inactive now.
>
> I've been thinking that a higher-level target might also make sense as wasm itself goes in that direction, specifically regarding GC. Wasm to Go/Kotlin/C# etc. could use native objects in the host GC which could have several benefits.

And of course the existing wasm2js could use native JS objects (with fixed/frozen prototypes) https://github.com/WebAssembly/binaryen/blob/main/src/tools/wasm2js.cpp

vshymanskyy commented 2 months ago

Currently, the WASM page size is fixed at 64 KiB, which is rather expensive in some scenarios.

The WebAssembly WG has proposed a new feature to handle this nicely: https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md

Please consider implementing this for wasm2c. This would allow running really tiny wasm modules :raised_hands:

The implementation for wasm3 was really simple
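
To put a number on "rather expensive", here is a small calculation of the minimum memory size when memory can only be allocated in whole pages. The 1000-byte footprint is a made-up example, `min_memory_bytes` is a hypothetical helper, and the 1-byte page size is an assumption about what the linked proposal permits.

```python
WASM_PAGE_SIZE = 64 * 1024  # fixed page size in today's Wasm: 65536 bytes


def min_memory_bytes(needed_bytes, page_size=WASM_PAGE_SIZE):
    """Smallest memory size, in bytes, that holds needed_bytes when memory
    can only be sized in whole pages of page_size bytes."""
    pages = -(-needed_bytes // page_size)  # ceiling division
    return pages * page_size


needed = 1000  # hypothetical data footprint of a tiny module
print(min_memory_bytes(needed))               # 65536
print(min_memory_bytes(needed, page_size=1))  # 1000
```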

WebAssembly / wabt

wasm2c roadmap ideas #2019