Future of wasm2js - GitHub Issues

kripken commented 5 years ago

I'd like to do some work on wasm2js. Specifically, the use case I have in mind is to integrate it with Emscripten + the LLVM wasm backend. Then we can use that, instead of the current fastcomp asm.js backend, as a solution for emitting non-wasm output. This would have several advantages:

Better code: Can benefit from LLVM backend opts, newer LLVM IR opts (since upstream is more up to date), Binaryen opts, and Emscripten wasm-specific opts (metadce). Will still be able to benefit from Emscripten asm.js opts, since those are in passes that run after the backend.
Faster build times: can optionally use wasm object files for fast linking.
No more separate bitcode or object files for two different targets.
Get rid of fastcomp and all the support code for it.

This doesn't need to emit valid asm.js since in practice almost all browsers with asm.js AOT also have wasm anyhow (in fact chrome shipped wasm before asm.js AOT; and firefox did have some releases with just asm.js, but even LTS has had wasm for a while now). So wasm2js is an option here.

Specific work I'd like to do:

Integrate wasm2js with emscripten.
Benchmarking and performance tuning. Should be no slower than current fastcomp asm.js.
Testing (emscripten test suite, almost all features should be supported) and fuzzing.

This may involve changes to the JS emitted by wasm2js, so I wanted to ask how much current wasm2js users care about the form of the output? I know the Rust people have been using wasm2js, but I heard recently they have plans to write something new (which made me sad to hear, but on the other hand fewer users may mean more flexibility in terms of how we evolve wasm2js). cc @fitzgen

dschuff commented 5 years ago

+cc @juj

kripken commented 5 years ago

Some previous relevant discussion: https://github.com/emscripten-core/emscripten/issues/8085

kripken commented 5 years ago

ccing wasm2js/wasm2asm authors for more visibility: @yurydelendik @alexcrichton @dcodeIO @froydnj @tlively

dcodeIO commented 5 years ago

Does this imply that a compiler that doesn't use LLVM/Emscripten won't be able to output a JS version (we don't care about valid asm.js, just something JS) with just Binaryen / BinaryenModulePrintAsmjs anymore? That'd be sad.

tlively commented 5 years ago

I strongly support this direction for wasm2js. Replacing Fastcomp was exactly my goal when I worked on this as an intern in summer 2017, with the motivation of simplifying Rust's dependency on Emscripten.

tlively commented 5 years ago

@dcodeIO, if I understand correctly, that will still work but the JS you get may be different than what it is today. In fact the JS will be better because wasm2js will be feature-complete enough to support all emscripten tests.

kripken commented 5 years ago

Yeah, we definitely don't want to remove use cases people care about - @dcodeIO, thanks for mentioning that you use this code path.

How much do you care about the external API of the JS code emitted? I might want to change it in minor ways. (Aside from that, my plan is to just improve the quality of the JS emitted.)

dcodeIO commented 5 years ago

Not really bound to the external API, as long as it can either be run directly or easily postprocessed. A WebAssembly-ish API, similar to how Wasm works in browsers, would be great though :)

kripken commented 5 years ago

Yeah, a wasm-ish API - almost like a polyfill - is what I was thinking too, heh.
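To make the "wasm-ish API" idea concrete, here is a minimal sketch of one possible shape, assuming the compiled JS is exposed as a factory that takes imports and returns exports. Every name in it (`JsModuleFactory`, `JsInstance`, `addModule`) is hypothetical and is not what wasm2js emits; it only mirrors how `new WebAssembly.Instance(module, imports).exports` is used in browsers.

```typescript
// Hypothetical shapes for illustration; none of these names come from wasm2js.
type Imports = Record<string, Record<string, unknown>>;
type Exports = Record<string, unknown>;

// Stand-in for the compiled JS: a factory that takes imports and returns exports.
type JsModuleFactory = (imports: Imports) => Exports;

// Mirrors the shape of WebAssembly.Instance: construct it, then read `.exports`.
class JsInstance {
  readonly exports: Readonly<Exports>;

  constructor(factory: JsModuleFactory, imports: Imports = {}) {
    this.exports = Object.freeze(factory(imports));
  }
}

// Example "module": adds two numbers and reports the sum through an import.
const addModule: JsModuleFactory = (imports) => {
  const log = imports.env.log as (value: number) => void;
  return {
    add(a: number, b: number): number {
      const sum = (a + b) | 0; // wrap to a 32-bit integer, as i32.add would
      log(sum);
      return sum;
    },
  };
};

// Usage: the caller code looks the same as it would with real wasm.
const logged: number[] = [];
const instance = new JsInstance(addModule, {
  env: { log: (value: number) => { logged.push(value); } },
});
const add = instance.exports.add as (a: number, b: number) => number;
```

The point of this shape is that caller code would not need to know whether it is talking to real wasm or to the JS fallback.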

alexcrichton commented 5 years ago

This all sounds like a great idea to me! FWIW the idea that we might implement wasm2js in Rust was primarily motivated by the fact that its maintainership here seemed to be waning. If it picks up though we're happy to help.

I'd personally agree that asm.js isn't too important at this point for the reasons mentioned, and the only desire we'd have is that wasm2js emits an ES module (as it does today) for inclusion into apps. Eventually we'd like to include this at least as a default option (if not on by default) in pipelines like Webpack.

kripken commented 5 years ago

Thanks @alexcrichton, good to know! Yeah, I intend to make this a major focus for myself personally, and it will definitely be a high priority once emscripten depends on it as a fastcomp replacement.

I'll have to investigate the JS output format issue - for emscripten and AssemblyScript it seems like a "wasm polyfill" approach is better, and for you an ES6 module is. Probably we can implement one in terms of the other or something like that.

bvibber commented 5 years ago

I have a couple concerns about the "wasm polyfill" approach:

Will this work with multiple modules? In ogv.js I load separate modules for each file type, and so may have multiple instances of different demuxers running in the same JS context. They need to not stomp on each other if they each load a polyfill for the WebAssembly namespace.
If the polyfill replaces the global WebAssembly object, other code that's loaded into the web app later may think WebAssembly is available and try to compile and run its own modules, which would presumably fail.

These can probably be resolved by allowing the polyfill to use a custom namespace, and maybe also emitting it as a separate file which can be loaded once.

Note also for my case, JS output mostly targets IE 11 and old versions of Safari and Edge, so I need to avoid ES6 modules.
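The custom-namespace suggestion above could look roughly like the sketch below. It assumes the polyfill is a plain object that can be attached to any host object; `WasmLike`, `installPolyfill`, and the `owner` field are made up for the illustration and are not part of wasm2js or of any existing polyfill. The code deliberately uses no ES6 module syntax.

```typescript
// Hypothetical polyfill object; stands in for a JS-based WebAssembly replacement.
interface WasmLike {
  readonly owner: string; // which library installed it (demo only)
}

type Namespace = Record<string, unknown>;

// Installs the polyfill under `name` on `target`, never overwriting an existing entry.
function installPolyfill(
  target: Namespace,
  polyfill: WasmLike,
  name: string = "WebAssembly",
): WasmLike {
  const existing = target[name];
  if (existing !== undefined) {
    return existing as WasmLike; // something already lives there, leave it alone
  }
  target[name] = polyfill;
  return polyfill;
}

// Two libraries in the same JS context, each with a private namespace object.
const demuxerA: Namespace = {};
const demuxerB: Namespace = {};
installPolyfill(demuxerA, { owner: "A" });
installPolyfill(demuxerB, { owner: "B" });

// Stand-in for the page's global scope; nothing is installed on it, so
// later code that feature-detects WebAssembly there is not misled.
const fakeGlobal: Namespace = {};
```

Because each library passes its own target object, neither one can stomp on the other, and the global scope stays as the browser left it.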

alexcrichton commented 5 years ago

@kripken that sounds great! And yeah definitely agreed that the output format isn't too too important in that we can translate one way or the other as necessary. I think the polyfill approach is probably more flexible because it's how wasm is always used at the fundamental level today!

kripken commented 5 years ago

Thanks @brion, very good points. Yeah, the "polyfill" approach does want to affect the global scope. So it seems like building the polyfill as an extra optional layer on top of the other approach is the better way to go.

kripken commented 5 years ago

I'm starting on this work now. Anyone interested to review the patches?

kripken commented 5 years ago

Looks like the current output is close to an ES6 module (import, export, etc.). For a JS fallback though, we can't assume the VM is new enough to have ES6 module support (and if it does, it likely has wasm anyway). Can I de-ES6 that, or am I missing something?

dcodeIO commented 5 years ago

According to these docs ES6 modules are still experimental in node, so I guess de-ES6-ing is reasonable.

kripken commented 5 years ago

@alexcrichton it looks like you added the ES6 module output stuff for wasm2js - do you remember why? I'd like to replace it for the reasons 2 comments back.

kripken commented 5 years ago

Are people using the C API call BinaryenModulePrintAsmjs? (@dcodeIO?)

dcodeIO commented 5 years ago

Yes, that's behind --asmjsFile, -a currently. I don't know of anyone relying on it, though, except some of our own tests / the n-body benchmark for comparison.

kripken commented 5 years ago

Thanks @dcodeIO - I'll keep it working then, shouldn't be a problem.

dcodeIO commented 5 years ago

I'm totally fine with renaming it or otherwise changing the API ofc, as long as it's still there :)

alexcrichton commented 5 years ago

IIRC the ES6 output was added to align with the esm-integration proposal which defines how to view a wasm module as an ES module. I also figured it's really the only common unit of compatibility in the JS ecosystem, where if ES6 isn't used it's some invented module format which ES6 can compile down to.

Basically it was an attempt at being forward compatible with tooling, while also acknowledging that the output only really works in bundlers today and would require some form of external tooling to process it to be compatible with Node.js

kripken commented 5 years ago

I see, thanks @alexcrichton. Ok, if this is needed for bundlers then I guess we should keep it around. I'll add an option to emit another variant of the glue (will be easier to do that after my current refactoring).

kripken commented 5 years ago

Ok, I'm practically done with correctness here - wasm2js passes the emscripten test suite at all opt levels, and the fuzzer didn't find anything overnight. Looking at optimizations now.

kripken commented 5 years ago

Ok, I'm basically done with wasm2js. It passes almost all tests (see exceptions below), and looks good on code size and perf - it's actually nicely smaller than emscripten's asm.js output in many cases!

Unhandled issues, that may be done as followups if there is need:

Massive switches lead to massively-nested blocks, which JS engines end up hitting parsing limits on. It's quite hard to optimize this, I did some work to pattern-match switches, but there are many variations. I'm not sure how common this is in real-world code, since the wasm backend does break up huge switches if it thinks it should (I only see breakage in the artificial test_bigswitch/test_biggerswitch tests).
Function name mangling: the wasm backend will lower the _ZN* mangling into human-readable names, but those then get mangled into JS, which makes them unreadable again (but in a different way). Options here might be to add an option to stop the backend from doing this, or to actually implement a parser from the human-readable form here in binaryen into the _ZN* form (which sounds... bad).
No source map support. For debugging, wasm should be mostly ok, as this is just on dev machines.
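The switch problem in the first item can be pictured with a small sketch. It assumes a direct lowering in which every branch target of a wasm br_table becomes its own labeled block, so a switch with N targets nests N blocks deep. The code is hand-written TypeScript for illustration, not actual wasm2js output, and the function names are made up.

```typescript
// Direct lowering: one labeled block per branch target, each nested in the next.
function dispatchNested(index: number): string {
  let result = "";
  done: {
    caseDefault: {
      case2: {
        case1: {
          case0: {
            // Plays the role of br_table: jump out of the matching block.
            switch (index) {
              case 0: break case0;
              case 1: break case1;
              case 2: break case2;
              default: break caseDefault;
            }
          }
          result = "zero"; // reached after leaving case0
          break done;
        }
        result = "one"; // reached after leaving case1
        break done;
      }
      result = "two"; // reached after leaving case2
      break done;
    }
    result = "default"; // reached after leaving caseDefault
  }
  return result;
}

// What pattern-matching the nested form back into a flat switch would give.
function dispatchFlat(index: number): string {
  switch (index) {
    case 0: return "zero";
    case 1: return "one";
    case 2: return "two";
    default: return "default";
  }
}
```

With three targets the nesting is harmless, but with thousands of targets the first form grows one level per target, which is where a parser's depth limits would come into play.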

CryZe commented 5 years ago

It looks like wasm2js now generates 114 MiB instead of 155 MiB of JS for my 2 MiB wasm file. If I run uglifyjs on it, it gets minified down to 12 MiB. That's still kind of unfortunate compared to the 4 MiB JS file that got emitted by emscripten. It's quite an improvement over earlier versions of wasm2js / wasm2asm though. However I was also not able to run uglifyjs with either --compress or --mangle as it completely crashes then on a stack overflow for the file.

I think most of the remaining unoptimized JS code is the fact that pretty long variable names are used. Maybe there is an option in wasm2js that I missed?

kripken commented 5 years ago

@CryZe wasm2js does accept optimization flags like other tools, so it's important to run it with something like -O3 or -Os, that can make a huge difference.

Aside from that, it's still good to run a normal JS minifier on it. Emscripten uses its own, and optionally closure: https://github.com/emscripten-core/emscripten/blob/incoming/tools/shared.py#L2644 (Both of those minifiers can scale up to massively large amounts of JS.)

Are you just running wasm2js directly yourself, and not using it from emscripten? Maybe we should improve the docs for that?
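For anyone scripting wasm2js directly, a minimal sketch of building the command line might look like this. It covers only the two optimization flags named above (`-O3` and `-Os`) and assumes the `-o` output flag shared by Binaryen's command-line tools; the helper name `wasm2jsArgs` and the file names are hypothetical.

```typescript
// Only the optimization flags mentioned in this thread.
type OptLevel = "-O3" | "-Os";

// Builds the argument list for a wasm2js run; the tool itself is not invoked here.
function wasm2jsArgs(input: string, output: string, opt: OptLevel = "-O3"): string[] {
  return [input, opt, "-o", output]; // `-o` is assumed to select the output file
}

// Example: arguments for an optimized-for-size build of a hypothetical app.wasm.
const sizeBuildArgs = wasm2jsArgs("app.wasm", "app.wasm.js", "-Os");
```

The resulting array can be handed to something like Node's `child_process.execFileSync("wasm2js", sizeBuildArgs)`, with a JS minifier run on the output afterwards.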

hummeleBop commented 5 years ago

I think asm.js is still relevant, there are several use cases. The KaiOS platform is only able to run asm.js optimizations and cannot compile/run WebAssembly bytecode. Also, it's a low-spec platform and asm.js would be useful to enhance the user experience. Wasm2js should be able to produce asm.js compliant javascript from the MVP webassembly.

kripken commented 5 years ago

@hummeleBop it would be good to know more about KaiOS's status and plans, specifically when they intend to upgrade their JS VM. If you know, or you know someone that does, that would be very useful!

ibaryshnikov commented 5 years ago

@kripken some feedback about wasm2js in our app, tested in chrome

Speed
155ms handwritten js
120ms wasm2js latest (c7e9271, version_87)
120ms wasm2js latest, -O3
185ms wasm2js before (d8bcf64, v1.38.29)

Code size
346k latest
229k latest, -O3
517k v1.38.29

Details
webpack dev build (release build is somehow complicated, but I can send it later if there's a need)
rustc 1.36.0
wasm-bindgen 0.2.48
chrome 75.0.3770.100
wasm-pack 0.8.1

Additionally
wasm2js v1.38.29 prints error Unknown option '-O3' as well as Switching to "almost asm" mode, reason: grow_memory op
Pure wasm time is 55ms (note that js in dev mode and wasm in release mode)
If I use dynamic import to load the tested function, it seems slower (again, it may be related to a dev build, need to do a release one)
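To put the figures above in relative terms, here is a small worked calculation. It uses only the numbers reported in this comment; the helper name `reductionPercent` is made up for the sketch.

```typescript
// Relative reduction from `before` to `after`, in percent, rounded to one decimal.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 1000) / 10;
}

// Run time in ms: v1.38.29 (185) vs latest (120), and handwritten JS (155) vs latest.
const timeVsOldWasm2js = reductionPercent(185, 120);
const timeVsHandwritten = reductionPercent(155, 120);

// Code size in k: v1.38.29 (517) vs latest without and with -O3 (346, 229).
const sizeVsOldNoOpt = reductionPercent(517, 346);
const sizeVsOldWithO3 = reductionPercent(517, 229);

console.log({ timeVsOldWasm2js, timeVsHandwritten, sizeVsOldNoOpt, sizeVsOldWithO3 });
```

On these numbers the latest wasm2js output runs in roughly a third less time than the v1.38.29 output, and with -O3 it is less than half the size.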

kripken commented 5 years ago

Interesting, thanks @ibaryshnikov! Overall the results look good I think.

Is anything used to minify the JS after wasm2js? A standard minifier like terser can improve it a lot (wasm2js doesn't focus on simple stuff normal minifiers do anyhow).

I think older wasm2js didn't have optimization flags yet, which is why there is Unknown option '-O3' there.

ibaryshnikov commented 5 years ago

@kripken without additional tools, just wasm2js. They'll go to webpack and get minified later

WebAssembly / binaryen

Future of wasm2js #1929