WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0

Is it possible to avoid C++ exceptions? #2917

Open MaxGraey opened 4 years ago

MaxGraey commented 4 years ago

I mean building binaryen with emscripten using "-fno-exceptions" for asm.js / wasm. But I guess other build targets without exceptions would also benefit in size. As far as I know, LLVM (which uses noexcept hints) and GCC avoid exceptions entirely, and that makes sense especially for binaryen.js and binaryen.wasm.

So I'm wondering how hard it would be to refactor the binaryen codebase to eliminate exceptions, or at least to add noexcept hints if that's totally impossible.

kripken commented 4 years ago

I've experimented with this,

https://github.com/WebAssembly/binaryen/commit/8f62072fe1ae743ae607521120f2be9d5d30c712#diff-633311eb1c9f14a0358c77ab7ed93b97

It may be worth it as an option to use setjmp over exceptions for precompute, and aside from that, we just use exceptions for fatal errors, which don't need to be handled. So it might be worth doing.

On the other hand, native wasm exceptions support will arrive which will make this less useful (but still, code size may prefer setjmp).
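For readers unfamiliar with the setjmp option, here is a minimal sketch of the general technique. It is not the code from the linked commit; `nonconstantJump`, `divideOrJump` and `tryPrecompute` are hypothetical names. It only uses trivially destructible locals, since `longjmp` must not skip non-trivial destructors in C++.

```cpp
#include <csetjmp>

// Hypothetical jump target standing in for "this cannot be precomputed".
static std::jmp_buf nonconstantJump;

// Evaluates a / b, jumping out instead of throwing when b is zero.
static int divideOrJump(int a, int b) {
  if (b == 0) {
    std::longjmp(nonconstantJump, 1);
  }
  return a / b;
}

// Returns true and writes the result if evaluation succeeded,
// false if evaluation bailed out through longjmp.
bool tryPrecompute(int a, int b, int* result) {
  if (setjmp(nonconstantJump) != 0) {
    return false; // we arrived here through longjmp
  }
  *result = divideOrJump(a, b);
  return true;
}
```

The bail-out still has to unwind the native stack, so this sketch only shows the native behavior.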

MaxGraey commented 4 years ago

Hmm, interesting. Also, as I understand it, exceptions are used pretty rarely, mostly in tools/* and parsing. Those uses are pretty trivial and I guess could easily be replaced by std::optional or std::expected, based on the expected monad. Long jumping is how exceptions are usually emulated in C, right? But that doesn't help binaryen's js / wasm targets.
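A minimal sketch of what such a replacement could look like, assuming C++17 and `std::optional`. `parseU32` and `parseAndDouble` are hypothetical helpers for illustration, not functions from the binaryen codebase.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical parser helper: returns std::nullopt where a throwing
// version would have raised a parse error.
std::optional<uint32_t> parseU32(const std::string& text) {
  if (text.empty()) {
    return std::nullopt;
  }
  uint64_t value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') {
      return std::nullopt; // not a decimal digit
    }
    value = value * 10 + static_cast<uint64_t>(c - '0');
    if (value > UINT32_MAX) {
      return std::nullopt; // out of range for 32 bits
    }
  }
  return static_cast<uint32_t>(value);
}

// Callers propagate failure explicitly instead of relying on unwinding.
std::optional<uint32_t> parseAndDouble(const std::string& text) {
  std::optional<uint32_t> parsed = parseU32(text);
  if (!parsed || *parsed > UINT32_MAX / 2) {
    return std::nullopt;
  }
  return *parsed * 2;
}
```

Every call site has to check the result, which is the cost of this style compared to exceptions.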

kripken commented 4 years ago

Yes, those are options. Another option is to rewrite the interpreter to be stack-based, which is how wasm-traversal works.

MaxGraey commented 4 years ago

I could create a PR that refactors exceptions to optional / expected, but rewriting to a stack-based interpreter is too complicated for me as a newcomer contributor :) But I think the stack interpreter would be preferable?

kripken commented 4 years ago

A stack interpreter might be better, yeah. It could basically use wasm-traversal. But it would be more code, which is a downside.

Is this urgent for some reason?

MaxGraey commented 4 years ago

No, absolutely not urgent. Just considering options to further reduce the size of binaryen.js

MaxDesiatov commented 4 years ago

I'm interested in this as I'm trying to build binaryen for Wasm without emscripten. The reason is that SwiftWasm toolchains operate without emscripten at all. We could call into prebuilt binaryen.js through our JavaScript bridge, but that would limit its usability only to browsers and Node.js, excluding other Wasm hosts w/o JavaScript support. Ideally we want to link to binaryen C API from Swift directly when targeting Wasm. This is something I can already do in my fork of binaryen when targeting non-Wasm platforms.

Currently, when building binaryen with upstream LLVM/clang (which are shipped with SwiftWasm) for Wasm, I get the error "cannot use 'throw' with exceptions disabled". Is refactoring exceptions to optional/expected still something that could be considered? Or is there some other approach to building binaryen with upstream clang that I missed?

kripken commented 4 years ago

@MaxDesiatov

I'm curious to understand your use case: is this to run on the developer's machine, or to ship with the code? And if on the developer's machine, why not build binaryen normally to windows, linux, mac, etc., the way the emsdk and wasm-pack do it? Or is this to run SwiftWasm on the Web? (but if so, then binaryen.js would be ok)

If you want a pure wasm build, without JS (and without needing emscripten to generate it), that will only be possible with wasm exceptions support eventually. We can get close by replacing exceptions with longjmp (see commit linked to earlier), but that would still need JS to unwind - plain upstream clang won't compile it properly.

In theory a "lower invokes" pass could be written, that emulates the behavior of invokes. That would be similar in effect to the monad approach @MaxGraey mentioned (every call site receives both a value and "is an exception thrown"). It's possible if someone is interested - and the code would be useful in the future as a "polyfill" for wasm exceptions - so I'd welcome a PR there.

Another option is wasm2c, which would emit a single C file that builds on all platforms. However, you would still need emscripten to generate that C file.

dcodeIO commented 4 years ago

I'd also be interested in a Wasm-only build, even if it's limited but otherwise works. With that, the AS compiler (along with Binaryen) could run on, let's say, WasmTime :)

MaxDesiatov commented 4 years ago

> I'm curious to understand your use case: is this to run on the developer's machine, or to ship with the code? And if on the developer's machine, why not build binaryen normally to windows, linux, mac, etc., the way the emsdk and wasm-pack do it? Or is this to run SwiftWasm on the Web? (but if so, then binaryen.js would be ok)

This is to run Swift apps (including SwiftWasm itself eventually) linked to binaryen on any Wasm host, either browsers or any other non-JS host such as Cloudflare workers.

Binaryen.js is not suitable for us as it makes the API cumbersome and loses all type information already available in the C header we get for free (thanks to how tightly Swift can integrate with C). And it obviously excludes all Wasm hosts that don't have JavaScript support.

MaxDesiatov commented 4 years ago

@kripken

> In theory a "lower invokes" pass could be written, that emulates the behavior of invokes. That would be similar in effect to the monad approach @MaxGraey mentioned (every call site receives both a value and "is an exception thrown"). It's possible if someone is interested - and the code would be useful in the future as a "polyfill" for wasm exceptions - so I'd welcome a PR there.

Can you elaborate on this please? What exactly do you mean by "invokes" here? And what would be the source of lowering then? Would it imply that upstream clang gains Wasm exceptions support first and then a potential future binaryen pass could lower that for MVP Wasm hosts?

kripken commented 4 years ago

> This is to run Swift apps (including SwiftWasm itself eventually) linked to binaryen on any Wasm host, either browsers or any other non-JS host such as Cloudflare workers.

Oh, so it's to allow the entire compiler to be easily runnable on wasm hosts, etc.? Cool idea!

To elaborate on the invoke idea: JS is used to handle C++ exceptions (and longjmp) in a single very simple way: to unwind the stack. How it does that is that instead of foo calling bar, it calls an "invoke" function that calls bar for it. The "invoke" function returns two things: the normal return value of bar, and whether an exception was thrown (in which case the normal return value is not relevant).

The JS support for this is pretty trivial, an invoke is a JS function that does a try-catch around a call to the wasm Table. In summary:

// JS
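function invoke(ptr, arg) {
  try {
    // Look up the function in the wasm Table and call it with arg.
    return table.get(ptr)(arg);
  } catch (e) {
    // Record the exception; the return value is then not relevant.
    setExceptionFlag(e.ptr);
  }
}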
// Wasm, written as C++
void foo() {
  bar(5);
}

=>

void foo() {
  invoke(17 /* function pointer to bar */, 5);
  // also check exception flag, if we need to do something with that
}

This could be lowered into pure wasm. Each call would get two values, the normal return value, and whether we are unwinding. If we are unwinding, we'd immediately return out of the function (continuing the unwind). An invoke would be a place where unwinding can stop.
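To make the lowering concrete, here is a rough sketch written as C++ rather than wasm. `CallResult`, `fooOrDefault`, and the rule that `bar` "throws" on negative input are hypothetical choices for illustration, not what an actual binaryen pass would emit.

```cpp
#include <cstdint>

// Hypothetical result of a lowered call: the normal return value plus a
// flag saying whether we are unwinding (value is meaningless if so).
struct CallResult {
  int32_t value;
  bool unwinding;
};

// A callee that "throws" for negative input by setting the flag.
CallResult bar(int32_t x) {
  if (x < 0) {
    return {0, true};
  }
  return {x * 2, false};
}

// An ordinary caller: if the callee is unwinding, return immediately,
// which continues the unwind into our own caller.
CallResult foo(int32_t x) {
  CallResult r = bar(x);
  if (r.unwinding) {
    return r;
  }
  return {r.value + 1, false};
}

// The equivalent of an invoke (a try-catch): a place where unwinding stops.
int32_t fooOrDefault(int32_t x, int32_t fallback) {
  CallResult r = foo(x);
  return r.unwinding ? fallback : r.value;
}
```

The extra check in `foo` is the per-call overhead; it could be dropped wherever analysis proves the callee never unwinds.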

This would increase code size and add overhead, but a whole-program analysis could remove unnecessary checks for unwinding in places we know an exception is not thrown (the Asyncify pass does that, for example).

With that in place, upstream clang should be enough to compile binaryen into pure wasm. You will however need to run clang's built-in support for emscripten-style exceptions (that's what emits invokes), and then binaryen's pass to lower exceptions. And you'd need to link the compiled C++ code for libc++abi, of course.

MaxGraey commented 4 years ago

Also, it could significantly speed up the interpreter: https://pspdfkit.com/blog/2020/performance-overhead-of-exceptions-in-cpp/

kripken commented 3 years ago

Re-reading this now (after seeing #3722), an option not mentioned is to add an option for a binaryen build without error handling. That is, no exceptions would be thrown on a validation error, instead we would just abort(). That would be simple to do, and maybe good enough for toolchains that know they are processing valid inputs.

syrusakbary commented 3 years ago

That would be a great step @kripken!

kripken commented 3 years ago

Actually - wasm exceptions now fully work in at least LLVM, v8, and binaryen. And maybe other VMs too? And @dschuff has verified recently that binaryen compiled with wasm exceptions passes the test suite. So the most straightforward thing is to just build it that way to get a pure wasm build (and that would include full error handling).

I wouldn't be opposed to a PR to support a build with exceptions disabled, however, if that helps things meanwhile before wasm exceptions are everywhere. The simplest thing would be to modify CMakeLists.txt to disable exceptions, but that would mean that if an error happens, the result will be that a JS exception is thrown with no explanation. Some ifdefing would be better to replace the throwing of exceptions in that code path with a Fatal() << "Cannot throw exception with message: " << e.msg() or such, so at least something is printed.
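A minimal sketch of that kind of ifdefing, assuming a compiler that defines the `__cpp_exceptions` feature-test macro only when exceptions are enabled (GCC and Clang do). `fatalMessage` and `throwOrAbort` are hypothetical helpers, and `std::cerr` plus `std::abort()` stand in for binaryen's `Fatal()`.

```cpp
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Builds the message printed when exceptions are unavailable.
std::string fatalMessage(const std::string& msg) {
  return "Cannot throw exception with message: " + msg;
}

// Hypothetical helper: throws when the build has exceptions enabled,
// otherwise prints the message and aborts so that something is reported.
[[noreturn]] void throwOrAbort(const std::string& msg) {
#if defined(__cpp_exceptions)
  throw std::runtime_error(msg);
#else
  std::cerr << fatalMessage(msg) << '\n';
  std::abort();
#endif
}
```

Call sites would then use `throwOrAbort(...)` where they currently use `throw`, so the same source builds in both configurations.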

tlively commented 3 years ago

I wouldn't want anyone who's not very closely collaborating with the EH standardization effort to be depending on Wasm EH in production until the spec advances further.

syrusakbary commented 3 years ago

> I wouldn't be opposed to a PR to support a build with exceptions disabled, however, if that helps things meanwhile before wasm exceptions are everywhere. The simplest thing would be to modify CMakeLists.txt to disable exceptions, but that would mean that if an error happens, the result will be that a JS exception is thrown with no explanation. Some ifdefing would be better to replace the throwing of exceptions in that code path with a Fatal() << "Cannot throw exception with message: " << e.msg() or such, so at least something is printed.

That sounds good. @dcodeIO can you confirm this path would also work for AssemblyScript? (as a temporary step before Wasm EH)

> I wouldn't want anyone who's not very closely collaborating with the EH standardization effort to be depending on Wasm EH in production until the spec advances further.

That was my initial impression. In any case, it might be good for Wasmer to start prototyping exceptions if they are getting more stable now. Do you think that's a fair assumption @tlively? Or do you think Wasm EH will evolve significantly from the current proposal/opcodes? (I'm trying to compute the cumulative effort/gains of implementing now + iterating later VS implementing later)

dcodeIO commented 3 years ago

Thanks for the ping! Iirc, one blocker could be exceptions thrown in the interpreter, which do not only indicate a hard error a release build could just expect never to happen, but also that an expression cannot be interpreted for "normal" reasons. For instance, asc depends on "running expressions" quite a lot to evaluate what's constant and what's not (exception means it's not), so if I'm not mistaken here, then there might be some refactoring necessary. Dangerous half-knowledge, though.

MaxGraey commented 3 years ago

yes, it seems you're right: 1) https://github.com/WebAssembly/binaryen/blob/ffac06650507ac413d60d72aadc1e33fb1f91ccf/src/wasm-interpreter.h#L3045

2) https://github.com/WebAssembly/binaryen/blob/2488b523216600b4de2fe1e33ad695b337f8b9f8/src/passes/Precompute.cpp#L232

kripken commented 3 years ago

Good points @dcodeIO @MaxGraey , I forgot that pass...

We could disable it in a no-exceptions build. That would mean the optimizer is less powerful, but the difference would be very small (looks like 0.3% on the AssemblyScript n-body benchmark in the test suite here).

edit: sorry, I mixed up the math, the initial number was too large by a factor of 10

MaxGraey commented 3 years ago

The precompute pass is quite important. I guess we should look at the expected monad approach. The same approach is used in Haskell, Go and Rust.

kripken commented 3 years ago

A monad approach can work as discussed above. But it would be a large amount of work. Given that wasm exceptions are getting close (though maybe not as close as I'd thought), and that precompute is not the most important pass, I think it's probably not worth doing.

See my edit: precompute is just 0.3% of code size on that benchmark. Testing on emscripten benchmarks I see similar numbers, all less than 1%. It is true that some benchmarks may end up affected more significantly, but not many I expect.

MaxGraey commented 3 years ago

The monad approach may also significantly increase performance: https://github.com/WebAssembly/binaryen/issues/2917#issuecomment-715985374

kripken commented 3 years ago

Fair point @MaxGraey

If someone has time to do it, I'd welcome a PR. I'd strongly recommend going down the "polyfill wasm exceptions" route, though, as discussed above. That is, lower wasm exceptions into wasm MVP code using the monad pattern. That way the code will still be useful in the long run. If someone is interested and has questions about implementing such a pass let me know.

dcodeIO commented 3 years ago

Just a quick note: I think if we'd disable the precompute pass, asc would still not work because it depends on ExpressionRunner so much, which uses the same underlying infrastructure iirc, to evaluate conditions etc. at compile time. For example, lots of stdlib wouldn't compile without static type checks. So I'm thinking: Would it also be an option to refactor the interpreter a bit, so it doesn't need to throw exceptions? If I'm not mistaken, such a refactor had been suggested for other reasons a while ago, but I do not remember why that was (perhaps performance, or code style?). Sorry if this was already suggested, I may have misunderstood then :)

kripken commented 3 years ago

Oh, interesting. Can you disable those evaluations? (are they optimizations, or necessary even in debug builds?)

Yes, the interpreter could be rewritten to be stack-based, see earlier discussion higher up: https://github.com/WebAssembly/binaryen/issues/2917#issuecomment-647763524 That's not a small amount of work, but it is straightforward.
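As a rough illustration of why a stack-based design removes the need for exceptions, here is a toy evaluator with an explicit task stack. `Expr`, `Task` and `evaluate` are hypothetical and far simpler than binaryen's actual IR or wasm-traversal; the sketch only shows that a trap becomes a plain return from one loop, with no nested C++ frames to unwind.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical miniature expression tree.
struct Expr {
  enum Kind { Const, Add, Div } kind;
  int32_t value;     // used by Const
  const Expr* left;  // used by Add and Div
  const Expr* right; // used by Add and Div
};

// Evaluates the tree without recursion; returns std::nullopt on a trap.
std::optional<int32_t> evaluate(const Expr* root) {
  struct Task {
    const Expr* expr;
    bool childrenDone;
  };
  std::vector<Task> tasks;
  std::vector<int32_t> values;
  tasks.push_back({root, false});
  while (!tasks.empty()) {
    Task task = tasks.back();
    tasks.pop_back();
    const Expr* e = task.expr;
    if (e->kind == Expr::Const) {
      values.push_back(e->value);
      continue;
    }
    if (!task.childrenDone) {
      // Revisit this node after both children have produced values.
      tasks.push_back({e, true});
      tasks.push_back({e->right, false});
      tasks.push_back({e->left, false});
      continue;
    }
    int32_t rhs = values.back();
    values.pop_back();
    int32_t lhs = values.back();
    values.pop_back();
    if (e->kind == Expr::Add) {
      // Wrapping addition, as for wasm i32.add.
      values.push_back(static_cast<int32_t>(static_cast<uint32_t>(lhs) +
                                            static_cast<uint32_t>(rhs)));
    } else {
      if (rhs == 0 || (lhs == INT32_MIN && rhs == -1)) {
        return std::nullopt; // trap: just leave the loop, nothing to unwind
      }
      values.push_back(lhs / rhs);
    }
  }
  return values.back();
}
```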

dcodeIO commented 3 years ago

It is necessary, sadly. In AS, we do not have #ifdef for example, but instead have a mechanism to do things like if (isString<T>() && someLocalThatIsActuallyConstant) { (or arbitrarily complex), for which we use ExpressionRunner to determine which branch to compile, at compile time even when not optimizing, since the other branch would be invalid and must be ignored. But yeah, if that's a significant amount of work, then, hmm.

kripken commented 3 years ago

It's not trivial, but not huge. I'd guess maybe a day or two of work.

Has this become urgent for some reason?

tlively commented 3 years ago

> > I wouldn't want anyone who's not very closely collaborating with the EH standardization effort to be depending on Wasm EH in production until the spec advances further.
>
> That was my initial impression. In any case, it might be good for Wasmer to start prototyping exceptions if they are getting more stable now. Do you think that's a fair assumption @tlively? Or do you think Wasm EH will evolve significantly from the current proposal/opcodes? (I'm trying to compute the cumulative effort/gains of implementing now + iterating later VS implementing later)

If you're interested in providing feedback to the standardization effort (even "this works fine"), it would be a good time to start prototyping. The spec proposal has settled into a stable state for now to allow it to be implemented and evaluated end-to-end, but I can't promise that it won't change again as a result of that evaluation. If you want to be sure you'll only have to implement it once and not make significant changes later, it would probably be good to wait at least until it reaches phase 3.


And while I'm here, I will just mention that I would welcome a stack-based rewrite of the interpreter for entirely separate reasons; it would make it simpler to run the interpreter over Poppy IR.

dcodeIO commented 3 years ago

> Has this become urgent for some reason?

I mostly appreciate the interest expressed around here since I'd love to make good use of a Wasm-only build as well. I also know quite a few people who have expressed to me that they are enthusiastic about the possibility, i.e. to run asc off the Web, so I was eager to provide the answers I can contribute :). Didn't want to unnecessarily push this, though.

kripken commented 3 years ago

I see, thanks @dcodeIO , makes sense. I'm also interested in this direction, but would not have time to work on it myself due to GC and other things.

Meanwhile I see we have a TODO to remove NonconstantException,

https://github.com/WebAssembly/binaryen/blob/9c1d69f6596b76fe83bff17709b92f8cc2054a31/src/wasm-interpreter.h#L1824-L1825

That would mean refactoring the interpreter to check if (flow.breaking()) in more places, basically. We already follow that pattern for almost everything, except for trapping, so this would emit a breaking flow Flow(NONCONSTANT_FLOW) instead of an exception, for a trap. That's a smaller refactoring than switching it all to a stack machine, and would also remove exceptions from the interpreter.

(Except for wasm exceptions instructions themselves, Try/Throw/etc.. But I assume there's no need for a wasm build that can compile wasm exceptions if wasm exceptions are not allowed for that build itself...)
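A toy sketch of the refactoring described above, where a trap produces a breaking flow instead of an exception. This `Flow` is a hypothetical, heavily simplified stand-in for the interpreter's real class, and `visitDiv` / `visitAdd` are illustrative names only.

```cpp
#include <cstdint>

// Hypothetical simplified Flow: either a value, or a "nonconstant"
// breaking flow that every caller must check for and propagate.
struct Flow {
  int32_t value;
  bool nonconstant;
  bool breaking() const { return nonconstant; }
};

// Signed division: a trap becomes a breaking flow, not an exception.
Flow visitDiv(Flow left, Flow right) {
  if (left.breaking()) {
    return left;
  }
  if (right.breaking()) {
    return right;
  }
  if (right.value == 0 || (left.value == INT32_MIN && right.value == -1)) {
    return Flow{0, true}; // would trap at runtime, so it is not constant
  }
  return Flow{left.value / right.value, false};
}

// Wrapping addition: only needs to propagate breaking operands.
Flow visitAdd(Flow left, Flow right) {
  if (left.breaking()) {
    return left;
  }
  if (right.breaking()) {
    return right;
  }
  uint32_t sum = static_cast<uint32_t>(left.value) +
                 static_cast<uint32_t>(right.value);
  return Flow{static_cast<int32_t>(sum), false};
}
```

The `if (flow.breaking())` checks after each operand are the "more places" the refactoring would need to touch.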