WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0
7.3k stars 717 forks source link

0xc block signature fallout #758

Open kripken opened 7 years ago

kripken commented 7 years ago

0xc has block signatures, which lead to some problems for binaryen, namely

So we must walk a fine line here, DCE but not DCE too much - that's the current status. I'm not sure yet how much of the test suite it passes, but looks like quite a lot, so maybe this is viable for now.

But to handle this more properly, my inclination is to add this to the growing list of divergences between Binaryen IR and wasm. We need to be able to do simple local transformations like that example. And it's good for us that type checking is local/encapsulated, since we intentionally have expressions stand on their own so that we can generate and optimize them in parallel - they don't need a surrounding context in order to be made sense of. But wasm needs this, so to emit valid wasm we'd need to handle this when emitting or reading the binary. But otherwise we'd just document this as the third significant divergence from wasm.

kripken commented 7 years ago

Note that the test suite coverage I mentioned was using spidermonkey. I just realized that since spidermonkey checks fewer errors than v8, the picture here might change once we can check on v8 as well, as the errors discussed above are exactly the kind on which the engines diverge.

dschuff commented 7 years ago

If we want to make binaryen's type semantics/rules different from wasm proper, then we should probably implement a strict wasm type checker and run it at whatever point we convert for output. Of course there are also still wasm-only tools in wabt; I believe wabt's type checker is strict (and if not then we should make it that way).

binji commented 7 years ago

@kripken: Maybe this is not so dire. You can always do the transformation, you just need to fix up the block signature later depending on its context. And even then, that's only true if you don't know the concrete type of X (i.e. if it is unreachable).

@dschuff: Yes, wabt is strict.

kripken commented 7 years ago

@binji: Yeah, question is when the fixup can be done. If it needs to be done after every operation that sounds bad, so one option is it's ok in binaryen IR to not care about it, and handling it when loading/saving wasm.

But then: @dschuff, I think that's a good question - do we allow loading and saving IR that isn't true wasm. This wasn't much of an issue before since we diverged by being a subset of wasm, but this divergence would make us a superset.

Not being able to save and load our IR - which a strict wasm-only checker would cause - sounds bad. So perhaps we should go ahead and do the full split from wasm, creating .byn files that differ from wasm files, etc., as suggested in #663. Maybe we've held off as long as we could on doing that?

ghost commented 7 years ago

It's an interesting discussion. Still seem to be some matters to settle. Might it help to note why a block end fall-through might be unreachable, that this can not be worked around, with an example:

(block $l0
  (block $l1
    (br_if $l1 ...)
    (br $l0)
    )
  ...
  )

Then if the block fall through needs to be type validated then it seems to create some challenges.

@kripken there seem to be a lot of issues with the unreachable code and have you given any thought to the burden on binaryen of implementing the proposal in https://github.com/WebAssembly/design/issues/778

ghost commented 7 years ago

No use case for supporting the transform (X) => (block (Y) (X)) where the block label is not used comes to mind? It appears that in such cases the block is unnecessary and can be flattened into the parent block. If there are some examples then they might be interesting to consider?

Perhaps binaryen could add an internal block type for a block for which the tail is unreachable, and just ensure that it is flattened before emitting the wasm. If there is a real need for this then perhaps this needs a type in wasm too.

kripken commented 7 years ago

@JSStats: the main use case for that transform is when you need to add some code that runs first. you might not know if the parent is another block, and it might not be another block anyhow. so creating a new block is a simple and useful thing to do (and later optimizations might get rid of it, if possible).

kripken commented 7 years ago

Well, the current status is that I am not aware of asm2wasm problems due to this - we seem to work around the type system corner cases. Test suite passes and fuzzing looks ok. So I guess for now we can ignore it. But when the type system is formally written up, we should look into this more carefully, as I worry running on non-asm2wasm output may hit a problem. Leaving open for that.