Open kripken opened 8 years ago
Note that the test suite coverage I mentioned was using spidermonkey. I just realized that since spidermonkey checks fewer errors than v8, the picture here might change once we can check on v8 as well, as the errors discussed above are exactly the kind on which the engines diverge.
If we want to make binaryen's type semantics/rules different from wasm proper, then we should probably implement a strict wasm type checker and run it at whatever point we convert for output. Of course there are also still wasm-only tools in wabt; I believe wabt's type checker is strict (and if not then we should make it that way).
@kripken: Maybe this is not so dire. You can always do the transformation, you just need to fix up the block signature later depending on its context. And even then, that's only true if you don't know the concrete type of X
(i.e. if it is unreachable).
@dschuff: Yes, wabt is strict.
@binji: Yeah, question is when the fixup can be done. If it needs to be done after every operation that sounds bad, so one option is it's ok in binaryen IR to not care about it, and handling it when loading/saving wasm.
But then: @dschuff, I think that's a good question - do we allow loading and saving IR that isn't true wasm. This wasn't much of an issue before since we diverged by being a subset of wasm, but this divergence would make us a superset.
Not being able to save and load our IR - which a strict wasm-only checker would cause - sounds bad. So perhaps we should go ahead and do the full split from wasm, creating .byn
files that differ from wasm files, etc., as suggested in #663. Maybe we've held off as long as we could on doing that?
It's an interesting discussion. Still seem to be some matters to settle. Might it help to note why a block end fall-through might be unreachable, that this can not be worked around, with an example:
(block $l0
(block $l1
(br_if $l1 ...)
(br $l0)
)
...
)
Then if the block fall through needs to be type validated then it seems to create some challenges.
unreachable
value onto the stack? If the block returns no values then does this need to be dropped?unreachable
values pushed on the stack.@kripken there seem to be a lot of issues with the unreachable code and have you given any thought to the burden on binaryen of implementing the proposal in https://github.com/WebAssembly/design/issues/778
No use case for supporting the transform (X) => (block (Y) (X))
where the block label is not used comes to mind? It appears that in such cases the block is unnecessary and can be flattened into the parent block. If there are some examples then they might be interesting to consider?
Perhaps binaryen could add an internal block type for a block for which the tail is unreachable, and just ensure that it is flattened before emitting the wasm. If there is a real need for this then perhaps this needs a type in wasm too.
@JSStats: the main use case for that transform is when you need to add some code that runs first. you might not know if the parent is another block, and it might not be another block anyhow. so creating a new block is a simple and useful thing to do (and later optimizations might get rid of it, if possible).
Well, the current status is that I am not aware of asm2wasm problems due to this - we seem to work around the type system corner cases. Test suite passes and fuzzing looks ok. So I guess for now we can ignore it. But when the type system is formally written up, we should look into this more carefully, as I worry running on non-asm2wasm output may hit a problem. Leaving open for that.
0xc has block signatures, which lead to some problems for binaryen, namely
(X) => (block (Y) (X))
(replace a node with a block with some code that runs before the old node, then the old node after it). That's no longer valid in 0xc without looking outside, since the type of the block would depend on the scope it is in (see examples in that link). In other words, binaryen assumed we can type blocks in an encapsulated, local way, but wasm changed that. The outside context matters now.So we must walk a fine line here, DCE but not DCE too much - that's the current status. I'm not sure yet how much of the test suite it passes, but looks like quite a lot, so maybe this is viable for now.
But to handle this more properly, my inclination is to add this to the growing list of divergences between Binaryen IR and wasm. We need to be able to do simple local transformations like that example. And it's good for us that type checking is local/encapsulated, since we intentionally have expressions stand on their own so that we can generate and optimize them in parallel - they don't need a surrounding context in order to be made sense of. But wasm needs this, so to emit valid wasm we'd need to handle this when emitting or reading the binary. But otherwise we'd just document this as the third significant divergence from wasm.