
WebAssembly Design Documents
http://webassembly.org
Apache License 2.0

Proposal: Add type for unreachable states #1379

Open RossTate opened 3 years ago

RossTate commented 3 years ago

Problem

The current design of the unreachable instruction (and br and the like) imposes an additional decidability constraint on WebAssembly: for any instruction sequence, we need to be able to determine if there exists a typing for the sequence.

This problem will grow more difficult as WebAssembly grows (until we make it trivial with this proposal). Even just the small complexity introduced by adding reference types has caused a bug in the reference interpreter (WebAssembly/reference-types#111), imposing yet another layer of complexity on the reference interpreter solely for type-checking unreachable code. This will be made worse by br_on_null in the typed references proposal, which will require adding a "unknown reference type" case.

This decidability constraint seems to already be restricting the design of WebAssembly. For example, here in #1365 it is raised as an issue for adding a dup instruction. This is because (unannotated) dup and pick instructions would require you to reason that the duplicated type is used consistently across multiple instructions, requiring the type-checker to track constraints on type variables and ensure they're satisfiable, all for the sake of just type-checking unreachable code.

Short-Term Solution

A short-term solution is to add an unreachable type (it seems specifically to be a result type) to the typing rules but not to the grammar. Any (syntactically valid) instruction sequence would have type unreachable -> unreachable. Because this changes nothing besides validation and accepts strictly more programs, we could make this change to any proposal spec and it would be backwards compatible.

This solution does not seem to have been discussed in the design rationale. It also seems to be compatible with all the desirable properties provided there except for the following:

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

Long-Term Solution

A long-term solution would extend the grammar to account for an unreachable type. For example, a function could declare its type as [i32 i32] -> unreachable. WebAssembly compilers could even use this information to optimize calls to such functions, i.e. no need to save registers before the call if the call will never return! (Though of course you still have to consider exceptions.) This solution would involve many more design decisions, some of which might be informed by other discussions that are in flight, so it would probably be best to explore this option as part of some larger effort rather than as an isolated proposal.

rossberg commented 3 years ago

This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping. I explicitly called out that this also axes the bottom type. I'm not surprised that we overlooked an implication like this.

Subtyping and the bottom type are currently reintroduced in the func-ref proposal. But it looks like we have to move it back to reftypes. :(

conrad-watt commented 3 years ago

I agree that at least the short-term solution is desirable (I sketched out the formal rules here).

FWIW, the design rationale you linked does explicitly address this:

Other solutions that have been discussed would lose most of the above properties. For example: |...snip...| Allowing but not type-checking unreachable code would break decomposability and requires the spec to provide a syntactic definition of reachability (which is unlikely to match the definition otherwise used by producers).

I think the concerns about the "syntactic definition of reachability" are overstated. To implement a type checker today, exactly the same set of instructions needs to be handled specially because of their polymorphic output types. I'm also not sure if anyone is relying on decomposability (I could believe reliance on the converse).

The comparison to the JVM at the end of that rationale doesn't appear to be accurate. The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code. (Quote from the latest JVM standard (link))

It is illegal to have code after an unconditional branch without a stack map frame being provided for it.

So the JVM does not have to deal with any polymorphism issues in dead code. In an earlier version of the JVM (I think prior to the introduction of stack maps), return explicitly set the output type to empty (link).

FWIW, another "design smell" surrounding this - when I implemented a verified type checker for Wasm, easily 90% of the effort, both in implementation and in correctness proof, was concerned purely with handling the dead code case. This isn't precise, but it's a rough indicator of the mental overhead this design decision imposes on type checking, even before typed objects are introduced.

EDIT: missed Andreas' response

This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping.

I don't think this type should rely on subtyping. It's the type of the whole stack, not the type of an individual element. To allow dead code to skip validation, all you need are the typing rules

C ⊢ unreachable : t* -> ⊥
<similar for return, br, br_table>

C ⊢ e* : ⊥ -> t*

This should work even if t* can contain object types.
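A minimal sketch of how these two rules play out in a stack-based checker (hypothetical Python, not from any real validator; `BOTTOM` stands in for ⊥): once the stack becomes ⊥, every remaining instruction in the block checks trivially.

```python
BOTTOM = object()  # stands in for the bottom stack type "⊥"

def apply_instr(stack, ins, outs):
    """Check one instruction of type ins -> outs against the type stack."""
    if stack is BOTTOM:
        return BOTTOM  # C ⊢ e* : ⊥ -> t* : dead code checks trivially
    n = len(ins)
    if n and stack[-n:] != ins:
        raise TypeError(f"expected {ins} on top of {stack}")
    return (stack[:-n] if n else stack) + outs

def stack_polymorphic(stack):
    """unreachable / br / return and the like: C ⊢ unreachable : t* -> ⊥."""
    return BOTTOM
```

Under these rules `(unreachable) (i32.const 0) (i64.add)` validates, since after `unreachable` the stack is ⊥ and nothing further is checked in that block - exactly the behavioural difference debated later in this thread.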

RossTate commented 3 years ago

My impression of the discussion about subtyping for reference types was that it was centered around a (near-)bottom heap type. I only just learned that there was a bottom value type in the proposal as well. As @conrad-watt points out, the suggestion here is a bottom state/result type, which avoids many of the problems of bottom heap/value types while providing a simpler solution to the same problem that the bottom value type seems to have been solving. (It's also simple enough that its rules can be formulated without explicitly adding subtyping.)

FWIW, the design rationale you linked does explicitly address this:

The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.

conrad-watt commented 3 years ago

The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.

In the short-term solution, the "type" is just a formal artefact to express skipping validation. If the "skipping validation" approach had been agreed on at an informal level, it would have almost certainly been formalised in this way. I agree that the approaches diverge when considering the long-term solution.

This, for example, makes it still decomposable.

I don't believe the resulting language would be decomposable in the same way that the current language is. I also don't believe this is a big deal :P

RossTate commented 3 years ago

Agreed. So do we have insights on reasons to not skip validation that have not been addressed above?

conrad-watt commented 3 years ago

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

I'd like to hear implementers' perspectives on how dead code is currently handled (e.g. in streaming compilation), and whether they'd have any concerns if it was no longer validated. The quoted point (from @RossTate's OP) seems very reasonable.

rossberg commented 3 years ago

@conrad-watt:

I think the concerns about the "syntactic definition of reachability" are overstated.

Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.

The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code.

The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.

I don't think this type should rely on subtyping.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable). Either way, it's a separate, more complex extension. We could add that, too, but it seems less bang for the buck, in the sense that it won't change engines, merely complicate the spec. Depending on the instruction set, it might not even replace the bottom value type (e.g., if you had pick without a stack type annotation).

FWIW, even on stack types, it would still be a form of subtyping (or something largely equivalent).

In the short-term solution, the "type" is just a formal artefact to express skipping validation.

That's not correct. The bottom value type does not "skip validation", it merely makes it more permissive. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.
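The distinction rossberg is drawing can be illustrated with a rough sketch of the bottom-*value*-type approach (hypothetical code, simplified from the spec's validation algorithm; names are illustrative): after `unreachable`, missing operands are conjured as `bot`, but operands that *are* concretely on the stack are still type-checked, so the example above is rejected.

```python
BOT = "bot"  # bottom value type: matches every concrete value type

class Checker:
    def __init__(self):
        self.stack = []
        self.polymorphic = False   # set after unreachable/br/return

    def push(self, t):
        self.stack.append(t)

    def pop(self, expect):
        if not self.stack:
            if self.polymorphic:
                return BOT         # conjure a bot value of any type
            raise TypeError("stack underflow")
        t = self.stack.pop()
        if t not in (expect, BOT):
            raise TypeError(f"expected {expect}, got {t}")
        return t

    def unreachable(self):
        # the stack below this point becomes polymorphic
        self.stack.clear()
        self.polymorphic = True
```

With this model, `(unreachable) (i32.const 0) (i64.add)` fails: the `i64.add` pops a concrete `i32`, which is neither `i64` nor `bot`.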

I'm also not sure if anyone is relying on decomposability

In general, it means that dead code is still type-checked to the extent that it could be executed. As I mentioned elsewhere, that e.g. addressed potential security concerns.

likely by simply not compiling unreachable code (which addresses the attack vector mentioned here)

Among other things, that work-around doesn't apply to interpreters.

conrad-watt commented 3 years ago

Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.

My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and would now produce ⊥. I don't think there would be the same heated debate again. Deliberately or not, WebAssembly has effectively already adopted a definition of syntactic deadness through its non-deterministic output types.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped. This also confines the effects of ⊥ to within the current block, which seems the obvious choice.

Note that with @RossTate's extended proposal, which I'm not arguing for, block types could explicitly declare their output type as ⊥.

That's not correct. The bottom value type does not "skip validation", it merely makes it more permissive. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.

See above (also, in this proposal, ⊥ is the type of the whole stack, not of a single stack element - it's not the same as the bottom type from the earlier reference types draft).

Among other things, that work-around doesn't apply to interpreters.

I remember you saying that WebAssembly shouldn't be designed for interpreters! In any case, the code still has to be parsed; a security-conscious interpreter would have options to harden itself. But this all seems like a hypothetical.

EDIT:

The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.

I was a little off with my analogy - I should have focussed on the point that, on the consumer side, the JVM has never had to handle this kind of polymorphic typing of dead code. I can almost believe that forbidding dead code could be annoying for some producers, but do they care about it being validated? Maybe Wasm could have been designed with input type annotations for dead code from the start, but we're now constrained by webcompat.

RossTate commented 3 years ago

My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.

conrad-watt commented 3 years ago

My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.

@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution. I hope he finds my arguments convincing, and I'm very interested to hear from implementers, but we shouldn't put words in his mouth.

RossTate commented 3 years ago

Ah, I see how I misread it. Thanks for correcting me.

fitzgen commented 3 years ago

Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code, often with its interaction with new proposals (e.g. multi-value).

I've heard similar things from people who've hacked on spidermonkey's other wasm validator; hopefully those folks can speak up here as well.

The short term proposal (where we limit its effects to within a block) sounds like a great idea to me.

tlively commented 3 years ago

Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution, so the short-term solution would not require any meaningful changes to most of Binaryen and would even allow some nice simplifications. It would also reduce the cognitive overhead of working with Binaryen IR by making its unreachable handling more similar to Wasm's.

binji commented 3 years ago

Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution

Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.

It took me a bit to realize how this works. For others who were confused like me: the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*). This would allow it to match the expected results for that block.

EDIT: actually, ⊥ must be the entire type stack, see Conrad's comment below.

often with its interaction with new proposals (e.g. multi-value).

Agreed, I've found more cases where unreachability is complex in the newer proposals too (e.g. GC) when the annotations are removed. I believe this is essentially the same problem as select.

conrad-watt commented 3 years ago

the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*)

Just to be 100% precise, with this proposal, a type stack can be either t* or ⊥, so ⊥ isn't just the top of the type stack, it's the whole thing. We wouldn't have to handle anything like t* : ⊥ or ⊥ : t*.

EDIT: Just to note again, without @RossTate's extended proposal, ⊥ wouldn't actually show up in any program type annotations. It's purely a formal contrivance to represent validation being skipped.

Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.

Agree there are probably mismatches (no worse than currently). For example I think binaryen can label the output of an infinite loop as unreachable independent of its explicit type annotation.

tlively commented 3 years ago

Binaryen's unreachable is similar to the long term solution in that blocks, loops, and other control flow structures can be unreachable. The complexity comes from Binaryen's rules about when they cannot be unreachable, e.g. when they are branch targets. ~So actually an infinite loop cannot be unreachable in Binaryen because it is necessarily a branch target.~ (Edit: this is true for blocks but not loops, so infinite loops do end up being unreachable. Complicated!) But those details aren't so important here.

Instructions in Binaryen that are not control flow structures and not unreachable polymorphic can have unreachable type only when they have unreachable-typed children. So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable. Under those rules, instructions that take no arguments cannot be unreachable, so for example f32.const only ever has type () -> f32. With this proposal, we would have to allow instructions with no arguments to be unreachable as well, for example letting f32.const have type () -> unreachable. But that's a trivial change that should be mostly mechanical.
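The propagation rule described above could be sketched as follows (illustrative Python, not Binaryen's actual code): a non-control, non-polymorphic instruction is unreachable-typed exactly when one of its children is.

```python
def binaryen_result_type(declared_result, child_types):
    """Simplified Binaryen-style rule: for an instruction that is neither a
    control flow structure nor unreachable-polymorphic, the result type is
    unreachable iff any child has unreachable type."""
    if "unreachable" in child_types:
        return "unreachable"
    return declared_result
```

So `i32.add` with an unreachable child types as unreachable, while a childless `f32.const` can only ever be `f32` - which is exactly the case the proposal would require relaxing.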

conrad-watt commented 3 years ago

So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable

That sounds a little different, because you're using unreachable as the type of a single stack element, rather than as the type of the whole stack (see my last comment above). In any case, it's not crucial that Binaryen's internals match the Wasm spec types precisely (considering they already don't!).

taralx commented 3 years ago

Given that we have no "unreachable" type, and blocks have to have types, the only places I can think of where someone is type-checking with an unknown is between a branch and an "end", correct? Or did I miss something?

RossTate commented 3 years ago

I believe that's correct (albeit I'm implicitly refining "branch" to unreachable/br/throw and the like).

rossberg commented 3 years ago

@conrad-watt:

My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and now produce ⊥.

You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.

If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).

I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.

Note how I specifically said "If you want to keep checking dead code", which this would not be addressing.

I remember you saying that WebAssembly shouldn't be designed for interpreters!

Well, the context probably was that its design doesn't need to be optimised for interpreters, which was true for a long time (although I think @titzer changed his mind about that recently). But security obviously is a different consideration.

@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution.

Right, I was initially misreading that as the bottom value type solution, which is just a minor tweak to what all engines are implementing already. The bottom result type is a more fundamental and likely more controversial change.

But just to be clear, I'm just conveying the discussions we had. The all-code-must-be-valid argument isn't mine (even though I'm somewhat sympathetic). IIRC, it was several engine folks who pushed it most strongly at the time, but I don't remember the details.

@RossTate's OP:

This [the short-term] solution does not seem to have been discussed in the design rationale.

It is a version of the don't-check-dead-code solution (for one specific definition of dead code), which is discussed in the doc, and was originally discussed and rejected by implementers, see above. Whether that has changed I don't know.

It also seems to be compatible with all the desirable properties provided there except for the following:

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems to already not hold for even the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).

You are talking about the implementation details of consumers, while the statement you cite explicitly talks about semantics and producers targeting that semantics. (A consumer OTOH has many reasons to care about reachability, but will typically use a completely different definition, e.g., when it's decoding directly into SSA form. As for type-directed compilation, that's not an issue, it can just propagate bottom to an arbitrary type without otherwise caring.)

rossberg commented 3 years ago

@fitzgen:

Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code

That's not surprising, since this is the path that will be least well-explored by regular tests. The question is what this implies. Arguing against extra safety checks solely on the basis that they are harder to test is a rather problematic proposition.

titzer commented 3 years ago

Having implemented dead code validation in multiple engines and having participated in the history of this unfortunate topic from the beginning, I'll just heave a sigh here first. There were loud voices that called for banning unreachable code and other loud voices going the other way. We went back and forth in multiple iterations, and it seems like it keeps popping up at every new addition to the type system. It'll probably never be clean to do, and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. Suffice it to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.

The union of all proposals in flight is actually nontrivial to do in a clean, fast way.

Let me just offer this perspective for how I've done this in the past. I'm not proposing any specific change to the formalism.

Pretty much all fast validators track unreachability as a bit in the control stack--i.e. syntactically reachable or not, for the current control block. A natural way to write this is to hide unreachability behind the "pop" operation. There are actually three pop operations:

After an instruction that ends control flow (e.g. throw, unreachable, return_call), the value stack is emptied back to the start of the current control block and marked unreachable. Any pop operation beyond the top of the current control block would normally be an error, except if the control is marked unreachable, and then it might succeed, depending on the situation above.

There are some wrinkles, and it actually turns out that you cannot always hide unreachability behind the pop operation, particularly because of some missing type annotations in the function references proposal.

There are other special cases.
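A rough model of this scheme (hypothetical code; real engines differ in many details): each control frame records its entry stack height and an unreachable bit, and a pop at the frame's boundary succeeds polymorphically only when that bit is set.

```python
class Frame:
    def __init__(self, height):
        self.height = height       # value-stack height at block entry
        self.unreachable = False   # set after unreachable/br/return/throw

class Validator:
    def __init__(self):
        self.stack = []            # value types
        self.ctrl = [Frame(0)]     # control stack; top is current block

    def push(self, t):
        self.stack.append(t)

    def pop(self, expect=None):
        frame = self.ctrl[-1]
        if len(self.stack) == frame.height:
            if frame.unreachable:
                return expect      # "polymorphic" pop: any type matches
            raise TypeError("stack underflow")
        t = self.stack.pop()
        if expect is not None and t != expect:
            raise TypeError(f"expected {expect}, got {t}")
        return t

    def mark_unreachable(self):
        # instruction that ends control flow: empty the stack back to the
        # start of the current block and set the unreachable bit
        frame = self.ctrl[-1]
        del self.stack[frame.height:]
        frame.unreachable = True
```

In this model the per-bytecode logic stays unchanged; unreachability is visible only inside `pop`, which is the property the comment above is highlighting.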

Overall the lack of type annotations in various places makes the current set of in-flight proposals kind of baroque.

As for interpreters, my position has, as politicians like to say, "evolved". I'd summarize it as: Wasm MVP was not designed for interpreters but it was designed for fast validation and was simple enough that modulo control flow, it can be dealt with in an interpreter fairly OK. To preserve some of the good properties that it does have, we mostly just have to avoid mistakes. We don't need to hyper optimize for interpreters, just avoid egregiously bad choices.

conrad-watt commented 3 years ago

@titzer thanks for this! The unreachable-handling approach you're describing is what gave me confidence that there would be fewer debates over the definition of syntactic deadness this time around.

It'll probably never be clean to do and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. So suffice is to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.

The union of all proposals in flight is actually nontrivial to do in a clean, fast way.

At the risk of this comment being referenced in a future post-mortem, I do strongly believe that the block-local unreachable change would significantly simplify the mental model. There may have been a paralyzing number of alternatives when the language was initially designed, but due to current webcompat constraints our conversation can (maybe accidentally) be much more focussed.

with reference to @rossberg...

You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.

I agree that the current type checking conventions are mainly an emergent property of the formal design. It would be good to get perspectives from current engine implementers about appetite for this change. I'm not sure who to tag - maybe @lukewagner and @gahaas?

titzer commented 3 years ago

@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.

IIRC one argument against this was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.

conrad-watt commented 3 years ago

@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.

Not just push, but also pop. So there would no longer need to be any special polymorphic handling for dead code.

A revised algorithm could always work with a concrete type stack. When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.

EDIT: it is still necessary to decode and check certain syntactic constraints on the skipped instructions, see (https://github.com/WebAssembly/design/issues/1379#issuecomment-694430267)
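The skip-to-end step could look roughly like this (an illustrative sketch over a pre-decoded instruction list; in a real streaming validator the skipped instructions must still be decoded so that nesting and immediates remain well-formed, per the EDIT above):

```python
def skip_to_block_end(instrs, i):
    """Scan forward from index i to the `end` that closes the current
    block, decoding (but not type-checking) everything in between."""
    depth = 0
    while i < len(instrs):
        op = instrs[i]          # real decoding would also read immediates
        if op in ("block", "loop", "if"):
            depth += 1          # entering a nested block
        elif op == "end":
            if depth == 0:
                return i        # index of the current block's `end`
            depth -= 1          # leaving a nested block
        i += 1
    raise SyntaxError("unbalanced block nesting")
```

Validation would then discard the current type stack, jump to the returned index, and push the block's declared output types before continuing.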

IIRC one argument against this that was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.

Yes, this would happen (although such code would still have to be parsed).

RossTate commented 3 years ago

@titzer And to clarify, it's not the "top of the control stack" that is typed as unreachable; it is the entire control stack that is typed as unreachable.

conrad-watt commented 3 years ago

@RossTate: I think @titzer's control stack is referring to nested control blocks - with one type/value stack associated with each entry on the "control stack". So the top of the control stack is marked unreachable (the innermost block being typed) - which is equivalent to declaring the bottom of the current type stack to be polymorphic/unconstrained.

In the revised algorithm, this unreachable bit on the control stack wouldn't be needed at all (except potentially to implement the "skip-to-end" handling).

titzer commented 3 years ago

When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.

We discussed such an alternative too. That has the downside that facts about immediates (e.g. that indices are in bounds) are not checked. That means you need to have two modes to the validation algorithm; one that skips immediates (i.e. basically a bytecode iterator) and one that does not. This is because you can't find the end bytecode without properly decoding everything in-between.

Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations. Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.

conrad-watt commented 3 years ago

Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations.

The type annotations are only "missing" to the extent that they facilitate the current typing approach. They wouldn't be necessary in the JVM because it never has to deal with polymorphic stacks. Similarly, they wouldn't be necessary in this proposal. So another way of looking at it is that the current approach causes overhead in type annotation.

Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.

There's a complexity trade-off - but I think the balance is heavily on the side of this proposal. You can't encapsulate the current typing logic purely in pop operations because of select, which needs to be special-cased (and the obvious solutions cause the representation of the type stack to become more complicated).

It's possible to factor the "skipping bytecode" logic so that it's clear what the "normal case" for each bytecode is.
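To illustrate the `select` special case mentioned above: because unannotated `select` requires its two operands to have the *same* type, the validator must compare popped types rather than just pop against an expected type, and in dead code either operand may be the polymorphic "unknown". A hedged sketch (the helper names are illustrative, not from any real engine):

```python
# Why unannotated `select` can't be handled by a plain typed pop: the
# operand types must be unified, and either may be "unknown" when the
# stack is polymorphic due to unreachable code.

def validate_select(pop_val, push_val):
    pop_val("i32")          # condition
    t1 = pop_val()          # operand types are unannotated...
    t2 = pop_val()
    # ...so the two operands must be compared, with "unknown"
    # unifying against anything.
    if t1 == "unknown":
        result = t2
    elif t2 == "unknown" or t1 == t2:
        result = t1
    else:
        raise TypeError(f"select operand mismatch: {t1} vs {t2}")
    push_val(result)
```

Note that if both operands are "unknown", the pushed result is itself "unknown" — which is how a single `select` can force the "polymorphic non-empty type stack" representation discussed later in this thread.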

titzer commented 3 years ago

The type annotations are only "missing" to the extent that they facilitate the current typing approach.

Well, type annotations are like violence; if they aren't working, it's because you're not using enough of them :)

But seriously, all of the hard cases that make the pop approach tricky have to do with instructions that are polymorphic or have had type annotations removed. And the current answer always seems to be that polymorphic instructions, even only mildly polymorphic instructions, like ref.as_non_null, are essentially not validated when unreachable. That bothers me, but mode-switching the validator implementation bothers me even more--unreachable checks just spread everywhere, or you end up with a skipToEnd() function, and both seem like regressions to me.

RossTate commented 3 years ago

@titzer I'm confused. Are you saying that decoding is type-sensitive?

conrad-watt commented 3 years ago

you end up with a skipToEnd() function, and both seem like regressions to me.

This was how I assumed it would work. As I said above - its cost shouldn't be considered in isolation. In any case, if the implementation is well-factored it can hook into existing decoding logic.

The fact that we're talking about adding annotations to instructions like dup just to type dead code seems like a major design smell. Moreover, we currently have to deal with select, which has already been standardised without an annotation and forces us to handle "polymorphic non-empty type stack" as a special case when typing every future instruction.

titzer commented 3 years ago

@conrad-watt These are good points. I'll grant that skipToEnd() is probably harder to screw up than almost anything else we consider. I'm warming up to this.

rossberg commented 3 years ago

@titzer:

We discussed such an alternative too. That has the downside that facts about immediates (e.g. that indices are in bounds) are not checked. That means you need to have two modes to the validation algorithm; one that skips immediates (i.e. basically a bytecode iterator) and one that does not.

Ah, right, I completely forgot about that problem! More broadly speaking, not validating dead code makes the difference between decoding errors and validation errors observable. Consequently, there was plenty of disagreement what should be considered a decoding error and what a validation error in such a setting, because engines were using different strategies or simply fused these algorithms. They would be forced to implement things in specific ways, duplicate logic, or run it in different modes. That might even have been the main argument that ultimately killed that option.

RossTate commented 3 years ago

Can you pick a concrete example that would best guide the discussion?

conrad-watt commented 3 years ago

Ah, right, I completely forgot about that problem! More broadly speaking, not validating dead code makes the difference between decoding errors and validation errors observable. Consequently, there was plenty of disagreement what should be considered a decoding error and what a validation error in such a setting, because engines were using different strategies or simply fused these algorithms. They would be forced to implement things in specific ways, duplicate logic, or run it in different modes. That might even have been the main argument that ultimately killed that option.

I can believe this, but the spec at least makes a clear distinction, and current reference tests make a distinction between decoding and validation errors (consider https://github.com/WebAssembly/spec/issues/1040). I appreciate that engines may have been free to surface the same errors for both/not consider the malformed text format tests.

I know we keep circling back to this point, but we'll need the opinions of current implementers to determine whether this change is feasible. V8 at least seems to have a pretty well-factored decoder, from my high-level snooping.

EDIT: also, this would only change the visibility of decoding vs validation errors for instruction sequences - other module well-formedness properties would still be ambiguous.

rossberg commented 3 years ago

I don't remember specific examples, but you can take every side condition in the binary grammar and in the validation rules and probably move it to either side -- which value ranges are enforced where, which bitfield encodings (or parts thereof), which type codes, etc etc.

The spec makes specific choices that best fit the semantics and structural invariants it wants to establish, but implementations tend to make completely different choices. The difference has already come up multiple times for the test suite, where people have argued for eliminating the distinction in the assertions because it's a poor match for their implementation choices (e.g. here).

V8, like most engines, fuses decoding with validation, so it would actually be pretty hard to disentangle them. IIRC, in these discussions implementers concluded that this would require a mode flag in almost every realistic implementation.

binji commented 3 years ago

cc @backes @lars-t-hansen @kmiller68

Can you provide some perspective on how this change would affect your implementations?

conrad-watt commented 3 years ago

While waiting, I had a quick look at V8, SpiderMonkey, and WebKit. Obvious caveat that I'm not trying to replace expert opinions, and I could have missed a standalone decoder implemented somewhere else in each repo.

V8: appears to use template parameters to generate two versions of the decode - one which performs validation and one which doesn't (e.g. https://chromium.googlesource.com/v8/v8/+/refs/heads/master/src/wasm/function-body-decoder-impl.h#2695 + https://chromium.googlesource.com/v8/v8/+/refs/heads/master/src/wasm/function-body-decoder-impl.h#1135)

SpiderMonkey: appears to interleave decoding and validation (e.g. https://searchfox.org/mozilla-central/source/js/src/wasm/WasmOpIter.h#1494)

WebKit: appears to interleave decoding and validation (e.g. https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/wasm/WasmFunctionParser.h#L524)

So there could be some engineering/code duplication/template programming necessary to define skipToEnd(), especially for SpiderMonkey and WebKit. I'm hoping this investment of effort would still be outweighed by the design simplifications raised throughout this issue - but obviously this deserves comment from the actual engineers.

tlively commented 3 years ago

Clarifying question: I understood the proposal to not affect the validation of immediates, so literally just skipping to the end of the block without validating immediates would not be a correct implementation. Is that correct?

binji commented 3 years ago

AIUI, this rule C ⊢ e* : ⊥ -> t* means to skip validation for all e* instructions, including immediates.

conrad-watt commented 3 years ago

Clarifying question: I understood the proposal to not affect the validation of immediates, so literally just skipping to the end of the block without validating immediates would not be a correct implementation. Is that correct?

Currently with the spec as written, any restriction enforced as part of the module syntax would still need to be checked. So for example, when decoding local.get i, it would still be required to check that i is a valid LEB128 encoding of (and fits within) a u32, but whether i is within the bounds of the function's declared local variables would not be checked.
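A small sketch of that decoding/validation split, with made-up helper names: decoding `local.get i` must still reject an over-long or out-of-range LEB128 (a *malformed* module), while the bounds check against declared locals (a *validation* check) would be skipped in dead code under this proposal:

```python
# Illustration only: the decode-time check always runs; the
# validation-time check is conditional on reachability.

def decode_u32(code, pos):
    result = shift = 0
    while True:
        b = code[pos]; pos += 1
        result |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            break
        if shift >= 35:
            raise ValueError("malformed: LEB128 longer than u32")
    if result >= 1 << 32:
        raise ValueError("malformed: value does not fit in u32")
    return result, pos

def check_local_get(code, pos, num_locals, dead_code):
    i, pos = decode_u32(code, pos)         # always: decoding check
    if not dead_code and i >= num_locals:  # live code only: validation check
        raise ValueError("invalid: local index out of bounds")
    return i, pos
```

The "malformed" vs "invalid" error strings mirror the spec's distinction between decoding errors and validation errors, which is exactly the distinction that becomes observable in dead code.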

fitzgen commented 3 years ago

@rossberg

Arguing against extra safety checks solely on the basis that they are harder to test is a rather problematic proposition.

I was not arguing that type checking unreachable code is harder to test and therefore we shouldn't do it. (Clearly our fuzzers are exercising these code paths very well, so it doesn't seem particularly hard to test!)

I was arguing that the fuzz bug count implied additional complexity, and that the complexity wasn't carrying its own weight in this case.

Can you explain how type checking unreachable code provides extra safety? (I re-skimmed this thread and didn't see any relevant links to historical discussions on this particular topic, apologies if I missed something!)

titzer commented 3 years ago

Actually, I think validating immediates but not types would be the worst of both worlds because skipToEnd would duplicate that logic. Most immediate forms are just i32 or u32 indices, which are easy to skip as a single case.

conrad-watt commented 3 years ago

Actually, I think validating immediates but not types would be the worst of both worlds because skipToEnd would duplicate that logic. Most immediate forms are just i32 or u32 indices, which are easy to skip as a single case.

Not the worst of both worlds! Far better than what we have currently, whatever we decide for this corner-case.

When I was looking through implementations, normally immediates (e.g. u32) were read and checked simultaneously by some readU32 function which specialised an underlying readLEB. It's not clear to me that it would simplify skipToEnd much to replace occurrences of readU32 with readLEB. In fact, it would probably make it harder for skipToEnd to share code with the regular decode+validate function (e.g. through something like V8's current templating approach).

EDIT: Skipping immediate range checks would also require more spec refactoring. I'm not denying that implementing skipToEnd requires some engineering, but it's confined to a pretty specific and well-behaved area of the implementation, and the effort has to be weighed against the costs we're incurring with the current approach (a validation algorithm severely complicated by polymorphic cases that only occur in dead code, and having to thread the "dead code" needle with the design and implementation of each new feature).

eqrion commented 3 years ago

I've also had negative experiences with the polymorphic stack typing rules, and tend to think they're more trouble than they're worth. For SM, as was mentioned, we'd likely have to refactor our decoding and validation code significantly, as they're fused together in a not-clean manner.

This is unfortunate, but if it could cut off these typing issues for good then I'd lean in favor of it. I recently did more work in this area with the f-r proposal and this logic is no longer easy to abstract away with stack operations like @titzer mentioned.

My ideal solution would be to treat all unconditional branches as equivalent to an 'end' instruction that terminates a block, effectively eliminating the definition of unreachable code we're working with here. Obviously that's a breaking change, but it'd make decoding/validation much easier.

taralx commented 3 years ago

Does anyone have concrete data on how often dead code is actually found in modules? Is there an existing producer that can output dead code?

kripken commented 3 years ago

@taralx I don't have comprehensive data, but I see unreachable code in wasm files on the web in practice, like a double unreachable, stuff like that. Many of those are on demos and examples, though, and some are simply not optimized builds.

I don't think any producer has a guarantee of not emitting dead code. It would be a bug in LLVM or Binaryen if they did, at least in optimized builds, but I'm not aware of work to fuzz them to make sure they have no dead bytes at all (we do measure code size a lot, though, so any such amount would be tiny - but it might exist).

taralx commented 3 years ago

Ok, but it does happen, so simply banning dead code would risk enough breakage that it isn't worth it. Thanks.

conrad-watt commented 3 years ago

The "short-term" version of this proposal will (time permitting!) be discussed for phase 1 in the next CG meeting. We're not looking for firm commitment from implementers yet; we just want to create the proposal repository in order to present a consolidated rationale + concrete spec changes.