WebAssembly/design

Proposal: Add type for unreachable states #1379

Open RossTate opened 3 years ago

RossTate commented 3 years ago

Problem

The current design of the unreachable instruction (and br and the like) imposes an additional decidability constraint on WebAssembly: for any instruction sequence, we need to be able to determine if there exists a typing for the sequence.

This problem will grow more difficult as WebAssembly grows (until we make it trivial with this proposal). Even the small complexity introduced by adding reference types has caused a bug in the reference interpreter (WebAssembly/reference-types#111), imposing yet another layer of complexity solely for type-checking unreachable code. This will be made worse by br_on_null in the typed references proposal, which will require adding an "unknown reference type" case.

This decidability constraint already seems to be restricting the design of WebAssembly. For example, in #1365 it is raised as an issue for adding a dup instruction: (unannotated) dup and pick instructions would require reasoning that the duplicated type is used consistently across multiple instructions, which in turn requires the type-checker to track constraints on type variables and ensure they are satisfiable, all just for the sake of type-checking unreachable code.

Short-Term Solution

A short-term solution is to add an unreachable type (specifically, it seems to be a result type) to the typing rules but not to the grammar. Any (syntactically valid) instruction sequence would have type unreachable -> unreachable. Because this changes nothing besides validation and accepts strictly more programs, we could make this change to any proposal spec and it would be backwards compatible.

This solution does not seem to have been discussed in the design rationale. It also seems to be compatible with all the desirable properties provided there except for the following:

It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).

However, this property seems not to hold even for the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers already need to reason about unreachability, likely by simply not compiling unreachable code (which addresses the attack vector mentioned in the design rationale).

Long-Term Solution

A long-term solution would extend the grammar to account for an unreachable type. For example, a function could declare its type as [i32 i32] -> unreachable. WebAssembly compilers could even use this information to optimize calls to such functions, i.e. no need to save registers before the call if the call will never return! (Though of course you still have to consider exceptions.) This solution would involve many more design decisions, some of which might be informed by other discussions that are in flight, so it would probably be best to explore this option as part of some larger effort rather than as an isolated proposal.

rossberg commented 3 years ago

@fitzgen:

I was arguing that the fuzz bug count implied additional complexity,

Perhaps, but my point was that the same is likely to be true for any form of checking, so it is problematic as an argument in itself.

and that the complexity wasn't carrying its own weight in this case.

Right, and that is the discussion we should focus on.

Can you explain how type checking unreachable code provides extra safety? (I re-skimmed this thread and didn't see any relevant links to historical discussions on this particular topic, apologies if I missed something!)

I believe I mentioned it a couple of times: as I remember it from the early meetings, the safety/security concern (not mine) was that malformed dead code could be part of an attack vector (e.g., if an engine mishandles jumps somehow and allows executing such code although it never should; if such code is at least validated, the amount of harm that can be done that way is more confined). So it reduces the risk of single points of failure. The discussions happened in the early days, before we got into the habit of note-taking or written paper trails, so there is nothing I can point to. But some folks here might remember the specific safety/security angle better than I do.

RossTate commented 3 years ago

My understanding is that the primary (sole?) concern has to do with the uncertainty about what instructions should be (in)valid in the unreachable state. I have been confused by this concern, and some examples would have helped, but I think I finally figured out how to explain my confusion and at the same time address the issue.

Here is the code for push in the reference interpreter (lines 101-104, unchanged by WebAssembly/reference-types#116):

let push (ell1, ts1) (ell2, ts2) =
  assert (ell1 = NoEllipses || ts2 = []);
  (if ell1 = Ellipses || ell2 = Ellipses then Ellipses else NoEllipses),
  ts2 @ ts1

Change those last two lines to

  if ell1 = Ellipses || ell2 = Ellipses then (Ellipses, []) else (NoEllipses, ts2 @ ts1)

That change implements this proposal (and is compatible with the changes in WebAssembly/reference-types#116, though they now will have no observable effect).

What this change does is ensure that, once we reach the unreachable state (represented as (Ellipses, []) in the reference interpreter), every subsequent instruction is type-checked in that state (until the end of the enclosing block is reached). In other words, we already have consensus on whether a single instruction following the unreachable instruction is valid, and this proposal simply says to validate all instructions until the end of the enclosing block as if they immediately followed the unreachable instruction.

Note that there is also another way to implement this change. Rather than represent the stack type as ellipses * value_type option list (as in line 73), one could represent the stack type as (value_type list) option. One could then either continue to use the push/pop abstraction (using lists of value_type option), or one could take a bimodal approach and branch on the state of the stack type. This illustrates that this proposal offers more flexibility on how to implement type-checking, whereas WebAssembly/reference-types#116 bakes a particular implementation strategy into the spec.
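
For illustration, a minimal OCaml sketch of that bimodal representation might look as follows (the names here are hypothetical, not taken from the reference interpreter):

type value_type = I32 | I64 | F32 | F64   (* MVP value types, for illustration *)

(* None represents the unreachable state; Some ts is a known stack. *)
type stack_type = value_type list option

let push_ty (t : value_type) (s : stack_type) : stack_type =
  match s with
  | None -> None                        (* unreachable stays unreachable *)
  | Some ts -> Some (t :: ts)

let pop_ty (expect : value_type) (s : stack_type) : stack_type =
  match s with
  | None -> None                        (* unreachable: any pop succeeds *)
  | Some (t :: ts) when t = expect -> Some ts
  | Some _ -> failwith "type mismatch"

In the unreachable state every push and pop trivially succeeds, matching the intended behavior for code following an unreachable instruction.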

As for the spec, I think the following changes are all that are necessary to achieve the above:

  1. Introduce a metavariable to represent stack types and change stack-polymorphic rules to use this metavariable. (This just makes the presentation more extensible and itself has no effect on validation.)
  2. Add an unreachable state type to the grammar for that metavariable. (On its own, this also has no effect on validation.)
  3. Add a rule that says an instruction can have the unreachable input state type and an arbitrary output state type if there exist input and output state types that would make the instruction type-check. (This single rule is the only real change to validation; see the sketch below.)
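
For concreteness, rule 3 might be rendered roughly as follows (a sketch in the spirit of the spec's typing judgments, not proposed spec text; S ranges over the new stack-type metavariable):

  C ⊢ instr : [t1*] -> [t2*]   (for some t1*, t2*)
  ------------------------------------------------
  C ⊢ instr : unreachable -> S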

Hopefully this illustrates how straightforward this change is to the spec; it only appears to have potential for complications because the spec did not use a metavariable to represent stack types.

rossberg commented 3 years ago

The reference interpreter is meant to be an executable version of the spec. As such, it separates decoding from validation, like the spec, but unlike real engines. Its validator also tries to stick as close to the structure of the declarative typing rules from the spec as is possible, very unlike real engines, which all use a version of the forward algorithm from the appendix. So you can't draw any meaningful conclusions about the practical impact of this change from the interpreter.

RossTate commented 3 years ago

@rossberg The implementers can speak for themselves. (So far, from what I can tell every implementer has been supportive of the overall change.) Did the comment above address your own concerns?

conrad-watt commented 3 years ago

I agree that the real point of interest here is the extent to which real implementations that fuse decoding, syntactic checks, and validation are able to handle this change. We've heard some voices in support, but I agree that more discussion is needed to make sure everyone is on the same page about what the changes would entail.

EDIT: FWIW I plan to lay more of this out when our proposal repo is created

rossberg commented 3 years ago

@RossTate, I don't understand your response. My comment is pointing out that your analysis of the reference interpreter isn't implying anything about other implementations. I'm not implying anything about other implementations myself in that comment.

RossTate commented 3 years ago

I understand. And I did not say anything about other implementations in my comment either. But my comment was primarily aimed at addressing your concerns, so I was hoping you would comment as to whether it did so effectively. If your only concern is that others might have some difficulty implementing this proposal, then I would like to hear about real issues from them rather than hypothesize. So can you clarify whether you have any other remaining concerns?

conrad-watt commented 3 years ago

Just to preempt a response from @rossberg, I feel that we have a strong case that the burden on real implementations will not be too great, but I'd rather lay it out in full in the proposal repo and have the discussion continue from there, rather than expend (probably wasted) effort on a back-and-forth in the bowels of this initial issue.

RossTate commented 3 years ago

I agree that it's a good plan to discuss implementation specifics in the full proposal repo. I also would find it useful to know, though, if there are any concerns besides implementation specifics.

rossberg commented 3 years ago

@RossTate, the change to the spec -- and therefore to the reference interpreter -- is not the issue; it would be fairly straightforward and akin to what's in WebAssembly/reference-types#116, except that the bottom type isn't added to opdtype but to stacktype directly, and there is a subsumption rule between bottom and regular stack types (merely having a special sequencing rule is not enough, since bottom must also match all types at the end of a block, expression, or function).
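
For illustration, such a subsumption rule might look roughly like this (my sketch, not spec text; bot is the bottom stack type):

  C ⊢ instr* : S -> bot
  ----------------------
  C ⊢ instr* : S -> [t*]

so a sequence ending in bot can be ascribed any result type, which is what lets it match the type expected at the end of a block, expression, or function.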

To repeat my concerns:

The last two points are the ones where I believe you severely underestimate the real-world complexities and the amount of churn this will produce. The devil is in the details on this one. As a former engine implementer I assume some expertise in making such predictions. ;)

With that all said, the group seems to think that this is time well spent at the moment, and I am interested in seeing the proposal explored. I just hope it won't tie our hands too much in the future.

tlively commented 3 years ago

@rossberg

(Related, some changes we did in proposals would technically become breaking changes under this semantics, e.g., reinterpreting the alignment immediate of load/store instructions as a bit field in the multi memory proposal.)

This sounds concerning and I would like to understand it more. Can you elaborate on this example and explain why it would become a breaking change under this proposal? It would be good to have a dedicated discussion of this point in an issue at https://github.com/WebAssembly/relaxed-dead-code-validation.

conrad-watt commented 3 years ago

Andreas' point is this: the original encoding for (e.g.) i64.load is 0x29 memarg where memarg := a:u32 o:u32

however, because 2^a is never allowed to exceed the natural byte alignment of the largest load (i.e., a must be at most 3) as a validation-time restriction, any higher-order bits are free to be used as flags.

In the multi-memory proposal, if the MSB of the first LEB byte of a is set to 1 (which would previously have been a validation error because it would make a too big), then decoding of the load consumes an extra immediate which represents the memory index.

If this proposal were naively adopted, a load with this bit set would not have been a validation error in dead code, so this trick would not have been backwards-compatible.
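
For illustration, a decoder for this scheme might look roughly like the following OCaml sketch (read_u32 is a hypothetical helper, and taking the flag to be bit 6 of the decoded alignment value is an assumption for illustration):

(* Sketch: decode a memarg under the multi-memory scheme described
   above. If the flag bit is set in the alignment immediate, an extra
   memory-index immediate follows. *)
let decode_memarg (read_u32 : unit -> int) : int * int * int =
  let a = read_u32 () in                    (* alignment immediate *)
  let memidx =
    if a land 0x40 <> 0 then read_u32 ()    (* flag set: memory index *)
    else 0                                  (* default memory 0 *)
  in
  let align = a land 0x3f in                (* remaining bits: alignment *)
  let offset = read_u32 () in
  (align, offset, memidx)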

It's a good point and I have it on my radar for discussion. I plan to write up something addressing this once I've tested the existing behaviours of implementations a little more stringently.

tlively commented 3 years ago

Makes sense, thanks!

titzer commented 3 years ago

My initial warming to this proposal was under the assumption that skipToEnd also skipped validation of indices, which is not what is being proposed. As such, it is more of a decodeToEnd with validation of immediates. In a typical engine decoder designed for speed, either of the two would probably be better implemented with a subroutine rather than mixed into the bytecode-by-bytecode cases. The decodeToEnd routine is a little messier, so I am less warm than before. It pretty much necessitates a table mapping bytecodes to immediate kinds rather than duplicating the big switch over bytecodes. If your decoder already has that mapping and is factored to deal with "immediate kinds", it's not so invasive of a change.
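
For illustration, such a table might be sketched in OCaml as follows (the kinds and opcode groupings shown are partial and illustrative):

(* Sketch: map opcodes to immediate kinds so a decodeToEnd routine can
   consume immediates without duplicating the big per-bytecode switch. *)
type immediate_kind =
  | NoImm          (* e.g. i32.add, drop *)
  | U32Imm         (* e.g. local.get, br *)
  | MemargImm      (* e.g. i32.load, i64.store *)
  | BlockTypeImm   (* block, loop, if *)

let immediate_kind_of (opcode : int) : immediate_kind =
  match opcode with
  | 0x02 | 0x03 | 0x04 -> BlockTypeImm          (* block, loop, if *)
  | 0x0C | 0x0D -> U32Imm                       (* br, br_if *)
  | 0x20 | 0x21 | 0x22 | 0x23 | 0x24 -> U32Imm  (* local.*, global.* *)
  | 0x28 | 0x29 | 0x36 | 0x37 -> MemargImm      (* i32/i64 load/store *)
  | _ -> NoImm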

But since we're splitting hairs here:

The approach of trying to hide everything behind pop operations starts to break down and is very clunky in the face of bytecodes like return_call_ref, which pops a typed funcref off the stack and must check that its return types match the return types of the enclosing function. I'm not sure how you handled that in the reference interpreter, but that was the point where I basically gave up and directly returned a pair from pop(), one of whose elements indicates that the value type returned is invalid and the stack is instead unreachable (or a maybe value type, if you prefer to think of it that way). Other bytecodes that are particularly weird are call_ref and func.bind (perhaps in these cases just returning a function [] -> [] is OK? that basically skips them...), and also ref.is_null and ref.as_non_null. The general trend is that any bytecode whose stack effects (not just result type) are partially or wholly determined by the types of its input operands, and not derivable solely from immediates, needs special consideration. It might seem natural with some tricks in the formalism, but every single one of those is a head-scratcher upon first inspection.
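
A sketch of that pop()-returns-a-pair idea (hypothetical names, not taken from any engine):

type value_type = I32 | I64 | F32 | F64   (* illustrative *)

(* pop yields the popped type, or Bottom when the stack is in the
   unreachable state and no concrete type is known. *)
type popped = Known of value_type | Bottom

let pop (stack : value_type list option ref) : popped =
  match !stack with
  | None -> Bottom                          (* unreachable state *)
  | Some [] -> failwith "stack underflow"
  | Some (t :: ts) -> stack := Some ts; Known t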

If immediates are validated, I don't understand how that makes the distinction between decoding and validation observable. Can you explain that, @rossberg ?

conrad-watt commented 3 years ago

It pretty much necessitates a table mapping bytecodes to immediate kinds rather than duplicating the big switch over bytecodes. If your decoder already has that mapping and is factored to deal with "immediate kinds", it's not so invasive of a change.

From my inexpert look at implementations, at least the browser engines already seem to be organized that way.

If immediates are validated, I don't understand how that makes the distinction between decoding and validation observable. Can you explain that, @rossberg ?

It's not observable in the sense that a distinguished error message is required, but (the current iteration of) the proposal would make it observable whether a particular immediate check is spec'd as part of the instruction syntax or as part of validation.

e.g. with the current proposal

function [] -> []
(locals i32 i32 i32)
local.get 50
unreachable

the above would give an error

function [] -> []
(locals i32 i32 i32)
unreachable
local.get 50

the above would not be an error, while both

function [] -> []
(locals i32 i32 i32)
local.get (2^40)
unreachable

and

function [] -> []
(locals i32 i32 i32)
unreachable
local.get (2^40)

would give errors.

That being said, I'm leaning towards a revised proposal (which I was just typing up an issue in the repo on!) where the distinction isn't between "instruction syntax vs validation" checks (which I agree are driven entirely by how the spec is factored), but between "type stack-dependent and type stack-independent" checks, where only "type stack-independent" checks are performed in dead code. This would make dead code strictly more checked than this iteration of the proposal, but would still avoid a polymorphic stack (all of the above examples would be errors with this revision).

This would potentially require more change to the formalism of the spec, but avoids creating gotchas like the load example Andreas gave, while still addressing the motivating desires/frustrations of implementers. I think both iterations of the proposal would be roughly the same difficulty to implement.
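
As a rough OCaml sketch of that distinction, with local.get as the example (names hypothetical):

(* Under the revised proposal: the stack-independent check (local index
   in bounds) always runs, while the stack-dependent effect is skipped
   in dead code, represented here by a None stack. *)
type value_type = I32 | I64 | F32 | F64

type ctx = { locals : value_type list }

let check_local_get (c : ctx) (stack : value_type list option) (idx : int) =
  if idx >= List.length c.locals then failwith "unknown local";
  match stack with
  | None -> None                                   (* dead code *)
  | Some ts -> Some (List.nth c.locals idx :: ts)

With this factoring, both local.get 50 examples above are rejected, consistent with all four examples being errors under the revision.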

RossTate commented 3 years ago

between "type stack-dependent and type stack-independent" checks, where only "type stack-independent" checks are performed in dead code

This sounds like what I suggested above, which is why I was confused that @rossberg still had objections.

This would potentially require more change to the formalism of the spec

I think the single rule I gave above captures this (given the change to describe stack types via a metavariable).

conrad-watt commented 3 years ago

I think the single rule I gave above captures this (given the change to describe stack types via a metavariable).

This is enough to deal with the MVP instructions, but something more is needed to deal with post-MVP instructions such as (rtt.sub <type> (<rtt>)), where the comparison between its immediate type parameter and its stack operand would only be carried out if the input stack is not unreachable.

EDIT: @titzer's "pop-abstraction-breaking" instructions are probably better examples (https://github.com/WebAssembly/design/issues/1379#issuecomment-702725791). We're continuing this conversation in the proposal repo (https://github.com/WebAssembly/relaxed-dead-code-validation/issues/1)