WebAssembly / exception-handling

Proposal to add exception handling to WebAssembly

https://webassembly.github.io/exception-handling/

Other

162 stars 35 forks source link

Any formal execution definition of exception handling? #87

Closed sys5867 closed 3 years ago

sys5867 commented 5 years ago

Hello.

I'm interested in formal execution definition of WASM with exception handling proposal included.

What I now think is that the try catch end block semantics with throw instruction can be described in reduction rules, similar to PLDI'17 MVP spec paper.

Also, I heard from @aheejin that @rossberg has done some work on the formal spec of the proposal.

Are there in-progress works related to formal definition of exception handling?

rossberg commented 5 years ago

There is nothing written up in publishable form yet. So let me just dump the rules as an extension to the PLDI paper (or rather, the somewhat updated CACM version) in ASCII art here. Hope that helps answering your question. :)

Abstract Syntax

Value Types

t ::= ... | exnref

Exception Types

et ::= t*

Instructions

e ::= ... | throw x | rethrow | try ft e* catch e* end | br_on_exn l x

Exceptions

exn ::= ex* exception et | ex* exception et im

Modules

mod ::= module ... exn*

Typing

Contexts

C ::= {..., exn et*}

Rules

C_exn(x) = t*
----------------------------
C |- throw x : t1* t* -> t2*

--------------------------------
C |- rethrow : t1* exnref -> t2*

ft = t1* -> t2*
C, label t2* |- e1* : t1* -> t2*
C, label t2* |- e2* : exnref -> t2*
-----------------------------------
C |- try ft e1* catch e2* end : ft

C_label(l) = C_exn(x)
-------------------------------------
C |- br_on_exn l x : exnref -> exnref

Reduction

Stores

S ::= {..., exn ei*}

Exception Instances

ei ::= {type et}

Module Instances

m ::= {..., exn a*}

Values

v ::= ... | exn a v*

Administrative Instructions

e ::= ... | throw a | catch_n{e*} e* end | exn a v*

Throw Contexts

T ::= v* [_] e* | label_n{e*} T end | catch_n{e*} T end | frame_n{F} T end

Rules

F; throw x  -->  F; throw a
  (iff F_exn(x) = a)

v^n (try ft e1* catch e2* end)  -->  catch_m{e2*} (label_m{} v^n e1* end) end)
  (iff ft = t1^n -> t2^m)

catch_m{e*} v^m end  -->  v^m

S; F; catch_m{e*} T[v^n (throw a)] end  -->  S; F; label_m{} (exn a v^n) e* end
  (iff S_exn(a) = {typ t^n})

(exn a v*) rethrow  -->  v* (throw a)

ref.null rethrow  -->  trap

F; (exn a v*) (br_on_exn l x)  -->  F; v* (br l)
  (iff F_exn(x) = a)

F; (exn a v*) (br_on_exn l x)  -->  F; (exn a v*)
  (iff F_exn(x) =/= a)

F; ref.null (br_on_exn l x)  -->  F; trap

sys5867 commented 5 years ago

Thank you for the response! I have additional questions about the semantics provided.

In the proposal, br_on_exn pops the exnref values from the stack and instead pushes the exception's argument values on top of the stack, if given exception index matches. The typing rule for br_on_exn then should assume exnref value on the top, and result in t* values afterwards. In other words, the typing rule should be,

C_label(l) = C_exn(x) = t*
-------------------------------------------
C |- br_on_exn l x : exnref -> t*

Also in the reduction rule, exception values are represented as exn a v*. Reduction rules for rethrow and br_on_exn assumes that exception arguments v* resides in operand stack. In implementation, where exception argument values are actually stored? From my understanding, only exnref value stays at the stack and exception arguments are stored in linear memory space, where the exnref value points to. Is this correct interpretation, or am I missing something?

rossberg commented 5 years ago

No, the typing rule for br_on_exn is correct as given. Keep in mind that the values are only pushed onto the stack if the branch is taken, while the typing rule describes the stack manipulation of linear execution.

Exnref is an abstract (reference) type. It is an implementation detail how and where its argument values are stored. It is separate from any linear memory.

rossberg commented 4 years ago

@sys5867, sorry, as was just pointed out to me by @ioannad, there indeed was a bug in the typing rules for br_on_exn: the t* in the conclusion should not be there, the overall type should just be exnref -> exnref. Fixed now.

ioannad commented 4 years ago

@rossberg , with the addition of throw contexts, invoke might end up in a state where it neither succeeded with values nor has trapped, but is some throw context with a throw waiting for its catching try block. The simplest example is when invoking the function wasm_throw defined below:

(module  
   (type (func))
   (exception (export "exn") (type 0))
   (func (export "wasm_throw") (type 0)
      throw 0))

Invoking wasm_throw gets "stuck" in the state store; frame_1{F'} label_1 {} (throw tag) end end, where tag is the event address of exn, and F' the frame with the above module's instance.

I suppose this is the case meant by "the embedder defines how to handle the uncaught exception". But in this case it is not true any more that "Reduction terminates when a thread’s instruction sequence has been reduced to a result, that is, either a sequence of values or to a trap." (e.g., in the definition of evaluation contexts).

Should some other form of "result" (which is not a sequence of values or a trap) be defined to cover such cases of unresolved throw contexts?

rossberg commented 4 years ago

@ioannad, yes, correct: with exceptions we will naturally need to extend the notion of result to include throw a. And the rules for reducing evaluation contexts will also need a couple of extra rules to get there.

Basically, it all needs to mirror what we have for traps. Except that there also is the try rule to intercept a throw.

ioannad commented 4 years ago

EDIT: A reworked version of this formal spec overview is now added as a file with PR #143 .

I leave the text below for reference, but please note this spec below is not correct.

3rd proposal formal spec

This is an initial attempt to clarify the additions to formal spec in the new EH proposal (as of September 18, 2020).

I wrote this "informal formal spec" roughly in the style of the previous informal formal spec above.

The abstract syntax was given by @aheejin. This is my understanding of the typing and reduction steps for the new (old) instructions, mainly based on the the 1st proposal and several discussions in issues.

In this I assume rethrow has no immediate, as it is the current stand. This could be easily modified, whichever way we decide to go with rethrow.

I have a couple of questions inline below, but I opened separate issues for those, to keep this issue about fixing the formal spec.

Abstract Syntax

Exception Types

exntype ::= [t*] -> []

Instructions

instr ::= ... | throw x | rethrow
        | try bt instr* (catch exnidx instr*)+ end
        | try bt instr* (catch exnidx instr*)* catch_all instr* end
        | try bt instr* catch_br l
        | try bt instr* unwind instr* end

Exceptions (definitions)

exn ::= export* exception exntype  | export* exception exntype import

Modules

mod ::= module ... exn*

Typing

To verify that a try...catch_br l instruction refers to a label introduced by a try-catch or try-catch_br block (I'll call this a try-catch label), this introduces a type attribute to labels in the validation context, which attribute is set to try_catch when the label is a try-catch label and empty otherwise. The original notation label [t*] will then be a shortcut for label {result [t*], type ε}. Labels introduced by try-unwind blocks do not get marked with the try_catch attribute.

To validate rethrow which can only appear inside a catch block, introduce a new list of currently caught exceptions, denoted by caught. I think of caught as a mark that we are inside a catch block.

Contexts

C ::= {..., (exn exntype)*, caught*}

Rules

C_exn(x) = [t*] -> []
--------------------------------
C |- throw x : [t1* t*] -> [t2*]

C_caught =/= ε
--------------------------------
C |- rethrow : [t1*] -> [t2*]

bt = [t1*] -> [t2*]
C, label {result [t2*], type try_catch} |- instr* : [t1*] -> [t2*]
(C, caught, label [t2*] |- instr_i* : [] -> [t2*])^n
(C, caught, label [t2*] |- instr'* : [] -> [t2*])?
------------------------------------------------------------------
try bt instr* (catch x_i instr_i*)^n (catch_all instr'*)? end : bt

bt = [t1*] -> [t2*]
C, label {result [t2*], type try_catch} |- instr* : [t1*] -> [t2*]
C_label(l).type = try
------------------------------------------------------------------
try bt instr* catch_br l : bt

bt = [t1*] -> [t2*]
C, label [t2*] |- instr_1* : [t1*] -> [t2*]
C, label [] |- instr_2* : [] -> []
-------------------------------------------
try bt instr_1* unwind instr_2* : bt

The last rule above could be modified to allow the unwind block to return values, for example a default value that should be returned when the instructions of the try block throw an exception. This seems unneccessary because unwind probably just concerns side effects (open/close a file, alter a mutable global, etc). (Issue #129)

Reduction

In order to describe the interaction of try...catch_br l with try...unwind...end, an administrative instruction caught_m is added, to help identify the destination of a throw (when/if found) before performing the stack unwinding. So caught_m marks a suspended state, where a throw destination is found, and the unwinding is about to happen, but as we unwind we must make sure we perform any cleanup forms when exiting administrative unwind frames.

To make sure we don't execute any instructions that appear after the throw, a further administrative instruction throw_point is added alongside with caught_m.

These new administrative instructions caught_m and throw_point are not necessary if catch_br l does not trigger any unwind-cleanup code between itself and its destination label l. (Issue #130)

Moreover, apart from throw contexts I added "protected contexts", to talk about the case when an exception is thrown and caught outside a try ... unwind instruction surrounding the throw.

Throw contexts now include catch_br{l} T end to distinguish protected contexts by their use of unwind.

Stores

S ::= {..., exn exninst*}

Exception Instances

exninst ::= {type exntype}

Module Instances

m ::= {..., (exn a)*}

Administrative Instructions

instr ::= ... | throw a | catch_n{a instr*}*{instr*} instr* end
        | catch_br{l} instr* end | unwind{instr*} instr* end
        | caught_m{a val^n}{instr*} instr* end | throw_point

Protected Contexts

P ::= val* [_] instr* | label_n{instr*} P end | unwind{instr*} P end |
    | catch_br{l} P end | frame_n{F} P end

Throw Contexts

T ::= ... | catch_br{l} T end

Rules

F; throw x  -->  F; throw a  (iff F_exn(x) = a)

F; caught_m{a val^n}{instr_1* rethrow instr_2*]} T[throw_point] end
  --> F; instr_1* val^n (throw a)

val^n (try bt instr* (catch x_i instr_i*)* (catch_all instr'*)^k end)
  -->  catch_m{a_i instr_i*}*{instr''*} (label_m{} val^n instr* end) end
  (iff bt = [t1^n] -> [t2^m] and (if k=1 and instr'*=ε then instr''* = nop)
  and (if k=1 and inst'*=/=ε then instr''* = instr'*) and (if k=0 then instr''*=ε).

S; F; catch_m{a_i instr_i*}*{instr'*} P[val^n (throw a)] end
  -->  S; F; caught_m{a val^n}{val^n instr_i*} P[throw_point] end
  (iff S_exn(a) = {type [t^n]->[]} and a_i = a)

S; F; catch_m{a_i instr_i*}*{instr*} P[val^n (throw a)] end
  -->  S; F; caught_m{a val^n}{instr*} P[throw_point] end
  (iff S_exn(a) = {type [t^n]->[]} and instr =/= ε and for every i, a_i =/= a)

val^n (try bt instr* catch_br) --> catch_br{l} (label_m{} val^n instr* end) end
  (iff bt = [t^n] -> [t^m])

catch_m{a_i instr_i*}*{instr*} (label_m{} B^l[ catch_br{l} (P[val^n (throw a)]) end] end) end
  --> caught_m{a val^n}{instr'*} (label_m{} B^l[ P[throw_point] ] end) end

val^n (try bt instr* unwind instr'* end) --> unwind{instr'*} (label_m {} val^n instr* end) end
  (iff bt = [t^n] -> [t^m])

F; caught_m{a val^n}{instr*} ( P[unwind{instr'*} T[throw_point] end]) end
  --> caught_m{a val^n}{instr*} P[instr'* throw_point] end

F; caught_m{a val^n}{instr*} T[throw_point] end --> label_m{} instr* end

F; caught_m{a val^n}{instr*} val^m end  -->  val^m

I would be happy if anyone could point out things that are wrong or that I misunderstood. (cc. @aheejin)

RossTate commented 4 years ago

I can't tell what the semantics of rethrow is when it occurs inside a try inside a catch. Based on the implications of the discussion in #126, try (throw $exn) catch $exn (try rethrow catch $exn instr* end) should reduce to throw $exn rather than instr*. That is, the rethrown exception is rethrown from the place where it was caught rather than from the location of the rethrow instruction. This semantics would also address forwards-compatibility concerns.

rossberg commented 4 years ago

Good stuff. It's great to have a semantics written down and see the open questions crystalise more clearly.

It is unfortunate that we need ad-hoc flags and markers in the typing rules to deal with the lack of composability in this version of the proposal, but I see no way around it. The typing rules seem right. As for the type of the unwind body, I think in principle we could allow a polymorphic result type, since it essentially ends in a branch, but that seems useless and would just complicate the operational semantics.

I'm a bit more confused about the operational semantics. Can you explain the role of the new P contexts and their distinction from T? I am not sure why catch_br appears in both -- or, in fact, in either of them. For example, the reduction rules for throw would seem to allow ignoring intermediate catch_br, such that

(try $outer
  (try $inner
    (try
      (throw $e)
      (catch_br $outer)
    )
    (catch ...)  (1)
  )
  (catch ...)  (2)
)

would go to (1), not (2) as it should.

F; caught_m{a val^n}{instr_1* rethrow instr_2*]} T[throw_point] end
  --> F; instr_1* val^n (throw a)

Hm, this looks odd. I believe it should be something like

F; caught_m{a val^n}{instr*} T[rethrow] end
  --> F; caught_m{a val^n}{instr*} T[val^n (throw a)] end

val^n (try bt instr* (catch x_i instr_i*)* (catch_all instr'*)^k end)
  -->  catch_m{a_i instr_i*}*{instr''*} (label_m{} val^n instr* end) end
  (iff bt = [t1^n] -> [t2^m] and (if k=1 and instr'*=ε then instr''* = nop)
  and (if k=1 and inst'*=/=ε then instr''* = instr'*) and (if k=0 then instr''*=ε).

Can you explain the need for the complex side condition on this one? I would have expected something like

val^n (try bt instr* (catch x_i instr_i*)* (catch_all instr'*)^? end)
-->  catch_m{a_i instr_i*}*{instr'*}^? (label_m{} val^n instr* end) end
(iff bt = [t1^n] -> [t2^m])

catch_m{a_i instr_i*}*{instr*} (label_m{} B^l[ catch_br{l} (P[val^n (throw a)]) end] end) end
  --> caught_m{a val^n}{instr'*} (label_m{} B^l[ P[throw_point] ] end) end

Something is missing here, I think: the outer catch first needs to check if it actually handles the exception. The instr'* on the rhs is coming out of nowhere, but should be the matched handler. Moreover, a catch_br could also target another catch_br, according to the typing rules, so something inductive is required.

There also does not seem to be a rule for the case where a catch does not handle a thrown exception, and it instead has to find an outer handler.

I suspect this all will get extremely complicated (or even impossible) with the current approach. The only way I can think of to keep the rules tractable is by constructing the continuation incrementally. That could be done by reifying it as an evaluation context in the administrative throw instruction that gets extended. For example,

instr ::= ... | throw{E} a

F; throw $x  -->  F; throw{[]} a  (iff F_exn(x) = a)

catch_m{a_i instr_i*}*{instr*} (label_m{} B^l[ catch_br{l} T[val^n (throw{E} a)] end] end) end
  --> val^n (throw{caught_m{a val^n}{instr'*} (label_m{} B^l[ T[E] ] end) end} a)

S; F; catch_m{a_i instr_i*}*{instr'*} T[val^n (throw{E} a)] end
  -->  S; F; caught_m{a val^n}{val^n instr_i*} T[E[throw_point]] end
  (iff S_exn(a) = {type [t^n]->[]} and a_i = a)

and so on.

But in fact, if I read the answer to #130 correctly (I'm not sure I fully understand what the intended interaction is), then this could be simplified by not gathering contexts in the throw, but merely the sequence of unwind instructions to execute once the handler is found. That would also allow to get rid of the throw_point:

instr ::= ... | throw_n{instr*} a

S; F; val^n (throw $x)  -->  F; throw_n{} a  (iff F_exn(x) = a and S_exn(a) = {type [t^n]->[]})

unwind{instr'*} T[val^n (throw_n{instr*} a)] end
  --> val^n (throw_n{instr* instr'*} a)

catch_m{a instr*}*{instr'*}? (label_m{} B^l[ catch_br{l} T[val^n (throw_n{instr''*} a')] end] end) end
  --> catch_m{a instr*}*{instr'*}? val^n (throw_n{instr''*} a') end

catch_m{a instr*}*{instr'*} val^n (throw_n{instr''*} a') end
  --> instr''* (caught_m{a val^n} (instr*)*[i] end)
  (iff a* = a^i a' a*)

This looks simpler and would clearly encode the intention that the unwind phase follows the catches.

However, I'm not sure what is supposed to happen if an unwind block itself throws... Should the handler still be active, although it already caught? In that case, some additional wrapping around instr''* would be required in the last rule, and it may get complicated again.

F; caught_m{a val^n}{instr*} ( P[unwind{instr'*} T[throw_point] end]) end
  --> caught_m{a val^n}{instr*} P[instr'* throw_point] end
I believe the enclosing caught_m and P is redundant in this rule.

ioannad commented 4 years ago

Thank you for the feedback, @rossberg and @RossTate. Sorry for the delay, I was tied up until today (and btw had to miss yesterday's SG meeting as well).

@RossTate my understanding of rethrow is that it creates a new exception, that's why I specified that the rethrown exception is rethrown from the place of the rethrow instruction. Am I misunderstanding something?

@rossberg thank you so much for the detailed review! Some quick remarks before I start reworking the formal spec to incorporate your suggestions and comments.

Can you explain the role of the new P contexts and their distinction from T?

My intention was to distinguish the case when a unwind block is involved. Your suggestions and the conclusion of #130 might make these unnecessary though.

the reduction rules for throw would seem to allow ignoring intermediate catch_br, such that
(try $outer
 (try $inner
   (try
     (throw $e)
     (catch_br $outer)
   )
   (catch ...)  (1)
 )
 (catch ...)  (2)
)
would go to (1), not (2) as it should.

My understanding is that any instruction after a throw is dead code, is that wrong?

I really like your suggestion of the new administrative throw{E} a, this might simplify things a lot. I'll work through the semantics and update my comment when I'm ready with it.

val^n (try bt instr* (catch x_i instr_i*)* (catch_all instr'*)^k end)
  -->  catch_m{a_i instr_i*}*{instr''*} (label_m{} val^n instr* end) end
  (iff bt = [t1^n] -> [t2^m] and (if k=1 and instr'*=ε then instr''* = nop)
 and (if k=1 and inst'*=/=ε then instr''* = instr'*) and (if k=0 then instr''*=ε).

Can you explain the need for the complex side condition on this one?

This was an attempt to distinguish the case when there is no catch_all, wrt the rule

S; F; catch_m{a_i instr_i*}*{instr*} P[val^n (throw a)] end
  -->  S; F; caught_m{a val^n}{instr*} P[throw_point] end
  (iff S_exn(a) = {type [t^n]->[]} and instr =/= ε and for every i, a_i =/= a)

About throw_point, I think the conclusion of #130 makes it unnecessary anyway. I might be able to explain this better when I reworked the rules with your suggestions and my new understanding of the operational semantics.

rossberg commented 4 years ago

the reduction rules for throw would seem to allow ignoring intermediate catch_br, such that
(try $outer
 (try $inner
   (try
     (throw $e)
     (catch_br $outer)
   )
   (catch ...)  (1)
 )
 (catch ...)  (2)
)
would go to (1), not (2) as it should.
My understanding is that any instruction after a throw is dead code, is that wrong?

No, but the catch_br here is part of the try, not an instruction after throw. (Ah, the text format actually requires writing this as (try (do (throw $e)) (catch ...)), which makes that clearer.)

RossTate commented 4 years ago

@RossTate my understanding of rethrow is that it creates a new exception, that's why I specified that the rethrown exception is rethrown from the place of the rethrow instruction. Am I misunderstanding something?

@ioannad But in what context/continuation is that exception thrown? The answer affects how this example behaves: try (throw $exn) catch $exn (try rethrow catch $exn instr* end) end. The reasoning in #126 for removing the index only works if the continuation for the rethrown exception is whatever the continuation of the outer try/catch is rather than the continuation of the rethrow instructions itself.

ioannad commented 4 years ago

No, but the catch_br here is part of the try, not an instruction after throw. (Ah, the text format actually requires writing this as (try (do (throw $e)) (catch ...)), which makes that clearer.)

@rossberg oh, right, of course! I guess the notation and indentation threw me off.

But in what context/continuation is that exception thrown? The answer affects how this example behaves: try (throw $exn) catch $exn (try rethrow catch $exn instr* end) end. The reasoning in #126 for removing the index only works if the continuation for the rethrown exception is whatever the continuation of the outer try/catch is rather than the continuation of the rethrow instructions itself.

@RossTate I'm not sure I understand what reasoning you mean. I think the operational semantics I gave for rethrow are equivalent to the ones of the 2nd proposal. Is this not correct? I'm reworking the reduction steps anyway, but since rethrow is an already implemented instruction, I am reluctant to change its operational semantics, at least until #126 and #127 are decided.

rossberg commented 4 years ago

Ah yes, in the presence of two-phase unwind (which effectively is a limited form of resumption), rethrow has to be more explicit about maintaining the original throw point. I think that semantics can only be modelled correctly if the throw context or unwind instructions is reified in the exception like suggested above, and also remembered in the caught form. It's similar to how the semantics for effect handlers works, except that the effect value is second-class.

I think something like

F; caught_m{a val^n}{E}{instr*} T[rethrow] end
  --> F; val^n (throw{E} a)

or similarly with instr* instead of E. (Quick answer for lack of time, might not be exactly right, but you probably see the idea.)

This is mostly unrelated to crippling the scoping capabilities of rethrow, though.

RossTate commented 4 years ago

@rossberg It sounds like you are saying my example, try (throw $exn) catch $exn (try rethrow catch $exn instr* end) end, should reduce to throw $exn. Can you confirm?

rossberg commented 4 years ago

I don't think one reduces to the other, but both should be observably equivalent AFAICS.

RossTate commented 4 years ago

Okay. That semantics is fine with me. (And it could equivalently be expressed by having end for catch/catch_all always rethrow the exception and replacing rethrow with a branch to the end.)

ioannad commented 4 years ago

With the syntax and semantics of the 2nd proposal I think we have the following.

For bt = [] -> [t^n], bt' = [exnref] -> [t^n], F_exn(x)=a, and instr*:[exnref] -> [t^n] we have the reductions:

(try bt (do (throw x)) (catch (try (do (rethrow)) (catch instr*))))
↪
catch_n{(try (do (rethrow)) (catch instr*))} ( label_n{} throw a end ) end

For T = label_n{} [_] end the catch_n above catches the thrown a and the above reduces to:

label_n{} (ref.exn a) (try bt' (do (rethrow)) (catch instr*)) end
↪
label_n{} ( catch_n{instr*} ( label_n{} (ref.exn a) (rethrow) end ) end ) end
↪
label_n{} ( catch_n{instr*} ( label_n{} (throw a) end ) end ) end

Which, for the same Tas above, catch_n catches the throw and reduces to label_n{} ( label_n{} (ref.exn a) instr* end ) end.

The latter translated to the 3rd proposal (and ignoring the surrounding labels), should be just instr*.

So if the example

try (throw $exn) catch $exn (try rethrow catch $exn instr* end) end

should reduce to or be observably equivalent to throw $exn instead of instr*, then rethrow has different operational semantics now as it did in the second proposal.

Are the reductions above correct? Are the semantics of rethrow that much different now as they were in the 2nd proposal?

rossberg commented 4 years ago

@ioannad, you are right, I was confused. The inner catch around the rethrow should handle it. Everything else doesn't make sense, neither in terms of (resumable) exception handling, nor more generally in terms of effect handlers.

An even more interesting case is unwind:

try (throw $exn) catch $exn (try ...rethrow... unwind instr* end) end

Any coherent semantics would require that this adds instr* to the unwind stack. But I'm not sure how exactly that would be implemented. But then, I generally don't understand how the current proposal is implementable without fragmented stack frames, so something has to give. (Edit: Oops, that was missing some context. Please assume "...in the presence of two phase EH" wrt the preceding two sentences.)

RossTate commented 4 years ago

@rossberg It sounds like you know of a paper with such a construct for effect handlers. I'd be really interested to look at it if you point me to it. (Also, resumable exceptions are incompatible with this given that catch/catch_all code is generated with the expectation that all unwinders have been run.)

If we're going with this semantics, then we should revisit #126, as one of the key arguments made there is not valid with this semantics.

I'll note that none of this complexity is necessary for the pressing use cases at hand, all of which are addressed by having end for catch/catch_all rethrow the exception. Then the choice about indexing rethrow is moot, the spec is much simpler, implementations are probably easier, and we avoid introducing control constructs we do not really understand the implications of.

aheejin commented 4 years ago

@rossberg

An even more interesting case is unwind:
try (throw $exn) catch $exn (try ...rethrow... unwind instr* end) end
Any coherent semantics would require that this adds instr* to the unwind stack. But I'm not sure how exactly that would be implemented. But then, I generally don't understand how the current proposal is implementable without fragmented stack frames, so something has to give. (Edit: Oops, that was missing some context. Please assume "...in the presence of two phase EH" wrt the preceding two sentences.)

Do you mean what if instr* within unwind again throws or rethrows, in the presence of two-phase unwinding? (And I guess your argument still applies when it's not a rethrow but a throw, right?)

If that's what you mean, I think we should either prevent control flow within unwind blocks, or end the unwinding process (the second phase) once we encounter a control flow instruction that escapes the unwind block. This includes br/br_if that target outside unwind, return, throw and rethrow. I don't think we should maintain a fragmented stack. Would this semantics solve the problem?

aheejin commented 4 years ago

@RossTate

If we're going with this semantics, then we should revisit #126, as one of the key arguments made there is not valid with this semantics.

I'll note that none of this complexity is necessary for the pressing use cases at hand, all of which are addressed by having end for catch/catch_all rethrow the exception. Then the choice about indexing rethrow is moot, the spec is much simpler, implementations are probably easier, and we avoid introducing control constructs we do not really understand the implications of.

I believe what @rossberg was talking about is not a problem specific to rethrow. throw has the same issue too. And I think we can solve it by stopping the unwind process. That's what you've been suggesting we should do for br and return too. Not sure why throw or rethrow can't do the same thing.

Also I already posted a number of comments on why it's extremely hard to remove rethrow, including but not limited to: https://github.com/WebAssembly/exception-handling/issues/127#issuecomment-694686080 https://github.com/WebAssembly/exception-handling/issues/127#issuecomment-695081335 https://github.com/WebAssembly/exception-handling/issues/127#issuecomment-695750173 But I really hope we don't derail this issue with removing rethrow thing again, and keep discussing about the instruction semantics.

RossTate commented 4 years ago

The suggestion I made does not remove the rethrow functionality; it offers a different way of providing that functionality that still serves all the use cases you referenced while also addressing a number of other issues. It simplifies the spec for @ioannad; eliminates the issue @rossberg raised here; it addresses my own concerns; and it makes the design of catch more consistent with that of unwind (since the latter rethrows the exception at the end).

As for the discussion in #126, in that discussion we did not yet have a formal semantics for rethrow and so the discussion operated based on your comments. One key comment you made was the top half of this one, from which point much of the discussion proceeded with the understanding that an indexed rethrow could be implemented by a branch and unindexed rethrow. That property does not hold for the semantics @ioannad has developed, and so much of the discussion in #126 was misinformed and needs to be redone. However, this property does hold for the semantics I gave about and for the rethrow-at-end suggestion I made, which means my suggestion avoids the need to rediscuss #126.

I know y'all need to get this shipped quickly. I believe my suggestion should help with that by simplifying the proposal while still serving the established use cases.

aheejin commented 4 years ago

@RossTate

Not sure what you are talking about "the semantics @ioannad has developed".

try (throw $exn) catch $exn (try rethrow catch $exn instr* end) end

should be instr*, and I think that's what @ioannad agreed, no? (She asked it becomes different from 2nd proposal if the result of that is throw instr*, and @rossberg confirmed it is indeed instr*, and the semantics is not different from the 2nd proposal.)

Also, I don't know why all discussions we are having here can be the reason to remove rethrow.

I don't think making end to catch rethrow the exception is the way to solve that problem. Based on what you're saying, then either end or rethrow is just matter of syntax. And it is different from the case of unwind. There, end will become resuming, not rethrowing, after two-phase unwinding arrives. But end to catch can't be resuming. And almost all use cases of unwind is to resume at the end, while all catchs use cases are fallthrough at the end. If we make end of catch a rethrow, We are going to insert a branch to every single catch to bypass it. Not sure why we should do something like that.

And as I said, this is not an issue for rethrow removal in the first place. I asked you the same thing in several different places already, but I don't think I was heard. You are effectively turning all issues into your issues to discuss your agendas. I really hope we can make each issue kept to its topic.

RossTate commented 4 years ago

Based on what you're saying, then either end or rethrow is just matter of syntax.

That is not what I am saying. I thought that was your intent, and so my first post here was trying to make sure that the semantics for rethrow matched that intent. However, the semantics you requested here does not have that property. As such, we should probably revisit #126, as under the semantics here it is not the case that an indexed rethrow can be implemented with a branch and then unindexed rethrow, so we should probably make rethrow indexed.

aheejin commented 4 years ago

@RossTate

I thought that was your intent, and so my first post here was trying to make sure that the semantics for rethrow matched that intent. However, the semantics you requested here does not have that property. As such, we should probably revisit #126, as under the semantics here it is not the case that an indexed rethrow can be implemented with a branch and then unindexed rethrow, so we should probably make rethrow indexed.

I don't have any problem making rethrow indexed, and that's what I was suggesting in #126 in the first place. But what you said in this post was that we should remove rethrow and make end at the end of catch perform something like rethrow.

And I'm not sure why you think the semantics here is different so rethrow cannot be implemented with a branch and an unindexed rethrow anyway. Can you clarify?

RossTate commented 4 years ago

Sure thing! Here's an example of an indexed rethrow that cannot be implemented with only an unindexed rethrow:

try
  ...
catch_all $outer_catch
  try
    ...
  catch_all $inner_catch
    try
      rethrow $outer_catch
    catch $exn
      rethrow $inner_catch
    end
  end
end

tlively commented 4 years ago

Just to check my understanding, if both of those ...s (call them instrOuter* and instrInner*) throw, then this whole snippet ends up propagating the inner exception if the outer exception has tag $exn and otherwise ends up propagating the outer exception? With some code duplication of instrInner*, I believe this has the same semantics:

try $outer
  instrOuter*
catch $exn
  instrInner*
catch_all
  try $inner
    instrInner*
  catch_all
    br $inner
  end
  rethrow
end

But yeah, having indexed rethrow certainly makes this simpler.

RossTate commented 4 years ago

Heheh, I originally had a more complex example to make that rewriting less practical 😄 And I wonder if it's always possible (with some code duplication, as in your example, not with just a branch) given the current control constructs we have. But with extensions like two-phase exception handling, if we provide a try with a phase-one block (in addition to the current phase-two block), then you could put an (indexed) rethrow in the phase-one block. Then there would be no way to know the contents of the stack in order to do this rewrite. (The stack might even contain frames from other applications.)

rossberg commented 4 years ago

@rossberg
An even more interesting case is unwind:
try (throw $exn) catch $exn (try ...rethrow... unwind instr* end) end
Any coherent semantics would require that this adds instr* to the unwind stack. But I'm not sure how exactly that would be implemented. But then, I generally don't understand how the current proposal is implementable without fragmented stack frames, so something has to give. (Edit: Oops, that was missing some context. Please assume "...in the presence of two phase EH" wrt the preceding two sentences.)
Do you mean what if instr* within unwind again throws or rethrows, in the presence of two-phase unwinding? (And I guess your argument still applies when it's not a rethrow but a throw, right?)

What I wondered is how the unwind phase will know how to take the same path if it's executed in a separate phase. I really don't understand the vision here, or what separating the unwind instruction attempts to achieve.

If that's what you mean, I think we should either prevent control flow within unwind blocks, or end the unwinding process (the second phase) once we encounter a control flow instruction that escapes the unwind block. This includes br/br_if that target outside unwind, return, throw and rethrow. I don't think we should maintain a fragmented stack. Would this semantics solve the problem?

My wider concern is: it seems that the goal for two-phase unwind is that it can work with a single stack, and there seems to be some assumption that that merely requires executing catch blocks in lower stack frames before unwinding the stack. But I have no idea how this can fly in many engines, since they often assume that they can push something onto the stack within the current function -- e.g., inner try handlers, temporary spills, function arguments, etc. That won't be possible in that scenario, so stack fragmentation seems inevitable. It's not enough to just prevent control flow. Nor is that an option, I think (you certainly cannot prevent some random callee from throwing).

aheejin commented 4 years ago

But I have no idea how this can fly in many engines, since they often assume that they can push something onto the stack > within the current function -- e.g., inner try handlers, temporary spills, function arguments, etc. That won't be possible in that scenario, so stack fragmentation seems inevitable. It's not enough to just prevent control flow. Nor is that an option, I think (you certainly cannot prevent some random callee from throwing).

@rossberg @ioannad Do Java or Lisp also use two-phase unwinding for exceptions? If so, how do they handle these in their finally or unwind-protect? I'm searching for it but it's hard to find. Also, is unwind-protect the same as finally? By looking at this it sounds like finally, but I'm not familiar with Lisp.

rossberg commented 4 years ago

Java and the JVM do not have two-phase unwind. Even if an engine implements it that way internally, it is not programmatically observable and no user code can be executed as part of the first phase. Same as in C++. So no such problems arise. That's very different from what seems to be envisioned here, and why I find that pretty scary.

The only runtime I know that has a mechanism like that would be the CLI (.NET) with its filter blocks. But the CLI is a super-complicated beast, and happily requires an implementation that is an order of magnitude more complex than Wasm.

Lisp is difficult to compare, since it often uses widely different implementation techniques that heavily lean on closures and GC, e.g. CPS. (Yes, unwind-protect is more or less finally.)

phoe commented 4 years ago

@aheejin The Common Lisp condition system has only one operator for executing cleanup forms, named unwind-protect - it is equivalent to Java's finally.

However, the Common Lisp condition system works on a completely different basis than e.g. Java exception handling, with the most important difference being: when a Common Lisp condition (this includes errors) is signaled, the stack is wound further instead of being unwound. Only then a transfer of control may (or, in case of error conditions, must) happen.

Under the hood, the Common Lisp condition system is implemented in Common Lisp itself (see example here) on top of its primitive control-flow operators - tagbody/go, block/return-from, and catch/throw.

go is the "goto" operator that jumps to a label defined in a tagbody;
return-from returns a value from a matching block in lexical scope;
throw returns a value from a matching catch in dynamic scope.

All of those operators (just like normal returns) must respect all unwind-protect blocks that they unwind through.

For more detailed information about the CL standard, we could consult CLHS 5.2 Transfer of Control to an Exit Point. Let's copy some text from there:

When a transfer of control is initiated by go, return-from, or throw the following events occur in order to accomplish the transfer of control. Note that for go, the exit point is the form within the tagbody that is being executed at the time the go is performed; for return-from, the exit point is the corresponding block form; and for throw, the exit point is the corresponding catch form.

Intervening exit points are "abandoned" (i.e., their extent ends and it is no longer valid to attempt to transfer control through them).

The cleanup clauses of any intervening unwind-protect clauses are evaluated.

Intervening dynamic bindings of special variables, catch tags, condition handlers, and restarts are undone.

The extent of the exit point being invoked ends, and control is passed to the target.

As long as we have dynamic/fluid variables (which, in the worst case, we can implement ourselves on top of lexical variables and a primitive unwind operator), we can resolve point 3 on top of WASM.
We do not need to care about point 1, since it is undefined behavior in Common Lisp, and so the burden is on the programmer.
This leaves point 2, which is required to perform nested cleanup forms (think nested unwind-protect, which is very common in practice) and point 4, which is the primitive jump operator that we've already used in point 3.

If you need any further Lisp-related help, I'll be available to answer. (Shameless plug: I've described this mechanism in detail in an upcoming book, The Common Lisp Condition System, that will be released soon; I hope that my knowledge in the matter is of help in this issue.)

aheejin commented 4 years ago

@rossberg My understanding and sketchy plans for future two-phase unwinding are mostly centered on C++-like approach. C++ does not let users to insert user-custom filters, but filters functions are inserted from the library side. C++'s filter functions have to return one of three results: 1. This exception should be caught here 2. This exception should not be caught here 3. This is a cleanup code. I think catch instruction can be extended to take an additional filter function like that.

try-unwind was taken from @RossTate's ideas, and I thought it as a separation of case 3 from the filter function case, so future filter functions for catch now only need to return either 1 or 2 (my exception or not my exception).

If you think try-unwind is risky, do you think using catch for cleanup code as we have done so far is safer? If so, why? Or do you happen to have another recommendation on constructs for handling cleanup code in future two-phase unwinding other than catch or unwind?

ioannad commented 4 years ago

@aheejin, I am not familiar with Java's finally, so I showed your question to the #lisp freenode channel and asked for opinions on Java's finally vs common-lisp's unwind-protect. (Thank you @phoe for the detailed explanation!) This is the reason I wrote the execution step for try-unwind as I did originally (changes still pending), although for it to really be a proper primitive for unwind-protect (or finally) it should also run the cleanup forms when its try-code exits via br, return, or trap.

But as you said, you did not intend try-unwind as a primitive for finally, although I think @RossTate probably did.

There is another possible misunderstanding I see in the comments though. The "cleanup forms" as I understood them (and as they are meant in common-lisp for example), are standard language instructions, not destructors in the implementation language. This also confused me when I asked the destructors question on your PR #137. I find it surprising to describe a language using constructs of another language. Has this been done in WebAssembly before?

phoe commented 4 years ago

The "cleanup forms" as I understood them (and as they are meant in common-lisp for example), are standard language instructions, not destructors in the implementation language.

To the best of my knowledge, C++ destructors are normal methods, as in, they contain standard language instructions; this is the same as Java's finally blocks or cleanup forms in Lisp's unwind-protect.

If I understand correctly, the only difference in semantics is that destructors need to track if they are being executed in a normal control flow (where throwing an exception invokes normal handling mechanisms) or in an abnormal control flow (where throwing an exception must std::terminate the whole program). This is a detail specific to C++ and does not apply to implementing Common Lisp.

ioannad commented 4 years ago

@phoe, thanks for the clarification wrt C++ exceptions.

What I meant is that I find surprising to specify the Wasm instruction try-unwind as containing C++ language instructions (destructors).

rossberg commented 4 years ago

@rossberg My understanding and sketchy plans for future two-phase unwinding are mostly centered on C++-like approach. C++ does not let users to insert user-custom filters, but filters functions are inserted from the library side. C++'s filter functions have to return one of three results: 1. This exception should be caught here 2. This exception should not be caught here 3. This is a cleanup code. I think catch instruction can be extended to take an additional filter function like that.

Okay, thanks for the clarification. Yes, I can see how filters could work when specified as separate functions, along those lines we discussed some time ago.

try-unwind was taken from @RossTate's ideas, and I thought it as a separation of case 3 from the filter function case, so future filter functions for catch now only need to return either 1 or 2 (my exception or not my exception).

If you think try-unwind is risky, do you think using catch for cleanup code as we have done so far is safer? If so, why? Or do you happen to have another recommendation on constructs for handling cleanup code in future two-phase unwinding other than catch or unwind?

My take on try-unwind as described right now (please correct me if I'm wrong) is the following:

(try A unwind B) is equivalent to (try A catch_all B rethrow) (and probably implemented like that).
If this equivalence holds now, then we cannot break it later.
Consequently, try-unwind is operationally redundant right now, and will remain so in the future.
Consequently, it can neither be useful nor required for later extensions.

Adding filters to catch won't change the above equivalence with catch_all. And vice versa, the absence or presence of a separate try-unwind construct implies nothing for catch filters. In particular, they ought to return a Boolean regardless. Concretely, I would expect that filters would simply be introduced as a new, third form of catch clause, catch_when or something.

My guess is that I am missing some external benefit of distinguishing try-unwind. In tooling, debuggers perhaps? But does that need to be served by a language construct?

FWIW, I could see a minor hypothetical value in a proper try-finally, because that at least avoids some additional branching gymnastics in producers. But ultimately that would just move these branching gymnastics into consumers, so philosophically is not a good fit for Wasm. Which is why we originally rejected including it as a primitive.

tlively commented 4 years ago

My understanding is that unwind is explicitly meant to be a forwards-compatibility feature with documentation of how its semantics are expected to change in the future. This is certainly a break from WebAssembly's traditional stance on backwards-compatibility, so if the CG is uncomfortable adding such forward-compatibility features, we should drop unwind from the proposal and emit catch_all in the meantime. However, that would be a shame because it would require an ABI break between single-phase and two-phase exception handling in C++.

RossTate commented 4 years ago

unwind blocks are run whenever the stack is unwound. That can happen for reasons besides exceptions. I believe @aheejin intends catch_all blocks to be specifically for exceptions. At present, though, the only way to cause unwinding is exceptions, and hence we cannot see the difference. (Sorry for the short post; short on time.)

phoe commented 4 years ago

At present, though, the only way to cause unwinding is exceptions

This will cause an issue with the way Common Lisp works. It has operators for non-local jumps that can explicitly unwind the stack without throwing an exception. (I can explain more and provide code examples if it's required, but I'll abstain for the time being to avoid verbosity.)

While it is theoretically possible to implement most of go, return-from, and throw via some means of inspecting the stack (e.g. dynamic/fluid variables) throwing exceptions, such an implementation would be inefficient and, IMO, a hack.

rossberg commented 4 years ago

@RossTate, what other causes do you have in mind? How many non-local continuation barrels do you want to make primitive in Wasm? I'd argue that we're doing it wrong if we need more than the current two.

RossTate commented 4 years ago

@phoe Thanks for bringing that up. See #124 for ideas on how to make that possible. (Also, thanks for your great write-up on Common Lisp!)

@rossberg One example is two-phase exception handling. In the first phase, you search for a target and then you unwind the stack. That means that, by the time you unwind the stack, there is no exception anymore. In that unwinding phase, you just want to run unwind clauses. (Also, you seem to be aware of some catch_all+rethrow construct in the effect-handler literature. Would you mind pointing me to it? I couldn't find one myself.)

aheejin commented 4 years ago

@phoe Thank you for the detailed writeup. I'm sorry but I don't have any Lisp background, so my understanding of it can be very limited and I might ask some stupid questions.

First of all, I don't think I understood the gist of your writeup very well. Are you suggesting the proposal should support Lisp EH and describing ways that wasm EH can do that? If so, that... can be something good to support, but this is really what you are suggesting, I think it can be discussed as a separate issue, because I don't think we have concretely thought about Lisp as a potential target so far for this proposal, and it is a lot bigger issue than the try-unwind problem we were discussing here.

Also I really didn't understand much about your 4 bullet points in your writeup; I was having hard time understanding what problem you are trying to solve in the first place. (I was wondering, like, is that how wasm EH should support Lisp? Or is this an answer to my question?..)

What I asked in this question was, because @rossberg said

But I have no idea how this can fly in many engines, since they often assume that they can push something onto the stack > within the current function -- e.g., inner try handlers, temporary spills, function arguments, etc. That won't be possible in that scenario, so stack fragmentation seems inevitable. It's not enough to just prevent control flow. Nor is that an option, I think (you certainly cannot prevent some random callee from throwing).

I was wondering if Lisp support both two-phase unwinding and unwind-protect, how it deals with what @rossberg mentioned: "inner try handlers, temporary spills, function arguments, etc". Can you help on this point?

phoe commented 4 years ago

I'm sorry but I don't have any Lisp background, so my understanding of it can be very limited and I might ask some stupid questions.

No problem, I am trying to support this.

Are you suggesting the proposal should support Lisp EH and describing ways that wasm EH can do that?

If by EH you mean "exception handling", then the Lisp EH doesn't exist as a language primitive the way it exists e.g. in Java or C++ or Python. This is the point that is very often considered unusual and/or misunderstood - the CL condition system is, most of the time, written in Lisp itself, and is based on control flow primitives (unwind-protect and the aforementioned operators that perform unwinding).

The most important implication that I'm trying to convey is: if we can support just tagbody, block, catch, and unwind-protect, then we can build a whole CL condition system (and therefore all error-handling facilities) on top of them.

Because of this fact, the second part of the question:

I was wondering if Lisp support both two-phase unwinding and unwind-protect, how it deals with what @rossberg mentioned: "inner try handlers, temporary spills, function arguments, etc". Can you help on this point?

becomes moot. This is because the error-handling facility itself can be (and most of the time, is) written in Common Lisp itself, depending only on control flow primitives. This way, it does not need to be specially handled by WASM itself.

As for the first part, I think that two-phase unwinding is required for two reasons:

return-from/go should be able to signal errors if they are trying to return from outside the dynamic scope of their matching block/return-from instead of unwinding all the way through,
catch/throw must be able to inspect the stack to find the proper tag to return to at runtime.

(This depends on whether I have understood the first post in #124 correctly).

Also I really didn't understand much about your 4 bullet points in your writeup;

These four bullets is a copied part of the Lisp standard that describes how non-local jumps are performed to conform with the language semantics; sorry if it introduced confusion.

Are you suggesting the proposal should support Lisp EH and describing ways that wasm EH can do that?

I'm suggesting that the proposal should support the ways in which Lisp performs its control flow; Lisp EH will happen as a natural implication of that.

I think it can be discussed as a separate issue,

OK - I apologize. I can move it to a separate issue and move my writeup there if that's the best way forwards.

aheejin commented 4 years ago

@phoe Thank you for your reply.

The most important implication that I'm trying to convey is: if we can support just tagbody, block, catch, and unwind-protect, then we can build a whole CL condition system (and therefore all error-handling facilities) on top of them.

I'm suggesting that the proposal should support the ways in which Lisp performs its control flow; Lisp EH will happen as a natural implication of that.

While Lisp support can be one thing we can discuss, I have to be cautious that, changing the structure of the current EH proposal drastically using Lisp-like primitives can be hard. This proposal is already at phase 2 in the standard process, and while the phase 2 only requires the English spec text, this proposal has progressed more than that so far; our C++ toolchain had a working implementation, and one VM (V8) had a fully working implementation and another (Firefox) AFAIK a near-working implementation. We had to make some change to the spec recently, but it was not as big change as redesigning the whole spec.

Because of this fact, the second part of the question:

I was wondering if Lisp support both two-phase unwinding and unwind-protect, how it deals with what @rossberg mentioned: "inner try handlers, temporary spills, function arguments, etc". Can you help on this point?

becomes moot. This is because the error-handling facility itself can be (and most of the time, is) written in Common Lisp itself, depending only on control flow primitives. This way, it does not need to be specially handled by WASM itself.

I'm still having a hard time understanding this. Does Lisp have two-phase in the first place? If so, does this mean Lisp doesn't need to worry about stack modification happening in the second phase, because of some of Lisp's characteristics? Then this does not solve our non-Lisp problem anyway, right?

(This depends on whether I have understood the first post in #124 correctly).

FYI, #124 is not a spec; it is a suggestion from @RossTate, and I don't think we are discussing something specific to his suggestion in this post.

OK - I apologize. I can move it to a separate issue and move my writeup there if that's the best way forwards.

I think this issue has already contains too many things, so I think having one issue post for each issue is better. But still, as I said, it may not be easy at this stage to accommodate requests for fundamental changes.

(@aheejin I think the above comment needs an edit/delete; it is a big quote of my original post but contains no text of your own.)

Yes I clicked the comment button accidentally and deleted the comment shortly after.

aheejin commented 4 years ago

@phoe

After reading your comments again, I think some of my comments possibly have caused some misunderstanding. When I asked what Lisp does in this case, I was not asking how we can add support for Lisp in this proposal; I was asking, if try-unwind has a problem as @rossberg suggested, how other languages that have both two-phase and a finally-like feature are solving this problem. This doesn't contain any opinions on whether we should support Lisp itself or not, but I was not asking about Lisp support itself. But I think my question might have possibly looked like I was asking how we can support Lisp.

aheejin commented 4 years ago

@rossberg

So, what I tried to explain is, catch_all and unwind are the same in the current single-phase proposal, but will be different from the future two-phase proposal. I'm not sure what your suggestion for handling cleanup code is. I don't think unwind vs. finally is an issue here; finally will have the same problem you mentioned, if we do various additional things from finally, right? (Also C++ toolchain can't use finally for cleanup code, because LLVM compiles the normal and the exception path separately)

In my understanding, in the future two-phase proposal,

The first phase should search up the stack to find a catching catch, by matching the tag and running a filter function attached to each catch (It can be a separate catch_when as you said, but it's not the point here). In this phase, if you encounter a cleanup code block, you should skip it. (This is unwind now.)

2-a. If the first phase didn't find a matching catch, the program crashes without unwinding the stack. Done.

2-b. If the first phase found a matching catch, we should arrive there while unwinding the stack. But in the way to there, we should run any intervening cleanup code in between.

So this is the reason why I think we need a way to distinguish user catch handler and cleanup code when two-phase arrives. I thought about two ways to do it:

Using only catch/catch_all (or other catch variant you suggest, doesn't matter here), and making a filter function return one of three values: a. I am a catch handler and this exception is mine, b. I am a catch handler and this exception isn't mine, c. I am a cleanup code. In this case we need another instruction, such as resume, to resume the second phase after running cleanup code. (rethrow is different; it initiates a full two-phase search. In the current single-phase proposal it doesn't matter though.)
Adding try-unwind and making it take care of cleanup code. In this case we don't need resume; end at the end of unwind does it. And catch's filter functions now only need to return a boolean.

I didn't have specific preferences on 1 or 2, and @RossTate preferred 2, so we chose it.

Do you think 1 is better? It'd be better to understand what your suggestion for handling cleanup code, whether it is one of these or not. Your previous answer didn't contain it, and knowing it will help to close the gap in our understanding.

aheejin commented 4 years ago

@ioannad

But as you said, you did not intend try-unwind as a primitive for finally, although I think @RossTate probably did.

Yes it looks like it, but that assumes a new instruction he proposed, something called unwinding branch. I was arguing, if he wants finally, adding finally directly would be more intuitive than adding his "unwind branch" that changes the semantics of unwind. (I'm not arguing we should add finally; I was arguing if someone really needs finally functionality, adding it directly might be better. But this is not the point of this comment anyway)

There is another possible misunderstanding I see in the comments though. The "cleanup forms" as I understood them (and as they are meant in common-lisp for example), are standard language instructions, not destructors in the implementation language. This also confused me when I asked the destructors question on your PR #137. I find it surprising to describe a language using constructs of another language. Has this been done in WebAssembly before?

In format spec no. In the explainer, I was trying to give an example of cleanup code. But if people think it is confusing, I can remove that word.

Given that unwind-protect is the same as finally, I think the difference between that and our unwind is, finally(or unwind-protect) runs regardless of whether we exit by an exception or a normal control flow (such as branch or return), where unwind runs only for the exceptional exit path. Also, I'm not very sure what the usual interaction between two-phase unwinding and finally(or unwind-protect). My assumption was finally like structures not visited in the first phase, and they are only visited in the second phase.

Do you think this has a problem? Different people here are all talking about different points, so it's been confusing and it was hard to understand what exactly the problem related to the current spec is, if any. Also it's possible that I might be lacking some understanding of everything you mentioned because I'm not very familiar with the formal notations.

ioannad commented 4 years ago

But as you said, you did not intend try-unwind as a primitive for finally, although I think @RossTate probably did.

Yes it looks like it, but that assumes a new instruction he proposed, something called unwinding branch. I was arguing, if he wants finally, adding finally directly would be more intuitive than adding his "unwind branch" that changes the semantics of unwind. (I'm not arguing we should add finally; I was arguing if someone really needs finally functionality, adding it directly might be better. But this is not the point of this comment anyway)

Thank you for the clarification, @aheejin, I am having a hard time understanding the discussion in #124. I will direct any further comments on finally/unwind in that issue instead.

Different people here are all talking about different points, so it's been confusing and it was hard to understand what exactly the problem related to the current spec is, if any.

I agree that mixing up the issues is very confusing. I'll post any further questions in their respective issues, and leave this issue for clarifications on the formal notation.