An alternate model for exceptions

WebAssembly / exception-handling

Proposal to add exception handling to WebAssembly

https://webassembly.github.io/exception-handling/

An alternate model for exceptions #82

Open fgmccabe opened 5 years ago

fgmccabe commented 5 years ago

This exceptions proposal seems quite complex and overly oriented to the needs of one situation. This is a proposal for a (hopefully) simpler exception handling framework.

I propose partitioning the exception handling into two separable pieces: the modeling of exceptions and the modeling of the control flow.

1. Exceptions should not be special. I propose that exception values simply be any value that can be passed to a function. I.e., there would be no special 'marking' of certain values to be exception values.
2. Control flow. There are three 'interesting events' in the life of an exception: when the exception is first thrown, when it is caught, and when it propagates out of a function.

The most interesting case is when an exception propagates out. In this proposal, instead of having a global unwind mechanism that can unwind an arbitrary number of stack frames, I suggest having a special 'invoke' instruction that combines a normal function call with the possibility of throwing:

invoke

together with a return_throw instruction.

When an invoked function returns normally, it is as though nothing abnormal happens. But, if a return_throw instruction is invoked, then the corresponding invoke instruction also fails.

I.e., an invoke instruction behaves as though it were one of two instructions, function_call or throw, depending on whether the called function exited with a return or a return_throw instruction.
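
One possible reading of the invoke / return_throw pairing is that every call yields a result carrying a "normal or abnormal" flag, which the call site inspects. The following Python sketch models only that reading. The names `Outcome`, `ret`, `return_throw`, `invoke`, `on_throw`, and `safe_div` are illustrative assumptions and are not part of any WebAssembly proposal; `on_throw` stands in for the catch block that statically encloses the invoke.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Outcome:
    """Hypothetical call result: a value plus a flag saying how the callee exited."""
    threw: bool
    value: Any


def ret(value):
    # Models a normal `return`.
    return Outcome(False, value)


def return_throw(value):
    # Models the proposed `return_throw`: leave the function abnormally.
    return Outcome(True, value)


def invoke(func, *args, on_throw):
    # Models `invoke`: call the function, then act either like a plain call
    # or like a throw into the enclosing catch block (`on_throw`).
    outcome = func(*args)
    if outcome.threw:
        return on_throw(outcome.value)
    return outcome.value


def safe_div(a, b):
    if b == 0:
        return return_throw("divide by zero")
    return ret(a // b)


print(invoke(safe_div, 6, 3, on_throw=lambda e: -1))  # 2
print(invoke(safe_div, 6, 0, on_throw=lambda e: -1))  # -1
```

Note that the exception value here is an ordinary string, matching the idea that any value that can be passed to a function can serve as an exception value.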

Other than that, some of the existing proposal would stay the same. In particular, the basic control flow form:

try resulttype instruction catch instruction end

would be essentially the same; although, IMO, the type of the thrown exception should also be included:

try resulttype exceptiontype instruction catch instruction end

The instructions in the so-called catch block would be responsible for decoding the exception value; which is one reason for including the type of that value in the try-block itself.

Similarly, functions that can throw should also have that reflected in their signature.

I am aware that so-called checked exceptions are a controversial topic. This proposal is oriented towards checked exceptions but some small adjustments would allow for unchecked exceptions too.

There would be no intrinsic support for distinguishing between exceptions thrown in one language and caught in another. This is deliberate. Such interlanguage issues can be addressed using the forthcoming proposal for xxx-IDL bindings.

There would be no intrinsic support for features such as stacktrace. This is deliberate.

As far as I am aware, this proposal also represents a 'zero cost' exception handling proposal.

One architectural difference is that multi-frame unwinding of exceptions is represented explicitly in the code rather than being implicit.

rossberg commented 5 years ago

I'm confused. Can you elaborate how the control flow of invoke is different from call and how return_throw differs from the throw in the proposal? The way you describe it, it sounds exactly the same.

There is a reason why we introduced nominal exception constructors: they are the only way to guard exceptions against accidental misinterpretation when multiple independent (and mutually unaware) languages or runtimes get mixed on a call chain.

Edit: Also, can you elaborate on what benefit you see for checked exceptions on this level? Especially considering that we won't be able to enforce them across languages (e.g. JS).

fgmccabe commented 5 years ago

return/return_throw is inspired by the way that Haskell functions return (by not actually constructing the value and instead signaling which constructor would have been used for the subsequent case analysis). The idea is that you either return normally or you return abnormally. The invoke instruction acts as an implicit rethrow if the called function returned abnormally. A key difference between this and the existing proposal is that there is no multi-frame unwind without wasm instructions intervening. I.e., if you do end up with a multi-frame unwind, at each level you will have had to 'rethrow'.

The interlanguage issue is the one that gave me the most pain. In fact, if you take the one-frame-at-a-time approach, then interlanguage frames will never happen. The web/xx-IDL binding story helps here, if you include exception handling as part of the bindings of a function.

fgmccabe commented 5 years ago

Small follow-on. I called out return_throw as a separate instruction from throw because throw requires a catch block to be statically determinable and return_throw leaves the function.

rossberg commented 5 years ago

> The idea is that you either return normally or you return abnormally.

Yes, but that's also the case for call under the current proposal (or with evaluation in general, for that matter).

> The invoke instruction acts as an implicit rethrow if the called function returned abnormally.

Sorry, I still don't follow. How is that observably different from what call does? Whether you model it as a rethrow or not, the result seems to be the same: you continue to unwind the stack. Can you give a piece of code that shows the behavioural difference?

> A key difference between this and the existing proposal is that there is no multi-frame unwind without wasm instructions intervening. I.e., if you do end up with a multi-frame unwind, at each level you will have had to 'rethrow'
>
> The interlanguage issue is the one that gave me the most pain. In fact, if you take the one-frame-at-a-time approach, then interlanguage frames will never happen.

How so? What happens if one Wasm function f from language A calls into JS and JS calls into another Wasm function g from language B, and g throws?

> The web/xx-IDL binding story helps here: if you include exception handling as part of the bindings of a function.
>
> I called out return_throw as a separate instruction from throw because throw requires a catch block to be statically determinable and return_throw leaves the function.

I don't understand what you mean by that. As with all exception mechanisms, try-catch handlers are dynamically scoped and generally cannot be determined statically.

fgmccabe commented 5 years ago

The interlanguage case is dealt with by including error handling in the binding layer. That in turn implies that any exceptions that arise within an imported function (at least one with bindings support) are caught first by the binding layer itself. As part of that, exception values have to be coerced from IDL-land into WASM-land, just like a normal value returned from a function.

fgmccabe commented 5 years ago

Follow on note: with this proposal, the determination of the 'catcher' for any exception is statically determined. Not sure how to make that clearer.

aheejin commented 5 years ago

1. I think multi-level unwinding support is good and well suited for the languages we currently try to support. It may not be very well suited for Haskell, but in case we decide to support it in the future, it does not have to use the current EH proposal. I think generating code to rethrow again at every level means more code size and unnecessary complication.
2. I am really inclined not to have exception signatures (or checked exceptions) in function signatures and try signatures. Transitively scanning all callees to find out all possible exception signatures is just not feasible. That applies to bindings too. But we need to talk more about the bindings story in the future, since much of the binding spec is still up in the air. But I don't think the MVP EH spec should include interactions with bindings.
3. Without except_ref and tags, we can't carry info like where this exception originated or other possibly helpful info like backtraces. You said offline it is not necessary if we disallow multi-level unwind, but I don't get it. How are you going to transfer that information from the current function to a caller? (Never mind bindings here; in a normal wasm-only call stack.)

fgmccabe commented 5 years ago

Thank you for your response. Specific points:

1. The 'static' proposal does not prevent multi-level unwinding. It simply requires code at each level. (Haskell's approach to exceptions is radically different to either the current EH proposal or my amendment of it). It seems to me to be a matter of balance and opinion as to whether the extra code is burdensome or not.
2. Why do we require type signatures for function arguments and return types but not for any exceptions that they may throw? This is not a logical position IMO.
3. Support for backtraces is expensive and problematic for security. Furthermore, we should avoid baking features whose purpose is to support debugging into the overall design. For example, why not have line number information in executable code? (Answer, we don't; but we support source code maps for debugging. A similar approach could be taken for backtraces in exceptions.)
4. Multi-language support. There is a difference between allowing arbitrary languages to compile to wasm, and supporting multi-language applications. The latter is very difficult to do in general (for large and for small reasons). I do not recall seeing multi-language applications as being a design focus for wasm.

aheejin commented 5 years ago

> Thank you for your response. Specific points:
>
> The 'static' proposal does not prevent multi-level unwinding. It simply requires code at each level. (Haskell's approach to exceptions is radically different to either the current EH proposal or my amendment of it). It seems to me to be a matter of balance and opinion as to whether the extra code is burdensome or not.

Yes, but I don't see reasons why we should change the proposal from the ground up. The current model is suitable for the languages we are currently trying to support, incurs less code size overhead, and v8 and the toolchain have already implemented most of it.

> Why do we require type signatures for function arguments and return types but not for any exceptions that they may throw? This is not a logical position IMO.

As I said, it's not feasible, or not even possible. How are you going to gather the list of all exceptions that can possibly be thrown from a specific function across indirectly called functions? And even setting indirect calls aside, if a function in module A starts to throw a new kind of exception, then every function in module B that transitively calls the changed function in module A needs to be recompiled. Why should we do this?
And regardless of feasibility, I don't think it is necessary or useful in the first place, for the same reason I think checked exceptions are not really necessary in Java. (I don't mean to start a language practice war though.) Most other languages supporting exceptions don't have checked exceptions and they are just fine.

> Support for backtraces is expensive and problematic for security. Furthermore, we should avoid baking features whose purpose is to support debugging into the overall design. For example, why not have line number information in executable code? (Answer, we don't; but we support source code maps for debugging. A similar approach could be taken for backtraces in exceptions)

Backtrace support is not included in the spec; we just assume except_ref may contain more helpful information. And AFAIK debugging support is currently also handled by throwing an Error object from the embedder that has backtrace info.

> Multi-language support. There is a difference between allowing arbitrary languages to compile to wasm, and supporting multi-language applications. The latter is very difficult to do in general (for large and for small reasons). I do not recall seeing multi-language applications as being a design focus for wasm.

Tags are useful for the former too. If C++ and Rust are both compiled to wasm modules and they interact with each other, user code would like to tell whether the current exception originated from C++ or not. And I don't see why we should remove functionality we already have.

KronicDeth commented 5 years ago

Although it's not that important to the conversation, people have brought up Haskell exceptions as being different multiple times. All I know about Haskell exceptions is that, unlike most impure code, which needs to be in the IO monad, exceptions can occur anywhere without showing up in the type signature.

What specifically is it about Haskell exceptions that is different for the purpose of this proposal? Links to explainers, blogs, or papers showing why they are different would be useful.

tlively commented 5 years ago

I had previously assumed that Haskell exceptions were similar to Rust's Result type, but after reading a bit it seems that Haskell exceptions also use stack unwinding and would therefore be well served by the current exception proposal. https://gitlab.haskell.org/ghc/ghc/wikis/exceptions/stack-traces#producing-a-stack-trace.

rossberg commented 5 years ago

@fgmccabe:

> The 'static' proposal does not prevent multi-level unwinding. It simply requires code at each level.

You still haven't responded to my earlier questions and explained what this code is, nor what happens if that code is not there. How would invoke differ from call? I'm afraid some of us still don't understand what you are actually proposing.

> Why do we require type signatures for function arguments and return types but not for any exceptions that they may throw?

Because we need the types for efficient jit compilation and we can enforce them locally. Neither seems true for exception annotations. Technically, they would turn a type system into a type-and-effect system, which is adding a new dimension.

> Support for backtraces is expensive and problematic for security.

As @aheejin said, the design is not baking in this feature. Backtraces are not part of, or required by, the proposal. It merely enables them on platforms that care, and many do.

> Multi-language support. There is a difference between allowing arbitrary languages to compile to wasm, and supporting multi-language applications.

As mentioned before, exception constructors weren't introduced to support intentional interoperation between multiple mutually aware languages (they don't per se); they prevent unintended mis-operation between multiple mutually unaware languages. Multiple languages getting mixed is not something we can prevent, so we must assume it will happen, knowingly or unknowingly. Still, each language should be able to behave well, i.e., rely on its own abstractions.
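
The point about nominal exception constructors can be illustrated with a small Python sketch. It assumes two mutually unaware "languages" A and B that both throw an exception named "error" with an integer payload; a tag is modelled as an object compared by identity, so that structurally identical exceptions from different sources are never confused. The classes `Tag` and `WasmException` and all function names are hypothetical and only mimic the idea.

```python
class Tag:
    """Hypothetical nominal tag: identity, not name or payload shape, decides a match."""

    def __init__(self, name):
        self.name = name


class WasmException(Exception):
    def __init__(self, tag, payload):
        super().__init__(tag.name)
        self.tag = tag
        self.payload = payload


# Two independent runtimes each define an "error" exception with an int payload.
LANG_A_ERROR = Tag("error")
LANG_B_ERROR = Tag("error")


def lang_a_function():
    raise WasmException(LANG_A_ERROR, 7)


def lang_b_function():
    raise WasmException(LANG_B_ERROR, 42)


def lang_a_caller(callee):
    try:
        callee()
    except WasmException as exc:
        if exc.tag is LANG_A_ERROR:
            # Safe: the payload really has language A's meaning.
            return f"A handled its own error code {exc.payload}"
        raise  # Not ours: let it propagate untouched.


print(lang_a_caller(lang_a_function))  # A handled its own error code 7
```

Calling `lang_a_caller(lang_b_function)` lets B's exception pass through A's frame without A misreading the payload 42 as one of its own error codes.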

rossberg commented 5 years ago

@KronicDeth, the main difference of Haskell exceptions is due to laziness. Conceptually, it's not that large: (1) they are imprecise, i.e., it is not defined which exception you get when an expression could throw at multiple points (that's because evaluation is pure and non-strict and has no evaluation order); (2) catching an exception is only possible in the IO monad (because exceptions are an effect, and catch makes them observable); (3) as a consequence of laziness, exceptions are deferred but "cached" in a value, so that the same exception may be observed multiple times when that value is forced.

Consider:

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception

x :: Integer
x = 1 + throw Overflow + throw Underflow

main = do
  print (x * 2) `catch` (\(e :: SomeException) -> print e)  -- may print either overflow or underflow
  print (x * 3) `catch` (\(e :: SomeException) -> print e)  -- will print the same again

Laziness generally is implemented in a manner completely different from ordinary eager evaluation (search for Spineless Tagless G-Machine), and that just carries over to exceptions. In particular, as the example shows, you cannot just unwind to the catch, you have to update values on the way -- despite the exception, x is only evaluated once.
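
As a rough analogy in an eager language, the "deferred but cached" behaviour can be imitated with a memoizing thunk that stores either a value or the exception raised while computing it. This Python sketch is only an analogy under that assumption; the `Thunk` class is hypothetical and says nothing about how GHC or the Spineless Tagless G-Machine actually implement updates.

```python
class Thunk:
    """A lazily evaluated value that remembers its outcome, including failure."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None
        self._error = None
        self.evaluations = 0  # how many times the computation actually ran

    def force(self):
        if not self._done:
            self.evaluations += 1
            try:
                self._value = self._compute()
            except Exception as exc:
                self._error = exc      # cache the exception instead of a value
            self._done = True
            self._compute = None       # the computation is never run again
        if self._error is not None:
            raise self._error
        return self._value


def compute_x():
    raise OverflowError("overflow")


x = Thunk(compute_x)
for factor in (2, 3):
    try:
        print(x.force() * factor)
    except OverflowError as exc:
        print("caught:", exc)          # prints "caught: overflow" both times

print(x.evaluations)                   # 1
```

Both uses observe the same exception, yet the computation runs only once, which is the property that a plain unwind-to-the-handler model does not give you for free.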

titzer commented 5 years ago

I don't think checked exceptions work for languages without checked exceptions. E.g., for a language without checked exceptions but with different exception types, compiling to wasm where all thrown exceptions must be declared would necessarily imply a whole-program analysis to determine thrown exceptions, based on a global call graph.

To see why, consider a function f that does an indirect call (either via a table or via a function reference). Then the exception signature of f is necessarily the union of the possible call targets from the indirect call in f. Since f itself can be called indirectly, you have a constraint system that must be solved in order to generate wasm code.

The solution might be huge, too. Up to O(n^2) in the size of the original program.
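
The constraint system described above can be sketched as a least-fixpoint computation over the call graph. The Python below is a minimal illustration under assumed inputs: `direct_throws` maps each function to the exceptions it throws itself, and `calls` maps each function to every possible call target (for an indirect call, all candidate targets). The function and exception names are made up.

```python
def exception_signatures(direct_throws, calls):
    """Least fixpoint: a function may throw what it throws directly plus
    everything that any of its possible callees may throw."""
    sigs = {f: set(t) for f, t in direct_throws.items()}
    changed = True
    while changed:                      # iterate until nothing grows any more
        changed = False
        for f, callees in calls.items():
            for g in callees:
                new = sigs[g] - sigs[f]
                if new:
                    sigs[f] |= new
                    changed = True
    return sigs


# f performs an indirect call that may reach g or h; h calls back into f.
direct_throws = {"f": set(), "g": {"E1"}, "h": {"E2"}, "k": set()}
calls = {"f": {"g", "h"}, "g": {"k"}, "h": {"f"}, "k": set()}

sigs = exception_signatures(direct_throws, calls)
print(sorted(sigs["f"]))  # ['E1', 'E2']
print(sorted(sigs["h"]))  # ['E1', 'E2']
print(sorted(sigs["g"]))  # ['E1']
```

Even in this tiny example, the signature of `h` picks up `E1` only through the cycle via `f`, which is why the result cannot be computed one function at a time.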

fgmccabe commented 5 years ago

This is not quite accurate. Focusing on Java, all exceptions are subclasses of Throwable. If you want to model unchecked exceptions in a checked world, you report Throwable as your exception type. Then, if you actually do want to catch a particular exception, you have to catch them all and do an instanceof test for the actual exception class. But this code is in the handler, and that is 'allowed' to be slow.
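
A sketch of that catch-all-then-test pattern, written in Python rather than Java: the class `Throwable` here is a stand-in defined for the example (it is not `java.lang.Throwable`), and `FileMissing`, `Timeout`, `risky`, and `caller` are likewise hypothetical names.

```python
class Throwable(Exception):
    """Stand-in for the single exception type that every function declares."""


class FileMissing(Throwable):
    pass


class Timeout(Throwable):
    pass


def risky(kind):
    # Declared (conceptually) as "throws Throwable", nothing more specific.
    if kind == "missing":
        raise FileMissing("no such file")
    if kind == "timeout":
        raise Timeout("took too long")
    return "done"


def caller(kind):
    try:
        return risky(kind)
    except Throwable as exc:                # catch them all ...
        if isinstance(exc, FileMissing):    # ... then test for the class we want
            return "recovered from missing file"
        raise                               # anything else is rethrown


print(caller("ok"))        # done
print(caller("missing"))   # recovered from missing file
```

The type test runs only on the exceptional path, so the normal path pays nothing for it; `caller("timeout")` rethrows the `Timeout` to its own caller.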

mstarzinger commented 5 years ago

@fgmccabe:

> Similarly, functions that can throw should also have that reflected in their signature.
>
> I am aware that so-called checked exceptions are a controversial topic. This proposal is oriented towards checked exceptions but some small adjustments would allow for unchecked exceptions too.

Could you clarify whether the signature would just need to declare that it throws any exception, or whether it would need to declare/enumerate the concrete exceptions (as with checked exceptions)? I am assuming the latter, but I wanted to double-check.

IIUC, this would require exception tags to be available during type checking. Without additional constraints, imported exceptions (just like imported types) need to be available to declare any signature of a function that can potentially throw. I don't know all the details, but the current GC proposal mentions similar issues: https://github.com/WebAssembly/gc/blob/master/proposals/gc/Overview.md#import-and-export.

Also, could you elaborate on what the adjustments for unchecked exceptions would look like? Wouldn't such an adjustment introduce a certain set of exceptions for which local (per call-site) handling cannot be enforced anymore?

Horcrux7 commented 5 years ago

I do not understand the point of checked exceptions in an assembly output format. The source language can use checked exceptions for its compile-time checking, but that is of no concern to the runtime.

Also, if I look at a language like Java with checked exceptions, they are only a compile-time construct. There is no runtime validation.

titzer commented 5 years ago

Languages other than Java may not have an available supertype of all exceptions like java.lang.Throwable, so they would have to introduce one (though to be fair, in the current proposal the exception reference type introduces one).

I still think the need to do any whole program/whole module analysis in order to compute exception signatures is prohibitive in terms of producer complexity.

In terms of code size, I think requiring handling code in all functions that could have an exception thrown through them (even if it is just a rethrow) is also prohibitive.

As for the instanceof testing, I think the current proposal with first-class exceptions is better, since the exception can escape as a value and the dispatching logic can be factored out to common routines or blocks. Were you suggesting that every catch around invokes handle every possible exception?

fgmccabe commented 5 years ago

(I am writing another message that rephrases the proposal. This is simply in response to specific questions raised)

As far as I am aware, my proposal does not require whole-module analysis. I would be interested to know why it might. For most wasm functions today, the type signature represents a very 'reduced' version of the 'actual' or 'natural' signature of the function (e.g., all pointer arguments have to be mapped to i32). The same reasoning would apply to exceptions.
The current proposal requires an internal loop that walks the stack and decides 'where to land' the exception. This loop may be central, but it's pretty onerous and likely to be fragile in terms of future-proofing (because it's hidden from the code).
'Handling' an exception in my proposal means that the handler decides what to do with the exception. Eliminating the possibility of 'foreign' exceptions is a big plus IMO. So, yes, a catch around an invoke effectively has to intercept every exception and decide whether to handle or rethrow. But I believe that the total number of dynamically executed host instructions will be very similar in my proposal vs the current proposal. (Hard to be precise; there is no testing for exception ABI type in my version.)

fgmccabe commented 5 years ago

This counts as a restatement of my proposal; hopefully clearer than before.

Exception values are just any wasm value. (Not anyref, nor any variant thereof; although in practice they nearly always will be; the specific form of an exception will be up to the language implementer.)
All code locations that can raise an exception are discoverable by inspecting the code. Furthermore, all such locations must be in scope of a try/catch block.
If a function call might throw an exception, then it 'counts as' a potential throw in the code for that function. That implies that there must be a catch block in scope for that function call. (This was my reason for proposing an invoke instruction)
Calls to imported functions, if they can throw, must be similarly in scope of a catch block.
A function signature that can throw has an additional element to it: (ta*)=>tr throws tt where ta, tr and tt are just wasm types.
The specific instructions proposed include the try-catch form, the throw instruction, the return_throw instruction, and invoke (plus variants for returncall, indirect). (In order to avoid proliferation of instructions, a throwable prefix would count as a friendly amendment to this proposal.)

FAQ:

a. In order to emulate long distance unwinding, you put a try-catch block just inside each function body. That catch block intercepts any exceptions arising within the function body that have not been caught, and rethrows them (this was one reason I proposed the return_throw instruction as distinct from a throw instruction).

b. In order to emulate a Java style model where you have a mix of checked and unchecked exceptions, the handler has to inspect the thrown value. Java would presumably (as it does today in the JVM) require that exceptions are a subclass of Throwable. The Java specific handler then does an analysis of whether the local handler should handle the exception and behaves appropriately.

Note that this is probably not completely optimal for Java; because the JVM does have long distance unwinding. However, the cost of that is high (a non-constant instruction + a lot of internal machinery).

c. Emulating a world where there are no checked exceptions is similar to (b).

d. In order to deal with issues of so-called inadvertent language interoperability, where an exception may be propagated across a language boundary, such boundaries would also have a try-catch block where exceptions thrown by language "b" are coerced into language "a" exceptions before being rethrown.

Of course, that coercion layer would have to know how exceptions are represented in the callee language; but that is no different to knowing how values are represented in that language.

e. C++ might (I am a little out of my comfort zone in talking about C++ implementation) use internal heap allocation for exception values and use i32 as the thrown type (i.e., a pointer to linear memory).
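FAQ entries (a) and (d) can be illustrated with a small Python model. This is only a sketch of the semantics under stated assumptions: `Normal`, `Thrown`, `invoke`, and the example functions are hypothetical names, not part of any wasm proposal, and the explicit `isinstance` test merely stands in for whatever mechanism an engine would use to transfer control to the handler in scope.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Normal:
    """Outcome of an ordinary `return`."""
    value: Any


@dataclass
class Thrown:
    """Outcome of `return_throw`; the payload can be any value."""
    value: Any


def invoke(fn: Callable[..., Any], handler: Callable[[Any], Any], *args: Any) -> Any:
    """Model of `invoke`: the handler is fixed at the call site."""
    outcome = fn(*args)
    if isinstance(outcome, Thrown):
        return handler(outcome.value)
    return outcome


def lang_b_divide(a: int, b: int):
    # Hypothetical "language b" throws plain integer error codes.
    if b == 0:
        return Thrown(22)
    return Normal(a // b)


def middle(a: int, b: int):
    # FAQ (a): a function-level handler that only rethrows one frame up.
    return invoke(lang_b_divide, lambda exc: Thrown(exc), a, b)


def boundary(a: int, b: int):
    # FAQ (d): at the language boundary, coerce b's integer code into
    # "language a"'s exception shape (here, a dict) and rethrow it.
    return invoke(
        middle,
        lambda code: Thrown({"kind": "LangBError", "code": code}),
        a,
        b,
    )


def lang_a_main(a: int, b: int) -> str:
    # The real handler in "language a" sees only its own exception shape.
    out = invoke(
        boundary,
        lambda exc: Normal(f"caught {exc['kind']} {exc['code']}"),
        a,
        b,
    )
    return str(out.value)
```

In this model every throw travels exactly one frame at a time, so the catch target of each step is known from the call site alone.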

Benefits:

No special dependency on anyref etc.
Recover the previous property that all instructions have a constant running time; every throw has a statically determined catch target.
Simpler overall design

Demerits:

Some refactoring of existing implementations and tool chains
'Late in the game'
Actual long distance exception handling potentially slightly slower.
Small expansion in number of instructions in functions (~2 as far as I can tell)

rossberg commented 5 years ago

Thanks for the explanation, I believe I understand your proposal better now. It seems like you are essentially describing an encoding of exceptions via a binary disjoint union type (sum type), i.e., every throwing function would return the equivalent of a (second-class) Either (T, Error) value, return is the InLeft constructor and return_throw the InRight constructor, invoke does a case distinction between the two.
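To make the correspondence concrete, here is a minimal Python sketch of the sum-type reading. `InLeft` and `InRight` follow the naming used above; `safe_sqrt`, `sum_roots`, and the `checks` counter are hypothetical and exist only for illustration. The counter records how many case distinctions run, which is the cost the first disadvantage below refers to.

```python
import math
from dataclasses import dataclass
from typing import Any, List, Tuple, Union


@dataclass
class InLeft:
    value: Any  # what `return` constructs


@dataclass
class InRight:
    error: Any  # what `return_throw` constructs


Either = Union[InLeft, InRight]


def safe_sqrt(x: float) -> Either:
    if x < 0:
        return InRight("negative input")
    return InLeft(math.sqrt(x))


def sum_roots(xs: List[float]) -> Tuple[Either, int]:
    """Return the Either result plus the number of case distinctions made."""
    checks = 0
    total = 0.0
    for x in xs:
        result = safe_sqrt(x)            # the `invoke`
        checks += 1                      # case distinction after the call
        if isinstance(result, InRight):
            return result, checks        # propagate, like `return_throw`
        total += result.value
    return InLeft(total), checks
```

Under this reading, `sum_roots([4.0, 9.0])` performs two case distinctions even though nothing is thrown.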

This encoding is well-known in functional language circles, where you usually have disjoint union types but not necessarily exceptions. It also has well-known disadvantages:

It is significantly slower. You have a case distinction after every call that could produce an exception, i.e., most in practice. That is most definitely not "zero cost" ("long distance" throws are the norm in most languages).
It also is more code. In source languages, you typically want to abstract away the extra plumbing via a monad, but there is no obvious abstraction facility like that in Wasm.
It requires "checked exception" types and moreover, a bifurcation of the function type space into throwing and non-throwing functions. That leads to interop issues and/or API duplication for all higher-order functions, i.e., when passing funcrefs as callbacks. Unless you introduce effect polymorphism.
In the absence of a universal and extensible error type you need language-pair-specific conversions between error values, which immediately creates an N^2 interop problem. You'll almost inevitably lose information when converting an exception back and forth between different language domains, potentially breaking code. And you generally need to create costly wrapper functions when passing funcrefs from language A to the outside or another language B; these wrappers even stack up.

Those are the reasons why even languages that already have an efficient implementation of disjoint union types, such that they could readily express your proposed approach, still tend to introduce "real" exceptions as a primitive concept.
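The N^2 point is simple counting, sketched here in Python. The assumption (mine, for illustration) is that each ordered pair of distinct languages needs its own converter, one per direction; `pairwise_converters` and the language list are hypothetical.

```python
from itertools import permutations
from typing import List, Tuple


def pairwise_converters(languages: List[str]) -> List[Tuple[str, str]]:
    """One converter per ordered (thrower, catcher) pair of distinct languages."""
    return list(permutations(languages, 2))


langs = ["C++", "Java", "Star", "Lobster"]
converters = pairwise_converters(langs)
print(len(converters))  # N * (N - 1) = 4 * 3 = 12
```

With N languages the count is N * (N - 1), so ten languages would already need 90 converters under this assumption.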

fgmccabe commented 5 years ago

Thanks for the response; however, I am not sure I follow your reasoning...

I am aware of the functional approach. However, this proposal specifically does not require the kind of analysis you mention after every call. This was the original reason for having a return_throw instruction (you return and then throw). I am not sure why we focus on providing a zero cost way of breaking normal control flow. We do not require if-then-else to be zero cost; and arguably that occurs much more frequently than exceptions.

However, in this case, if a call is not throwing, then as far as I can see, there should be no additional host instructions executed. (There might be more code generated: specifically a host call instruction might be followed by the address of the handler; there are other mechanisms too.)

It does not actually require checked exception types. That would depend on the specific language being implemented, and that is up to the language implementer. It supports checked exceptions but does not require them.
I completely do not follow the N^2 issue. Again, I understand that full inter-language interoperability is not a mandate of wasm. I do not believe it's fully possible without emulation; it's better not to try.

The one place where we must support inter-language interop is when we 'bridge' from one language to another. In that situation, the call/return cycle must coerce values appropriately between the languages. My proposal simply adds exceptions to that list.

Let me add some 'criticisms' at this point of the existing proposal:

Because except_ref is a host-allocated entity, the client language must find a way of managing it. This is actually much more difficult for C/C++ than for a full GC language (I find myself in the distinctly odd situation of defending C++). Difficult, but not impossible of course. However, we are going to have to live with this complexity for decades.
The exception tag management needs to be global - as in all languages need to agree on using different tags. This is marginally acceptable for mainstream languages like C++ & Java but a real and unnecessary burden for small languages (like Star and Lobster). Furthermore, someone will have to establish a registry and maintain it to keep track of which tags are used by which languages.

Note that Java implementers would likely NOT use exception tags to differentiate between Java exceptions. The Java model is far too rich to be accounted for by it (for example, with class loaders, two classes that are compiled from the same source are still distinct if they were loaded by different class loaders. This is a critical aspect of implementing Java.)

The long distance unwinding represents an unbounded computation in the middle of execution: i.e., throw has an unbounded cost. True, it is typically limited by the depth of the stack; but it is entirely possible for that to be in the hundreds of frames. Furthermore, it is going to get very difficult to integrate that with any potential stack switching primitive that might be needed to support coroutining.
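The cost argument can be sketched with a toy stack walk in Python. This is not how any engine implements unwinding; `Frame`, `unwind`, and `make_stack` are hypothetical names, and the model only shows that the work done by one throw grows with the number of frames between the throw and its handler.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Frame:
    name: str
    handles: Set[str]  # tags this frame's catch blocks accept


def unwind(stack: List[Frame], tag: str) -> Tuple[Optional[str], int]:
    """Walk from the innermost frame outward.

    Returns (name of the landing frame or None if uncaught, frames visited).
    """
    visited = 0
    for frame in reversed(stack):
        visited += 1
        if tag in frame.handles:
            return frame.name, visited
    return None, visited


def make_stack(depth: int) -> List[Frame]:
    # Outermost frame handles tag "E"; `depth` inner frames handle nothing.
    return [Frame("main", {"E"})] + [Frame(f"f{i}", set()) for i in range(depth)]
```

Here `unwind(make_stack(300), "E")` visits 301 frames for a single throw, whereas a frame-by-frame `return_throw` does a constant amount of work per instruction, spread over the same number of frames.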

jgravelle-google commented 5 years ago

I am not sure why we focus on providing a zero cost way of breaking normal control flow. We do not require if-then-else to be zero cost; and arguably that occurs much more frequently than exceptions

That is not what "0-cost" means. 0-cost means what you say next: if we don't throw, we don't pay extra. The C++ model of exceptions is that you pay epsilon more to invoke rather than call, but at the cost of 100-1000x overhead in the case where we do throw, because the assumption is that exceptions are, indeed, exceptional.

The exception tag management needs to be global - as in all languages need to agree on using different tags.

No they don't: https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md#event-index-space

The index space is scoped to a module. Meaning moduleA.events[n] != moduleB.events[n], unless there is an event import+export to map the two together. That has always been considered IIRC
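A small Python model of that scoping rule, as a sketch only: `Tag`, `Module`, `define_event`, and `import_event` are hypothetical names and not the proposal's API. The point it illustrates is that a tag's identity is separate from its index, so equal indices in two modules do not denote the same event unless one module imports what the other exports.

```python
from typing import Dict, List, Optional


class Tag:
    """Identity is the object itself, not its index in any module."""

    def __init__(self, name: str) -> None:
        self.name = name


class Module:
    def __init__(self, name: str) -> None:
        self.name = name
        self.events: List[Tag] = []        # module-local index space
        self.exports: Dict[str, Tag] = {}

    def define_event(self, tag_name: str, export_as: Optional[str] = None) -> int:
        tag = Tag(f"{self.name}.{tag_name}")
        self.events.append(tag)
        if export_as is not None:
            self.exports[export_as] = tag
        return len(self.events) - 1

    def import_event(self, other: "Module", export_name: str) -> int:
        # Importing shares the exporter's tag identity under a local index.
        self.events.append(other.exports[export_name])
        return len(self.events) - 1


module_a = Module("A")
a0 = module_a.define_event("cpp_exn", export_as="cpp_exn")

module_b = Module("B")
b0 = module_b.define_event("java_exn")
b1 = module_b.import_event(module_a, "cpp_exn")
```

Both modules use index 0, yet `module_a.events[0]` and `module_b.events[0]` are different tags; only `module_b.events[1]` refers to the same tag as `module_a.events[0]`.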

Note that Java implementers would likely NOT use exception tags to differentiate between Java exceptions

Neither does C++ as far as I'm aware. The tags are used primarily as a unique identifier to distinguish between Java exceptions and C++ exceptions. Last I saw C++ exceptions are handled by a single catch clause, and then switched on in userspace. Though there was some discussion of always using catch-all, and checking whether __cxa_throw had been called. Though I don't know specifics, because I'm far enough from the implementation work that I'm sure the state of the art has moved on.

fgmccabe commented 5 years ago

Thank you for clarifying that index space issue. I had missed it.

However, if the only purpose of the exception index is to distinguish between java exceptions and C++ exceptions (say) then IMO there are better ways of handling that (sic). Specifically, you intercept the place where you go from C++ to Java and put a handler there. (The alternate model where a Java exception skips over any C++ code before relanding in Java code boggles the mind)
