Closed aheejin closed 3 years ago
This seems odd to me. Different languages have different semantics for throwing exceptions in general, including rethrow. It seems pretty unlikely that a single low level mechanism will be cover all the conflicting requirements. A better approach IMO would be to look at how the throw instruction would work in a rethrow situation. TLDR; this feels like the wrong question (should rethrow take an immediate).
@fgmccabe
A better approach IMO would be to look at how the throw instruction would work in a rethrow situation.
I'm not exactly sure what this means. Do you mean, can we use throw
instead for rethrowing, which in turn means "should we even keep rethrow
"?
I think having two kinds of throwing instruction, currently throw
and rethrow
, offers more options to language implementors. Some languages, such as C++, do not need that functionality, but other languages might. And both the current proposal and the newly-decided proposal have it. I didn't mean to re-discuss whether we should keep rethrow
or not here. (If someone wants to discuss it, it is a separate discussion.)
Restoring the immediate means whether we should allow rethrow
to choose an exception to rethrow other than the topmost (or the recent-most) one, and that was the focus of this issue.
It's hard to discuss tradeoffs between design alternatives for a feature without known use cases for the feature. I think what @fgmccabe is noting is that, when you consider the potential use cases, you don't find much consistency. For example, Java, Python, and C# each have different semantics for rethrowing exceptions. But maybe those are not meant to line up with the rethrow
instruction. So having concrete examples of how rethrow
would be used would help facilitate this discussion.
I read the justification for including this that @rossberg posted, and I agree that making nesting of handlers easier/possible is a natural choice from the language semantics point of view.
However I'm a bit concerned about the complication for implementing this, as I think the immediate argument for rethrow
effectively introduces an implicit exception stack that's separate from the normal value stack (the previous design with exnref
was more consistent in this regard, with everything being on the value stack). This second stack may need to be explicitly tracked by an implementation, to ensure outer exception references remain live.
Is it correct that rethrow-with-immediate can always be emulated by adding a branch out to the correct catch
scope and then doing a rethrow-without-immediate? If so, I'm not convinced by the composability argument since that's a fairly simple transformation. I would be convinced if we had a usecase where rethrow-with-immediate would be used often enough that there would be a code size win from not having to do that transformation, but given our lack of rethrow use cases, I don't think that's a compelling hypothetical, either.
@takikawa
However I'm a bit concerned about the complication for implementing this, as I think the immediate argument for
rethrow
effectively introduces an implicit exception stack that's separate from the normal value stack (the previous design withexnref
was more consistent in this regard, with everything being on the value stack). This second stack may need to be explicitly tracked by an implementation, to ensure outer exception references remain live.
Don't you need to keep this exception stack anyway even without the immediate argument? As @tlively said, we can still go from the inner catch scope to outer catch scope (by a branch or fallthrough) and rethrow the outer exception from there.
@tlively
Is it correct that rethrow-with-immediate can always be emulated by adding a branch out to the correct
catch
scope and then doing a rethrow-without-immediate?
I think so.
but given our lack of rethrow use cases, I don't think that's a compelling hypothetical, either.
For C++ we lack use cases, but for languages that use rethrow
functionality like that in https://github.com/WebAssembly/exception-handling/issues/125#issuecomment-693436340, I'm not sure..?
That comment does not refer to any actual language. We currently have zero use cases. Python and C# both modify their exceptions when they rethrow them, and so they would have to implement their rethrow with throw
of a new exception payload. Java could just as easily throw
the rethrown exception. So we have multiple languages that would not use a rethrow
instruction to implement exception rethrowing, and we have another language that could just as well implement exception rethrowing with throw
.
Without knowing what rethrow
is useful for, we cannot evaluate its design alternatives. The clever translation @tlively gave may or may not work depending on what rethrow
is supposed to be doing. Until we have concrete examples of how it's supposed to be used and what purpose it's supposed to fulfill, we can all picture different variations of what rethrow
does and is for and unintentionally speak past each other. We cannot effectively design features without use cases for those features.
Don't you need to keep this exception stack anyway even without the immediate argument?
@aheejin Thanks, I think you're right about this, nevermind my comment about rethrow
then.
@aheejin, I have a really dumb question: how did you implement a correct semantics of finally
or destructors under the previous proposal without using rethrow
? As far as I can tell, rethrow
is the only way to (re)throw exceptions whose constructor you don't know. It seems that if you don't use rethrow then any such feature will misbehave on foreign exceptions, e.g., JS ones.
@rossberg (I wrote and deleted a comment just not and please ignore that)
Sorry, I think I said something confusing. Our C++ implementation currently uses rethrow
for
When an (both C++ and foreign) exception doesn't match the current catch
. The current catch
catches everything. With the new proposal decided, we use catch (C++ tag)
, but that doesn't mean we catch all C++ exceptions (i.e., in some catch
we only catch int
or something), so we have to still use rethrow
if an exception doesn't match. This use case is actually not rethrowing but resuming, and it will be replaced by filter functions/blocks after two-phase unwinding is introduced later. In the current single-phase, rethrowing and resuming aren't different.
After running destructors. This is also a resuming behavior and will be replaced by try
-unwind
-end
. (In the single phase, the exception will be effectively rethrown at the end
, even though there's no explicit rethrow
). C++ doesn't have finally
, but for the languages that have it, clang compiles it away, so it will be handled in a similar way as destructor blocks.
What I tried to say was we don't use rethrow
for C++'s throw;
, which is C++'s rethrowing keyword, but apparently I said things in a very confusing way. The reason is C++'s throw;
apparently doesn't preserve stack traces. Our current C++ implementation prototype's catch (...)
doesn't catch foreign exceptions, mainly for implementation simplicity. We may want to change this later.
@aheejin, that makes sense, thanks for the clarification.
So this seems to imply that we clearly need rethrow
, unless we want to introduce all of two-phase unwinding now.
As far as I can tell, rethrow is the only way to (re)throw exceptions whose constructor you don't know.
The current catch catches everything. With the new proposal decided, we use catch (C++ tag), but that doesn't mean we catch all C++ exceptions (i.e., in some catch we only catch int or something), so we have to still use rethrow if an exception doesn't match.
Notice that this is "rethrowing" an exception with a known tag and available payload. rethrow
is not necessary for this. You can just use throw
, and the two programs would be semantically equivalent.
@RossTate
Notice that this is "rethrowing" an exception with a known tag and available payload.
rethrow
is not necessary for this. You can just usethrow
, and the two programs would be semantically equivalent.
Without rethrow
,
throw;
. C++ throw;
doesn't preserve stack traces but this should. This is effectively uncaught exception and should be skipped by a filter function later when we have two-phase, so in the single phase, in case stack traces are embedded in the exception, this shouldn't be lost.I think we have a very clear use case for rethrow
in the current toolchain for all these reasons. And also, I think we agreed on the proposed spec like yesterday...? Not sure if we want to spend more time to discuss taking something out of that agreed spec only after a day. I opened this issue to discuss whether we need the immediate for rethrow
.
i32
) in a local variable. There are likely other reasons to do this anyways (e.g. to access the payload multiple times).I opened this issue to discuss whether we need the immediate for rethrow.
For me, the answer depends on what rethrow means and how rethrow will be used. That said, it does sound like there's reason to believe that every hypothetical use does not need an immediate, so I'm personally fine with you closing the issue with that answer.
@RossTate
- You put the payload (which I believe for C++ is currently just an
i32
) in a local variable. There are likely other reasons to do this anyways (e.g. to access the payload multiple times).- Barely.
- Stack traces are not part of the semantics of WebAssembly exceptions. They are a debugging tool, and there are other, more robust ways for engines to track stack traces to this end.
Without two-phase unwinding, embedding stack traces or other info in exceptions is a way that we can provide some info. We didn't specify which info we should provide in the spec to maximize each engine's leeway according to their necessity though. We are currently using it, and I think this is sufficient to attest to the "Is there a use case" argument.
For me, the answer depends on what rethrow means and how rethrow will be used. That said, it does sound like there's reason to believe that every hypothetical use does not need an immediate, so I'm personally fine with you closing the issue with that answer.
Not sure exactly what you mean. The objective here is to decide whether we should restore the immediate back or not, and if we decide we should, propose to the CG meeting about the change. Have we made the decision?
@RossTate:
Notice that this is "rethrowing" an exception with a known tag and available payload.
To implement destructors or finally in a heterogeneous language environment you need to be able to rethrow exceptions whose tag you do not know. For example, when C++-Wasm calls back into JS and JS throws.
To implement destructors or finally in a heterogeneous language environment you need to be able to rethrow exceptions whose tag you do not know
@rossberg That has been addressed by try
/unwind
.
Without two-phase unwinding, embedding stack traces or other info in exceptions is a way that we can provide some info.
@aheejin An engine can have a global variable for the stack trace. If a function throws an exception, then the engine can initialize this variable to the empty stack. As it searches for a catch
, it can add frames to this stack. If it finds a catch
, then it can leave the stack untouched until the catching function returns, at which point it clears the variable, but if the function throws again then it continues to build upon the stack. This will work even for languages that modify exceptions as they go up the stack. It also can be used to get stack traces for traps.
We also want to consider that there are uses of the exception handling proposal that do not want you collecting stack traces because it is a waste of time, i.e. they are using it for dynamic control flow and know the thrown exception will be caught.
Applications that do want a stack trace can always explicitly put one into the payload. I prefer this approach because it takes guesswork out of engines (should the engine trace the stack or not? answer depends on the application).
@RossTate
I didn't say rethrow
is the only way that we can provide stack traces. This is a way we are currently using it (with single phase unwinding), and was included in the new spec we agreed on, and we have to change our implementation even further without it, so I'd say we keep it. Of course, it is not specified in the spec what kind of auxiliary info a VM should include, so a VM can choose not to provide any side info, and that's fine.
I answered your "is there a use case?" argument. Given that we implemented it, are using it, and haven't had a major problem with it, I don't think every single aspect of the existing spec should also satisfy "Is this the only way to do this thing?" question.
Have we made the decision?
My impression is that there's no need for an immediate in any hypothetical use. @rossberg is I think the only person we're waiting for updated thoughts from.
My point still stands I believe. Implicit naming of nestable constructs is a basic structural design mistake. I gave an example. What would be the advantage of artificially crippling the instruction that way?
Note that specifying exception to rethrow was capable using exnref
in the previous (meaning the spec before this CG meeting) spec, so we are not introducing any new functionalities.
@rossberg Not having an index makes the instruction size smaller for the extremely common case. And having an index would be redundant because it can easily be encoded through existing constructs.
@rossberg Not having an index makes the instruction size smaller for the extremely common case.
Barely. And it requires the toolchain to do some unnecessary code transformation in the other cases, increasing complexity and code size there.
@RossTate:
Not having an index makes the instruction size smaller for the extremely common case.
In this thread you seem to be simultaneously arguing that the entire instruction is useless and that it is gonna be extremely common. I have a hard time reconciling both these claims.
I believe the use cases for the instruction could just as easily be served by throw
. But if we're going to have such an instruction, it sounds like it will be used in a very particular pattern, namely the following:
(catch $exn
(local.set $exn_payload)
(br_if $exn_handler (some_test (local.get $exn_payload)))
(rethrow)
)
If that's the case, then that suggests we should have catch
rethrow the exception whenever the end is reached (instead of having a rethrow
instruction). Once we do that, then catch
and unwind
are almost identical except that one is only used for certain exceptions and provides the payload. (This change has the added benefit of making the type annotation on try
unnecessary, though actually taking advantage of that is a separate discussion.)
Taking things a step further, languages like Python modify the exception as it goes up the stack. Rather than having catch
implicitly store the payload for when the exception gets rethrown at the end, we could instead require the end of the block to provide a new payload. At this point, catch
is really doing unwinding with a payload (which is a more advanced form of unwinding), so we might as well rename it to unwind_tagged
or something.
Putting this all together, the pair of unwind
and unwind_tagged
has strictly more functionality than the quadruple of unwind
, catch
, catch_all
, and rethrow
while at the same time likely being a more compact encoding of the common patterns. (Sorry, just thought of this idea.)
@RossTate Umm... Can we please not turn everything upside down again and bikeshed new instructions two days after we passed a new spec...? Brainstorming and bikeshedding language design can be fun, but I think we really need to settle down and ship things.
Also, I think the approach when designing new instructions and changing already existing and implemented instructions should be different. We can't spend endless time bikeshedding things, and I think changing existing things require a stronger argument than designing things first time unless there's a clear problem that people agree on.
(By the way, usage or rethrow
is not at the end of end
of catch
most time, as you suggest; it is only the case for destructor blocks, which will be replaced by try
-unwind
.)
My guess is that rethrow
will be a very rare instruction, even in a large program using exceptions everywhere, so I wouldn't worry too much about the code size. If we're concerned that rethrow
may need an immediate in the future, then we can do what we've done before here, and provide a required 0
immediate, which can be extended later if necessary.
@binji As I said in https://github.com/WebAssembly/exception-handling/issues/126#issuecomment-694152985, it's not rare at all; currently it's used all the time. The usage 2 will be replaced by try
-unwind
though. The usage 1 will remain until we actually have filters and two-phase unwinding.
And both 1 and 2 are not rethrowing but actually resuming. (In single-phase they are effectively the same.) C++ does not use rethrow
for actual "rethrowing", which is throw;
. But other languages might, and that's what @rossberg's use case suggests.
Sorry, I mean rare compared to the total number of instructions. But if you have data that would be useful -- what percentage of the instructions in the code section for a medium-sized module are rethrow
?
@binji I just checked a single file that contains ~35000 instructions and rethrow
was a bit less than 1%. But I think this will be reduced by more than 80% I think after we switch to try
-unwind
, so I wouldn't worry about the code size for it either. But without the immediate I think it is possible that the toolchain has to do some unnecessary code transformation, inserting branches and all. (C++ is not likely to need that though.)
Clarification request: which of these programs are supposed to be equivalent?
(try
(call $f)
catch $exn
(throw $exn)
)
(try
(call $f)
catch $exn
(rethrow)
)
(call $f)
@RossTate 2 and 3 will have the same throwing site. The two programs are different still, so not sure what the definition of equivalence is though.
Equivalence informs tools which program transformations/optimizations are valid. Inlining also changes throwing sites and stack traces. So it seems like either inlining functions that might throw exceptions is an invalid optimization, or the above three programs are equivalent. If the three programs are equivalent, then every program with rethrow
can be equivalently expressed without rethrow
.
Stack traces are usually used for debugging without optimizations turned on. In your example two throwing sites are very close, but if rethrowing happens later in another caller up in the call stack, the stack traces will be vastly different. Your argument that inlining can make things the same is like, all stack traces that ever exist can be the same if all functions are inlined into main
. Not sure if that makes sense.
And as I said, I think now is the time to actually make things work and ship things. We made a new spec two days ago. We can still consider changes, but I think they should satisfy one of these:
Like @RossTate, I'm not totally convinced that rethrow
is necessary. However, I also don't think it's a significant complication to the proposal. AFAIK, the implementation in V8 currently embeds stack traces in thrown exceptions. I know those stack traces do not necessarily match the stack trace semantics of any particular source language, but I think it's plausible that they might be useful for debugging regardless of that. I don't know for sure because we don't have any concrete users of this proposal yet. Since those stack traces are a debugging tool that is not visible in Wasm semantics, in practice it's ok to perform optimizations like inlining that change them, even if in theory it is not.
Given all that, I think a reasonable course of action would be to keep rethrow
, taking the risk that it turns out not to be useful. A similar line of reasoning suggests that we should add the immediate index to rethrow
as well. Again, it may not turn out to be useful, but it's also not a large cost, so the expected value of adding it just in case and moving on with implementation seems higher to me than the expected value of continuing to disagree about it.
I feel we need to be explicit about the expectations on stack tracing. When I asked language implementers for advice on exception handling, a common sentiment was to not have stack tracing as a default. It apparently substantially slows down the use of exceptions for dynamic control flow. Most implementations of efficient dynamic control flow either use two-phase approaches (so that they can collect the stack only after they find out the exception is uncaught) or do not collect the stack trace (except for possibly keeping the portion of the stack above the last handler/unwinder in tact).
As an example, for stack switching we think it makes sense to build on this proposal as it provides a mechanism for dynamic control flow. But it would be problematic if doing so causes a stack trace every time, an operation that could easily dominate the time of the stack switch itself.
Plus, as I pointed out above, there are other ways to build and track a stack trace than putting it in the exception payload. The technique I mentioned above would give the same stack trace for all three programs (that I believe should be considered equivalent). Another is for the application to make the stack trace an explicit part of the payload (as an externref
) and import a function to collect the stack trace upon building the initial exception (and the application's JS layer could provide a different function to the import depending on whether the application is being debugged or not).
I think I'm repeating myself at this point, so I'll summarize my answers here one more time:
And one more thing... this issue was about the immediate argument and I hoped this issue to focus on that. Whether we need rethrow
at all is a much bigger request for change, which we have not considered in all three versions of the spec we made, so while I don't think we need to make big changes anymore unless there's a really good reason, I think it would've been better if you made a separate issue post.
Fair request. I've moved discussion of whether to have rethrow
to #127.
We left out
rethrow
s immediate argument in the last CG meeting's poll because we weren't sure about the use cases, but @rossberg suggested one in https://github.com/WebAssembly/exception-handling/issues/125#issuecomment-693436340.I think having this functionality is in general good as long as there exist languages that
rethrow
instruction for their rethrowing functionality, i.e., want to relay some auxiliary info when rethrowingWhat do you think?