patrickbkr commented 10 months ago

What should happen when a LEAVE phaser contains a return statement?

Given these four snippets, what should each of those print?

1.

sub s() {
    LEAVE return 5;
    return 7;
}
say s()

2.

sub s(&code) {
    LEAVE return 5;
    code();
}
sub s2() {
    s({
        return 7;
    });
    return 1;
}
say s2()

3.

sub s() {
    LEAVE return 5;
    7;
}
s()

4.

sub s(&code) {
    LEAVE return 1;
    code();
}
sub s2() {
    s({
        return 7;
    });
}
s2()

patrickbkr commented 10 months ago

With https://github.com/MoarVM/MoarVM/pull/1785 and https://github.com/MoarVM/MoarVM/pull/1782 merged current rakudo prints:

5
1
5
1

I especially wonder about 2. and 4. It's comprehensible what's happening and why, but feels like a WAT. Maybe it's simply a DIHWIDT, similar to what next does here:

sub s() { next; }
for ^10 { s(); say $_; }

lizmat commented 10 months ago

In my opinion a return in a LEAVE phaser shouldn't be different from a return in any other phaser, e.g. ENTER or FIRST. With the new Raku grammar, FIRST appears to be doing that apparently already:

sub a { FIRST return 42 }; dd a;  # 42

So in my book, the return values would be: 5, 1, 5, 1

lizmat commented 10 months ago

Or we should disallow return from phasers, period. And make it report an appropriate execution error (or possibly a compile time error in RakuAST).

vrurg commented 10 months ago

There is a problem with cases 2 and 4. Both has to be 7. And here is why.

Return from a block applies to the closure where the block belongs to. This is why .map({ return 7 }) works as expected. From this point of view it doesn't matter what happens inside of sub s and what it returns – but return within the block must win.

Or we should disallow return from phasers, period.

Better not.

lizmat commented 10 months ago

Ok, thinking about this more: I change my answers to 5, 5, 5, 1. And here's why:

In case 2, since &code is a Block, the return is applied to s and not to s2. However, that return implies that it is leaving the scope of s, and thus its LEAVE phaser will fire causing it to return 5.

I don't see an issue with 4. The same logic as 2. applies here.

vrurg commented 10 months ago

@lizmat you're wrong. Letting a block return from s2 via invoking &code makes the code in s2 unpredictable and will require additional complications like checking if it's a block or not in order to prevent the block from breaking our behavior. Not to mention that it wouldn't be backward compatible:

sub foo(&c) { 
    my \rc = &c(); 
    say "foo rc=", rc.raku; # no output
    rc 
} 
sub bar() { 
    foo { return 1 }; 
    return 2  
}
say bar; # 1

I will remain in the position that a block must return from its closure.

patrickbkr commented 10 months ago

I think it will help to explain what Rakudo currently does technically. Looking at the 2nd example:

sub s(&code) {
    LEAVE return 5;
    code();
}
sub s2() {
    s({
        return 7;
    });
    return 1;
}
say s2()

s2 is entered
s is entered
&code is entered
return 7 is called
throwpayloadlexcaller is called (https://github.com/MoarVM/MoarVM/blob/main/src/core/interp.c#L4908C3-L4908C3)
A CONTROL handler is searched in the lexical scope of return 7. (MVM_exception_throwpayload and then search_for_handler_from)
The handler of s2 is found and a frame unwind (following the caller chain) is started targetting the handler in s2 (MVM_frame_unwind_to)
During that unwind the following two frames are unwound in order: {return 7}, s()
The exit handler of the s() frame is seen. The unwind process is paused (not aborted, only paused), and the LEAVE block is executed.
return 5 is called
throwpayloadlexcaller is called
A CONTROL handler is searched in the lexical scope of return 5.
The handler of s is found and a frame unwind (following the caller chain) is started targetting the handler in s
During that unwind only the LEAVE return 5 frame is unwound.
The handler in s() is reached. s() returns and hands the value 5 to s2().
Execution continues in s2().
The returned value 5 is sunk.
return 1 is called. Same process as above.
1 is sayed

I think the behavior is consistent in that once I understood that &returns are effectively exceptions that search handlers in the outer chain and then unwind the caller chain, the resulting behavior is very comprehensible and expected. Departure from the described behavior will mean adding additional logic to the behavior of stack unwinding. So there would be more logic to know about to fully grasp what's happening.

@vrurg I believe the above behavior is interesting, because your requirement

Return from a block applies to the closure where the block belongs to.

is upheld. (I strongly support this requirement.) But the return 5 in s still manages to interrupt the unwind of return 7 with lasting effect.

Just to state a possibility of how we could adapt the above logic: We detect the situation where a return (return 5 in the above code) interrupts the unwind of some other return (return 7). Then

if both exceptions are CONTROL {
    if target handlers of both exceptions are equal { # This means the returns are in the same lexical scope
        ... do interrupt the first unwind and proceed with the new return value
    }
    else {
        ... let the first unwind continue, ignore the second `return`
    }
}

This would have the consequence that the results would be:

5
7
5
7

I'm undecided if I like this or not. It limits action at a distance: The return 5 in s doesn't change the effect of return 7 in s2. But it also adds action at a distance: The return 7 in s2 does change the behavior of the return 5 in s.

lizmat commented 10 months ago

All of this explanation that is needed to explain behaviour, is convincing me that a return should NOT be allowed in a phaser. Sadly, there is no way to reliably enforce that at compile time.

But we could at run-time by codegenning a CONTROL block in any phaser block. Conceptually:

LEAVE {
    say "bye";
    return 42;
}

would be codegenned as:

LEAVE {
    CONTROL {
        die "return within a LEAVE phaser" if $_ ~~ CX::Return
    }
    say "bye";
    return 42;
}

Of course, if the return were to be directly in the phaser's body, we could make that a compile time error as well.

Since phasers cannot be inlined anyway, I don't think this will affect performance of phasers much.

lizmat commented 10 months ago

To actually not break any code out there that is depending in old "return in phaser" behaviour, we could limit this CONTROL codegenning to 6.e.

vrurg commented 10 months ago

I'm undecided if I like this or not. It limits action at a distance: The return 5 in s doesn't change the effect of return 7 in s2. But it also adds action at a distance: The return 7 in s2 does change the behavior of the return 5 in s.

What would it change in return 5 behavior? The returned value is not going to be used by definition. So, we can sink it immediately and for the code in s it will go unnoticed.

The noticeable change is that immediate termination of return procedures will prevent a potential CONTROL block in s from being invoked. Perhaps, this can be easy to fix by adding a special if branch which would try the path of seeking for the CONTROL block in the lexical scope without initiating a new unwind.

So, I currently see no cardinal changes that might break s semantics or user expectations about it.

All of this explanation that is needed to explain behaviour, is convincing me that a return should NOT be allowed in a phaser.

While not being fully against this prohibition, I'd not appreciate it. Besides, consider that CATCH and CONTROL are phasers too. Returning a value from CATCH is often an important way to handle an error.

Sadly, there is no way to reliably enforce that at compile time.

Again, hope this won't be needed. But technically, an explicit return can be detected as soon as:

the current lexical context belongs to a phaser (the compiler knows it)
the current lexical scope doesn't belong to a routine (pointy block included)

Both being True means this return is going to cause a trouble.

But we could at run-time by codegenning a CONTROL block in any phaser block.

Even if we need it, I wouldn't be happy about extra implicit code injection. A VM can be more efficient in implementing this. And more universal.

But either way, we haven't considered all possibilities of optimizing the case. The fact that any return within frames, affected by unwind, is not going to be used makes a lot of things easier, as I see it. All we need is to make sure all related CONTROL blocks are fired up.

patrickbkr commented 10 months ago

Adding case 5.

sub s(&code) {
    return 0;
    LEAVE {
        code()
    }
}

sub s2() {
    s({
        return 1;
    });
    return 2;
}

say s2();

This is in some sense the inverse of case 2. (I don't see how we can distinguish the two cases, i.e. to make case 2. give 7 and case 5. give 1.)

Current Rakudo (with the two PRs applied) prints 1.

vrurg commented 10 months ago

Adding case 5.

It makes no difference with 2. return 1 wins, so it makes 1 eventually.

More interesting the case of s itself. Say, &code is not a block but a routine returning a value. Let it be the 1 we already return. Then we have two options:

no explicit return makes the LEAVE block mute and only the side effects of it are what's important
since LEAVE is the last statement and is executed after the return 0 its return value is the return value of s

But the second option loses for two reasons. First, the return 0 statement is explicit. Second, it would make it harder to learn the rules governing returns from routines.

patrickbkr commented 10 months ago

@vrurg I try to explain some more why I think case 2. vs case 5. is difficult.

case 2.

sub s(&code) {            # target of second return
    LEAVE return 5;       # second return -> collision
    code();
}
sub s2() {                # target of first return
    s({ return 7 });      # first return
    return 1;
}
s2()

case 5.

sub s(&code) {          # target of first return
    return 0;           # first return
    LEAVE { code() }
}

sub s2() {              # target of second return
    s({ return 1 });    # second return -> collision
    return 2;
}
s2()

Let's consider the point during execution marked with "collision".

In case 2 that's return 5. At that point some other return is already in progress (return 7). That other return targets a different routine (s2()).

In case 5 that's return 1. At that point some other return is already in progress (return 0). That other return targets a different routine (s()).

When working with the information given in the previous two paragraphs, the outcome is either 5 and 1 or 7 and 0. I don't see what additional information could be used to help.

vrurg commented 10 months ago

Hope, we're not discussing different things here. Also hope that you're not looking at the problem from the perspective of implementation.

Separate the concerns. Separate the sources of s and s2. Best if you separate them into modules from different developers. Now, from the point of view of s2, should it care about what happens inside s? It clearly should not. Because in either case 2 or case 5, the block submitted to s returns from s2. This is what I would see as a developer of s2.

Think of it from the day-to-day development. The author of s must remember to document that it's sub is using a LEAVE phaser which is using return. I matters not if one or the other is not used, but when together – we need to document it.

The author of s2 must remember, that s has a phaser which is using a return. No other routine from this module doesn't, but this one does. Ah, and that the return in LEAVE only fires if this, and this, and, perhaps, that condition are all met. And then there is a couple of other modules, and a routine in one of them has similar behavior. But which one??

That's why I say 2 and 5 are the same. Because all what matters here is s2 as the consumer.

patrickbkr commented 10 months ago

Also hope that you're not looking at the problem from the perspective of implementation.

Maybe "inspired by"... Can't prevent that.

Separate the concerns. Separate the sources of s and s2. Best if you separate them into modules from different developers. Now, from the point of view of s2, should it care about what happens inside s? It clearly should not. Because in either case 2 or case 5, the block submitted to s returns from s2. This is what I would see as a developer of s2.

I see what angle you are coming from and I agree.

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it?

vrurg commented 10 months ago

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it?

I would tentatively agree. Not sure if we miss no nuance here.

patrickbkr commented 10 months ago

I have added an implementation of the "deeper unwind wins" approach in https://github.com/MoarVM/MoarVM/pull/1786. Spec tests pass including those in https://github.com/Raku/roast/pull/851

patrickbkr commented 10 months ago

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it? Not sure if we miss no nuance here.

We do. Always picking the return that unwinds deeper basically disables &return everywhere in respective phasers. That messes up all but the simplest code. It's a miracle that PR managed to get the spec tests to pass.

patrickbkr commented 10 months ago

Another idea:

Before calling any exit handler during unwind, we mark the target frame as this frame is returning and save the return value in the target frame (in frame->extra->exit_handler_result). We could also do that right away on every &return, but that would make all &returns pay the performance penalty.
On every implicit return (that's all returns, not just &return), check the caller frame for the returning marker. If found, then unwind the current frame, restore the previously saved value and do another unwind. That's the equivalent of two implicit returns in a row (+ restoring the right return value).

I think the semantics are correct and the approach to have frames in a "returning" state feels like it matches the core of the issue.

I see issues though:

We add one if to every (implicit and explicit) return. This is not good for performance. Probably negligible, but I test this yet.
From an opcode perspective, is this behavior too much magic for throwpayloadlexcaller and the return_*?
We make return fundamentally more complex. Returning is involved in inlining and specialization. This makes all of that more complex. I think this point is the most problematic one.

patrickbkr commented 10 months ago

Yet another idea: We tune the target frames return_address to point at the return handler. To retrieve the correct value, we'll change lastexpayload to check for the frame->returning flag and if set, retrieve the value saved in frame->extra->exit_handler_result instead of tc->last_payload.

patrickbkr commented 10 months ago

I have implemented the last idea in https://github.com/MoarVM/MoarVM/pull/1788, it turned out rather nice. I have also added another test to https://github.com/Raku/roast/pull/851. It passes all those tests. I haven't done a spec test (need to add a JIT implementation to https://github.com/MoarVM/MoarVM/pull/1788 first), but this looks very promising.

patrickbkr commented 9 months ago

JIT is in. So we now have a working implementation. How do we proceed? I.e. how do we reach a conclusion?

I see two possible options:

The new semantics (summarized as: the ~~first return~~ last return in a given routine always wins) are accepted.
We keep the current semantics (return exceptions are not prioritized, last called wins).

I tend to not like:

Prohibit &return in phasers. (Even though I was the one to originally propose it.)

patrickbkr commented 9 months ago

Ping @vrurg, @lizmat.

lizmat commented 9 months ago

To summarize: there's an implementation now that implements: "first return wins", and it's spectest clean?

patrickbkr commented 9 months ago

To summarize: there's an implementation now that implements: "first return wins", and it's spectest clean?

~~Yes.~~ <-- That was wrong. It's last return in a routine that wins.

lizmat commented 9 months ago

The PR is still marked as draft. Maybe it shouldn't? :-)

patrickbkr commented 9 months ago

I can change that. But: We should decide if those semantics are what we actually want. From the above discussion I don't have the impression that we have reached a consensus yet.

So, can we first decide that we want those semantics, then go forward with the implementation. Right?

On February 25, 2024 3:13:45 PM GMT+01:00, Elizabeth Mattijsen @.***> wrote:

The PR is still marked as draft. Maybe it shouldn't? :-)

-- Reply to this email directly or view it on GitHub: https://github.com/Raku/problem-solving/issues/417#issuecomment-1962954284 You are receiving this because you authored the thread.

Message ID: @.***>

lizmat commented 9 months ago

+1 for first return wins.

We already have a similar behaviour regarding exit:

$ raku -e 'END exit 42; exit 1' 
$ echo $?
1

niner commented 9 months ago

I know I'm late to the party, but anyway.

S06 says "The C<return> function notionally throws a control exception that is caught by the current lexically enclosing C<Routine> to force a return through the control logic code of any intermediate block constructs. (That is, it must unwind the stack of dynamic scopes to the proper lexical scope belonging to this routine.)"

It says this about ENTER/LEAVE/KEEP/UNDO/etc. phasers: "These phasers supply code that is to be conditionally executed before or after the subroutine's C<do> block (only if used at the outermost level within the subroutine; technically, these are added to the block traits on the C<do> block, not the subroutine object). These phasers are generally used only for their side effects, since most return values will be ignored. (Phasers that run before normal execution may be used for their values, however.)"

The first quoted paragraph makes me inclined to suggest "lastest wins" semantics for return. After all return throws a control exception and with exceptions in general it's that any exception occurring after the original exception (e.g. in the exception handler) will become the dominant one that get's passed up to handlers. This matches what I would expect.

But then S06 is quite explicit about phasers, especially LEAVE being about their side effects. So now I'm a bit torn.

lizmat commented 9 months ago

being about their side effects

Could also be interpreted as "don't allow return in a phaser" ?

patrickbkr commented 9 months ago

@niner From all I've seen in this discussion I'm pretty confident to say that lastest win semantics are effectively non-composable and the only safe move is this to never use return in LEAVE. We'd be better off prohibiting return in LEAVE.

Reading through those snippets you provided, I suspect at the time people didn't realize that combining LEAVE and return can happen and what issues that poses. They provide a good implementers POV though.

On February 25, 2024 8:15:12 PM GMT+01:00, niner @.***> wrote:

I know I'm late to the party, but anyway.

S06 says "The C function notionally throws a control exception that is caught by the current lexically enclosing C to force a return through the control logic code of any intermediate block constructs. (That is, it must unwind the stack of dynamic scopes to the proper lexical scope belonging to this routine.)"

It says this about ENTER/LEAVE/KEEP/UNDO/etc. phasers: "These phasers supply code that is to be conditionally executed before or after the subroutine's C block (only if used at the outermost level within the subroutine; technically, these are added to the block traits on the C block, not the subroutine object). These phasers are generally used only for their side effects, since most return values will be ignored. (Phasers that run before normal execution may be used for their values, however.)"

The first quoted paragraph makes me inclined to suggest "lastest wins" semantics for return. After all return throws a control exception and with exceptions in general it's that any exception occurring after the original exception (e.g. in the exception handler) will become the dominant one that get's passed up to handlers. This matches what I would expect.

But then S06 is quite explicit about phasers, especially LEAVE being about their side effects. So now I'm a bit torn.

-- Reply to this email directly or view it on GitHub: https://github.com/Raku/problem-solving/issues/417#issuecomment-1963033737 You are receiving this because you authored the thread.

Message ID: @.***>

patrickbkr commented 8 months ago

This discussion seems to have stalled.

Let's summarize the options again:

The last return in a routine always wins. Returns in other lexical scopes have no say. That's the result of the discussion of lizmat, vrurg and me. An implementation exists. See the tests for how the implementation exactly behaves.
First return wins. All later returns have no effect. With the exception of a corner case (A routine with a single &return in its LEAVE block. I think that's a bit dubious.), that's similar to silently disabling returns in LEAVE. I think it would be clearer to instead prohibit &return in LEAVE (see 3.).
Prohibit &return in LEAVE.
Return exceptions are not prioritized, last thrown wins. That last return also determines where we are returning to. That's the current behavior. I think most, maybe all in this discussion agree that this is undesired.

@lizmat, @niner: I gave wrong information a few posts earlier, I mixed things up. There is no implementation of first return wins.

I'm in favor of 1.

Everyone in this discussion (@niner @vrurg @lizmat): Can you give a quick info where each of you stands? I think it's possible there isn't even a disagreement here.

vrurg commented 8 months ago

I still stand for 1, no changes here.

patrickbkr commented 7 months ago

@niner, @lizmat Ping! Can you restate your positions? See https://github.com/Raku/problem-solving/issues/417#issuecomment-1984532814

lizmat commented 6 months ago

I can live with 1. However, that is different from the behavior of exit, where the first exit determines the exit value.

niner commented 6 months ago

Re-reading this thing I'm with @vrurg that the correct result would e 5, 7, 5, 7. According to the design docs I quoted a return returns from the lexical scope it's contained in, i.e. s2. The LEAVE phasers in s' would still win and change the return value. But it'ss1`'s return value that they change which is in both cases just ignored! Again, the return in the LEAVE is returning from the sub that is the return's lexical scope. In the number 4 example it's not as obvious that the return value would be ignored, but if you imagine that as

sub s2() {
    my $result = s({
        return 7;
    });
    return $result;
}

then it's more obvious that because of the return 7 we never even get to set $result and thus s's return value overwritten by the LEAVE phaser is irrelevant.

So yes, a return in a LEAVE would change the return value of the sub it's located at. Callers and their return values must not be affected (other by what value they get from the call).

raiph commented 6 months ago

@patrickbkr

I've not been a participant in this thread (and I'm not a core dev) so there's that. However I've been repeatedly revisiting this thread to try understand it since January. I felt I should be able to understand what it was discussing, and felt it was important to me that I did, and that I should be able to engage before decisions were made even if I chose not to. I almost arrived at a point of comfort with the pending decision without commenting, but not quite.

In summary, presuming confirmation of my understanding below, I'm with the consensus (5,7,5,7 for the original four examples, 5,7,5,7,1 if one includes the one you added, and with the tests you linked for option 1).

In particular, my understanding is that, when you wrote:

From all I've seen in this discussion I'm pretty confident to say that lastest win semantics are effectively non-composable and the only safe move is this to never use return in LEAVE. We'd be better off prohibiting return in LEAVE.

You weren't talking about option 1.

I'd appreciate confirmation of that.

And regarding this:

Reading through those snippets you provided, I suspect at the time people didn't realize that combining LEAVE and return can happen and what issues that poses.

Your suspicion was/is moot (and, FWIW, not one I shared).

Plus I think @niner's related "torn" comment ("LEAVE being about their side effects") reflected them keeping their thinking explicitly open but still grounded in the verbiage they're referencing.

I note that that verbiage was qualified: "These phasers are generally used only for their side effects, since most return values will be ignored.". This is consistent with what I think you've arrived at, which is that they're ignored unless the return value comes from combining both of these:

An explicit &return in a LEAVE.
That &return mattering because the returned value matters. (Which it doesn't in some scenarios -- as @niner wrote "The LEAVE phasers in s1 would still win and change the return value. But it's s1's return value that they change which is in both cases just ignored!".)

I'd appreciate you commenting on this second half of my comment if it seems wrong to you.

patrickbkr commented 5 months ago

@raiph Thanks for your feedback!

In particular, my understanding is that, when you wrote:

From all I've seen in this discussion I'm pretty confident to say that lastest win semantics are effectively non-composable and the only safe move is this to never use return in LEAVE. We'd be better off prohibiting return in LEAVE.

You weren't talking about option 1.

Correct, I was talking about option 4.

This is consistent with what I think you've arrived at, which is that they're ignored unless the return value comes from combining both of these:
* An explicit `&return` in a `LEAVE`.

* That `&return` mattering because the returned value matters. (Which it doesn't in some scenarios -- as @niner wrote "The LEAVE phasers in `s1` would still win and change the return value. But it's `s1`'s return value that they change which is in both cases just ignored!".)
I'd appreciate you commenting on this second half of my comment if it seems wrong to you.

I believe we agree here.

patrickbkr commented 5 months ago

We've finally reached consensus. Accordingly I have just removed draft marker from the PR implementing the desired behavior (MoarVM#1788).

For completeness sake, there are four PRs related to this issue:

I'd like to get all four of them merged in the not too far future.

patrickbkr commented 5 months ago

I believe the interactions with other exception types need some thought as well. I guess &return in LEAVE shouldn't hijack in progress exception throws, right?

Intuitively I'd say that if a non-return exception is in progress, a return in LEAVE should never take over.

So is the distinction return exception <-> any other exception sufficient? Or are there other exception types that need considering as well?

raiph commented 5 months ago

is the distinction return exception <-> any other exception sufficient? Or are there other exception types that need considering as well?

Perhaps you can, for this PR, ignore the NYI leave (because it may never be implemented -- and if it is, then consideration of it can come then). If so, skip the rest of this comment.

Presumably the ordinary semantics of leave 3 would -- if it was implemented -- be that the innermost enclosing block would yield 3, as if the value of the last statement in that block was 3.

And presumably, even if a block containing a leave 3 was part of an enclosing closure that had been passed to a callee, the leave 3 would still have the same effect as before, even if there was a LEAVE phaser with a &return in the callee? For example:

sub foo (&code) { LEAVE return 1; print 2; code }
sub bar { foo { print do { leave 3; print 4 } }
print bar;

Presumably that is expected to print 2431.

And, eliding the print do ...:

sub foo (&code) { LEAVE return 1; print 2; code }
sub bar { foo { leave 3; print 4 } }
print bar;

Presumably that is expected to print 241 because it's semantically the same as:

sub foo (&code) { LEAVE return 1; print 2; code }
sub bar { foo { print 4; 3 } }
print bar;

Right?

raiph commented 5 months ago

Intuitively I'd say that if a non-return exception is in progress, a return in LEAVE should never take over.

I've got almost no intuitions. I don't know if that means my thoughts are useless or helpful. Two scenarios I thought about follow. I haven't tested them. My intent was to illustrate what I'm thinking about, so that's hopefully OK.

`leave`

I don't know if you've discussed leave (currently NYI) and/or have already decided it's out of scope, but in case it's to be considered:

sub foo (&code) {
    LEAVE { print 1; return 2 }
    code
}
sub bar {
    print foo { print do { leave 3 } }
}
bar;

`UNDO`

I don't know if you've discussed UNDO (which is included in its block's LEAVE queue) and/or have already decided UNDO/KEEP are out of scope, or will just not be allowed to use &return (which looks to me wise!), but in case UNDO is to be considered:

sub foo (&code, $result is copy) {
    CATCH { $result = Nil; .resume }
    UNDO { print 4; return $result but 5 }
    code
}
sub bar {
    print foo { print do { leave baz } }, 6
}
sub baz {
    X::AdHoc.new.throw if (True, False).pick;
    UNDO { print 7; return (8, Nil).pick }
    return (9, Nil).pick }

raiph commented 5 months ago

Have you read the original design doc verbiage related to this?

Excerpts from the last part of the Phasers section in S04:

Except for CATCH and CONTROL phasers, which run while an exception is looking for a place to handle it, all block-leaving phasers wait until the call stack is actually unwound to run. ... The stack is unwound and the phasers are called only if an exception is not resumed.

So LEAVE phasers for a given block are necessarily evaluated after any CATCH and CONTROL phasers. This includes the LEAVE variants, KEEP and UNDO. POST phasers are evaluated after everything else, to guarantee that even LEAVE phasers can't violate postconditions. ...

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks (also see "Definition of Success"). How this information is made available is implementation dependent.

An exception thrown from an ENTER phaser will abort the ENTER queue, but one thrown from a LEAVE phaser will not. ...

If a POST fails or any kind of LEAVE block throws an exception while the stack is unwinding, the unwinding continues and collects exceptions to be handled. When the unwinding is completed all new exceptions are thrown from that point. ...

The topic of the block outside a phaser is still available as OUTER::<$_>. Whether the return value is modifiable may be a policy of the phaser in question. In particular, the return value should not be modified within a POST phaser, but a LEAVE phaser could be more liberal.

patrickbkr commented 5 months ago

Idea: Let's distinguish CONTROL exceptions from CATCH exceptions. That's an already well established distinction.

Given the word "conflict" means: An exception thrown within a phaser during an unwind (a LEAVE) wants to unwind out of that phasers frame.

Then I propose the following rules:

A CATCH exception is in progress, a CONTROL exception conflicts. -> We ignore the CONTROL exception and continue with the CATCH exceptions unwind.
A CONTROL exception is in progress, a CONTROL exception conflicts. (That's what has been discussed at length in this issue.) -> We compare the target frames of the two exception handlers. The one unwinding more of the stack wins. If it's a tie, the later one (the one originating from within the phaser) wins. (This behavior is implemented in https://github.com/MoarVM/MoarVM/pull/1812.)
A CONTROL or CATCH exception is in progress, a CATCH exception conflicts. -> We add the exception to a list (located in the frame the current unwind targets). After the unwind finishes and before we run the handler, we throw all the collected exceptions. That's what S04 is proposing.

raiph commented 5 months ago

@patrickbkr

Given the word "conflict" means: An exception thrown within a phaser during an unwind (a LEAVE) wants to unwind out of that phasers frame.

I imagine your initial goal is to decide what to do for a LEAVE phaser in the scenarios you discuss.

Then, once you're confident there's core dev consensus for the behavior of a LEAVE phaser, you'll consider whether the behavior (for the same classes of conflicts that are identified for LEAVE -- presumptively the 3 classes you describe) are the same for the other LEAVE queue phasers (UNDO and KEEP).

A CONTROL or CATCH exception is in progress, a CATCH exception conflicts. -> We add the exception to a list (located in the frame the current unwind targets). After the unwind finishes and before we run the handler, we throw all the collected exceptions. That's what S04 is proposing.

Sounds sensible to me.

A CATCH exception is in progress, a CONTROL exception conflicts. -> We ignore the CONTROL exception and continue with the CATCH exceptions unwind.

The keywords listed in the doc as raising control exceptions are return, fail, redo, next, last, done, emit, take, warn, proceed and succeed.

To test my understanding of your proposal, I think the list of keywords whose corresponding control exceptions can't "conflict" as you've defined are something like return (!), redo, next, last, done, emit, proceed and succeed. Right? (When I say "can't" I mean I'm thinking things like they either logically can't, or can only do so given absurdist DIHWIDT coding that I'm not even sure is possible. I've not really thought through done and emit because I don't want to think about them. ;))

And presumably the ones that could "conflict" (though again, perhaps DIHWIDT applies) are at least take and fail. (What would warn do?)

I also wonder if the following should be considered before deciding what to do about these "conflicts":

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks ...

I mention that because I'm thinking that if that information has been made available to a LEAVE (is that already the case in Rakudo?) then that might alter the picture.

(Though perhaps one would just write separate KEEP and UNDO phasers if the picture is altered enough to matter?)

patrickbkr commented 5 months ago

To test my understanding of your proposal, I think the list of keywords whose corresponding control exceptions can't "conflict" as you've defined are something like return (!), redo, next, last, done, emit, proceed and succeed. Right? (When I say "can't" I mean I'm thinking things like they either logically can't, or can only do so given absurdist DIHWIDT coding that I'm not even sure is possible. I've not really thought through done and emit because I don't want to think about them. ;))

A simple example of a conflicting return:

sub s() {
    LEAVE {
        return 1 # <- wants to unwind to s()'s exit handler, that's outside of the `LEAVE` scope, conflicts
    }
    die "Sad"
}
say s();

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks ...

I mention that because I'm thinking that if that information has been made available to a LEAVE (is that already the case in Rakudo?) then that might alter the picture.

(Though perhaps one would just write separate KEEP and UNDO phasers if the picture is altered enough to matter?)

Actually I know little about phasers other than LEAVE. I kind of hoped to be able to evade having to dig into all those phasers to find a sensible solution to these issues. I'm probably naive.

One more unsolved issue: There are exceptions that can resume, some always do, e.g. emit. Intuitively I'd expect for an exception that "comes back" to continue to work in LEAVE irrespective whether a CATCH or CONTROL exception caused the LEAVE. But you can't know before hand whether an exception handler resumes or not. Maybe the solution is to forbid / ignore conflicting resumable CONTROL exceptions altogether.

Raku / problem-solving

LEAVE phaser with return #417

`leave`

`UNDO`