rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.81k stars 12.77k forks source link

Tracking issue for RFC 2046, label-break-value #48594

Closed Centril closed 2 years ago

Centril commented 6 years ago

This is a tracking issue for RFC 2046 (rust-lang/rfcs#2046).

Steps:

Unresolved questions:


Note from shepard: This is a very long thread. If you're just looking for the 2022 conversation in which this ended up stabilized, jump to the report @ https://github.com/rust-lang/rust/issues/48594#issuecomment-1186213313

SoniEx2 commented 6 years ago

Using "return" would have interesting implications for labeled ? (tryop questionmark operator thingy).

est31 commented 6 years ago

Use return as the keyword instead of break?

@mark-i-m and @joshtriplett have already spoken out against return, but I'll join in given that it is still apparently an unresolved question.

break (and continue) are only usable with a loop. If you can break something, you can continue it. (I don't think there's an obvious syntax to pick for continue on a block.)

In C, C++, Java, C#, JavaScript and probably more languages you are usually breaking a switch statement in order to prevent fall-through. Rust has solved this much nicer with | in patterns but people coming from those languages won't really see break as something only for for loops. Especially as Java and JavaScript expose the feature via break as well and not return.

The "rules to remember" argument works into the other direction really well however. From what I can tell it is a commonality of the mentioned languages as well as Rust that return only applies to functions and nothing else. So if you see a return, you know that a function is being left.

Labeling a block doesn't cause errors on existing unlabeled breaks

First, I think this happens very rarely as the labeled break feature is admittedly not something that will be used 10 times per 1000 lines. After all, it will only apply to unlabeled breaks that would cross the boundary of the block, not unlabeled breaks inside the block. Second, users of Rust are accustomed to complaints / error messages by the compiler, they will happily fix them! Third (this is the strongest point I think), if instead of labeling a block you wrap it into a loop, you already need to watch out for unlabeled breaks and there is no error message that conveniently lists the break statements, you've got to hunt for them yourself :).

nikomatsakis commented 6 years ago

Especially as Java and JavaScript expose the feature via break as well and not return.

This to me is the killer point. break from blocks is a thing in many languages. return from blocks...not so much.

Centril commented 6 years ago

Personally, I share @joshtriplett's view on using break instead of return, but it seemed to me like the discussion hadn't been resolved on the RFC... If you believe the question is resolved in the lang team, tick the box with a note =)

est31 commented 6 years ago

Just saying that I'm working on this. Don't need mentor instructions. Just to not duplicate any efforts. Expect a PR soon.

scottmcm commented 6 years ago

I'm still in favour of return over break, but I can agree to disagree here. Question resolved.

topecongiro commented 6 years ago

Currently (with rustc 1.28.0-nightly (a1d4a9503 2018-05-20)) rustc does not allow unsafe on a labeled block. Is this expected?

scottmcm commented 6 years ago

@topecongiro Yes, I believe it's intentional that this is currently allowed only on plain blocks. It might change in future, but given that this is such a low-level and unusual feature, I'm inclined towards that being a feature rather than a restriction. (On the extreme, I certainly don't want else 'a: {.)

mark-i-m commented 6 years ago

Definitely agree. Unsafe + unusual control flow sounds like something to discourage.

In a pinch, though, you could use:

'a: {
unsafe {...}
}

Right?

SoniEx2 commented 6 years ago

Actually, altho else does create a new lexical scope, it's not a block. The whole if-else is a block (kinda). So no, you wouldn't get else 'a: { you'd get 'a: if ... else {.

eddyb commented 6 years ago

else contains a block (expression). there is no "new lexical scope" without blocks. An even worse surface syntax position than else would be 'a: while foo 'b: {...}. (interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Centril commented 6 years ago

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

That's a great observation!

SoniEx2 commented 6 years ago

I think labels should be part of block-containing expressions, not blocks themselves. We already have precedent for this with loop. (As it happens, a plain block itself is also a block-containing expression. But things like if and loop are block-containing expressions without being blocks I guess.)

(Things like while or for shouldn't support label-break-value, because they could or could not return a value based on whether they complete normally or with a break.)

mark-i-m commented 6 years ago

@eddyb

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Only if break 'b re-checks the loop condition...

eddyb commented 6 years ago

@mark-i-m It's equivalent to 'a: while foo {'b: {...}}, the break wouldn't check the loop condition, the loop itself would, because the loop condition is checked before each iteration of the body block.

mark-i-m commented 6 years ago

Woah, I find that highly unintuitive. I expect break 'b to be basically goto 'b, meaning we never exit the loop body and the condition is not checked again...

mark-i-m commented 6 years ago

Oh :man_facepalming: I see...

mark-i-m commented 6 years ago

This is why I don't like labeled break/continue :/

rpjohnst commented 6 years ago

Well, we specifically don't have the ability to label these weird inner blocks, so I don't see the problem. break always means "leave this block" and, given the above restriction, there's no way for that to mean anything other than "goto the spot after the associated closing brace."

mark-i-m commented 6 years ago

My confusion was not specific to weird inner blocks, but I don't really want to reopen the discussion. That already happened and the community decided to add it.

SoniEx2 commented 6 years ago

Okay, I understand accessibility is a big issue with programming languages... however, labeled break is extremely useful if you write code like me.

So, how can we make labeled break more accessible?

mark-i-m commented 6 years ago

So, how can we make labeled break more accessible?

That's a great question. Some ideas I had:

As a first (admittedly biased) sample, my last (and first) encounter with labeled break in real code was not stellar: https://github.com/rust-lang/rust/pull/48456/files#diff-3ac60df36be32d72842bf5351fc2bb1dL51. I respectfully suggest that if the original author had put in slightly more effort they could have avoided using labeled break in that case altogether... This is an example of the sort of practice I would like to discourage if possible.

rpjohnst commented 6 years ago

That's... not labeled break?

scottmcm commented 6 years ago

Regarding how break and continue desugar into this, that was also mentioned in the original RFC discussion by the RFC's author; see https://github.com/rust-lang/rfcs/pull/2046#issuecomment-312680877

eddyb commented 6 years ago

I suppose the name break is suboptimal, but it is well-established for loops. A more "principled" approach would be to use the syntax return 'a value, which immediately continues execution "after" the block 'a, using the value value for that block.

SoniEx2 commented 6 years ago

@mark-i-m "not using a feature because it's not accessible" isn't "making said feature accessible".

How can we tweak labeled break so I get the full expressive power of the language while at the same time you stop complaining that your brain can't process flow control things as well as the compiler?

(Also, the code you linked doesn't seem to do with RFC 2046/label-break-value... did you perhaps link the wrong code?)

mark-i-m commented 6 years ago

That's... not labeled break?

(Also, the code you linked doesn't seem to do with RFC 2046/label-break-value... did you perhaps link the wrong code?)

That's true, it was a normal labeled continue before I changed it, but I think the same problems exist (and are possibly worse since the control flow of the rest of a routine may be affected by what value you return).

@mark-i-m "not using a feature because it's not accessible" isn't "making said feature accessible".

How can we tweak labeled break so I get the full expressive power of the language while at the same time you stop complaining that your brain can't process flow control things as well as the compiler?

Sorry, I don't mean to complain. If I am the only person who feels this way, I don't mind stepping aside.

IMHO, this is a fundamental problem with labeled break/continue: it's too expressive, and the only way I know to mitigate it is to have recommended usage as "good style" (whatever that means). For example, "only use labeled break with value from the beginning or end of a loop body (not the middle)." This would mean that possible ways to exit a loop with a value are easy to spot and reason about.

SoniEx2 commented 6 years ago

This is how I avoid goto/labeled break in other languages: https://gist.github.com/SoniEx2/fc5d3614614e4e3fe131/#file-special-lua-L4-L72

Is that more readable?

If so, perhaps we can figure out some sort of conditional system based on labeled blocks. Similar to this, maybe.

rpjohnst commented 6 years ago

The point of unlabeled break and continue is to provide for those cases where you can't easily put the condition/value in the loop header. Some loops are simply far more straightforward, readable, fast, etc. with the break in the middle.

The point of labeled break and continue in loops is similar- sometimes the only alternative is to introduce a new variable, a new function (thereby abusing return), or some other contortion that only makes things harder to follow, however unfortunately-convoluted a labeled break may be.

But those two features are not what this thread is even about. They're fairly universal, precisely because of the improvements they offer for expressing inherently-complex control flow. This thread is instead about breaking out of a non-loop block. This is certainly more novel, and people may not know to look for a break outside of a loop, though requiring a label means the signal is still there once you know what it means.

This is what I meant about your example, @mark-i-m- it was a fairly standard use of a labeled loop, rather than a labeled block.

rpjohnst commented 6 years ago

A more "principled" approach would be to use the syntax return 'a value, which immediately continues execution "after" the block 'a, using the value value for that block.

Side note about break vs return: I prefer break because it's static control flow. return is dynamic, in that it goes back to the caller, which may be anywhere at all. It means "I've completed my responsibility, there's nowhere else to look to see what I do." break always goes somewhere that's lexically in scope, just as with loops.

Centril commented 6 years ago

I think return 'label expr reads really well from a "do what I say" perspective in that it can be thought of as "return expr to the location 'label". I don't think break 'label expr reads equally well in this perspective...

In isolation from other programming languages, I might have therefore advocated return 'label expr. However, given control flow in other languages, reusing return becomes suddenly a much less viable option and this sways me in favor of break 'label expr.

retep998 commented 6 years ago

I'm firmly of the opinion that it should be break 'label expr and not return 'label expr. Doing otherwise would be totally inconsistent with our existing usage of break 'label in loops.

mark-i-m commented 6 years ago

@SoniEx2 I think I do prefer the snippet you posted, largely because the variables are a good way of documenting loop invariants. OTOH, It might be possible to do the same with the names of labels (i.e. any time this labeled block is entered, invariant P holds). I guess it this is a place where it would be good to have a few more code samples from the wild...

Centril commented 6 years ago

Stabilization proposal

The feature was implemented in https://github.com/rust-lang/rust/pull/50045 by @est31 and has been in nightly since 2018-05-16 (+17 weeks) and so it has baked enough for stabilization. Furthermore, there have been no issues reported since the PR that implemented this was merged. All unresolved questions have also been resolved since, and there is strong consensus that break should be the surface syntax instead of return.

Thus, I move to stabilize label-break-value (RFC 2046).

@rfcbot merge

Report

TODO before FCP

@rfcbot concern FIXME-in-tests

The last test file currently has a FIXME:

// FIXME: ensure that labeled blocks work if produced by macros and in match arms

This should be resolved before stabilizing. I wrote some tests to check that the expected behavior wrt. the FIXME is actually implemented:

// run-pass

#![feature(label_break_value)]

#[test]
fn lbv_match_test() {
    fn test(c: u8, xe: u8, ye: i8) {
        let mut x = 0;
        let y = 'a: {
            match c {
                0 => break 'a 0,
                v if { if v % 2 == 0 { break 'a 1; }; v % 3 == 0 } => { x += 1; },
                v if { 'b: { break 'b v == 5; } } => { x = 41; },
                _ => {
                    'b: {
                        break 'b ();
                    }
                },
            }
            x += 1;
            -1
        };

        assert_eq!(x, xe);
        assert_eq!(y, ye);
    }

    test(0, 0, 0);
    test(1, 1, -1);
    test(2, 0, 1);
    test(3, 2, -1);
    test(5, 42, -1);
    test(7, 1, -1);
}

#[test]
fn lbv_macro_test() {
    macro_rules! mac1 {
        ($target:lifetime, $val:expr) => {
            break $target $val;
        };
    }
    let x: u8 = 'a: {
        'b: {
            mac1!('b, 1);
        };
        0
    };
    assert_eq!(x, 0);
    let x: u8 = 'a: {
        'b: {
            if true {
                mac1!('a, 1);
            }
        };
        0
    };
    assert_eq!(x, 1);
}
// compile-fail

#![feature(label_break_value)]

fn lbv_macro_test_hygiene_respected() {
    macro_rules! mac2 {
        ($val:expr) => {
            break 'a $val;
        };
    }
    let x: u8 = 'a: {
        'b: {
            if true {
                mac2!(2);
            }
        };
        0
    };
    assert_eq!(x, 2);

    macro_rules! mac3 {
        ($val:expr) => {
            'a: {
                $val
            }
        };
    }
    let x: u8 = mac3!('b: {
        if true {
            break 'a 3;
        }
        0
    });
    assert_eq!(x, 3);
    let x: u8 = mac3!(break 'a 4);
    assert_eq!(x, 4);
}

Tests similar to these should be added before we move from proposed-FCP to FCP.

rfcbot commented 6 years ago

Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

nrc commented 6 years ago

@rfcbot concern cost-benefit

From the RFC FCP proposal:

Another group felt that the "cost benefit" of this feature to the language didn't quite make the cut. In other words, the increased complexity that the feature would bring -- put another way, the possibility that people would actually use it and you'd have to read their code, I suppose, as well as the overall size of the language -- wasn't commensurate with its utility.

I still don't think we should have this feature at all. Since it's been implemented has it been used a lot? In the cases that it has been used is it significantly worse to use functions? If there is a benefit here, does it outweigh the cost of making the language larger and more complex?

joshtriplett commented 6 years ago

@nrc I share the same concern. I do understand the argument for having it available so that macros can use it, but at the same time, I'd much rather not have this at all.

I don't want to retread arguments from the original RFC thread, but I do think that this is a reasonable point to ask about experiences with this feature. What uses has it seen?

SoniEx2 commented 6 years ago

hand-written parsers, mostly... (I like my hand-written parsers >.<)

would be more useful/easier to use with labeled-propagate (try_foo() 'bar?).

joshtriplett commented 6 years ago

@rfcbot concern use-cases

Summarizing some discussion on Discord: We'd like to see concrete use cases for this, from actual code, that don't come across more clearly when rewritten to not use this feature.

petrochenkov commented 6 years ago

~FWIW, my implementation of EXPR is PAT relies on break 'label value being available at least in AST, I don't know how the desugaring could work without it. Implementation of EXPR? in the compiler relies on break 'label value as well.~

~It's some kind of a basic building block for larger control flow features, so it may be needed to be implemented in the compiler anyway. So the "cost" here is probably just making it available in surface syntax.~

EDIT: I've totally misinterpreted the issue, I though this is about loop { ... break 'label value ... }, not block { ... break 'label value ... }. I never had a chance to try this one because I always forget that it's already implemented.

eddyb commented 6 years ago

@petrochenkov Talking to @joshtriplett on Discord, they pointed out they were worried about user-facing complexity, not the implementation of the language.

est31 commented 6 years ago

I think the complexity increase is minimal: for readers, it should be obvious what this means because the concept already exists in loops, etc. I'd otherwise have to use a loop with an unconditional break statement, which is less clear and in fact there is even a clippy lint (never_loop) about this. So I think that there is a benefit.

As for the use cases, that topic has come up in the RFC already. I pointed out my use case here. Also see the use case listed by @scottmcm direcly below. Maybe there are more in the thread, idk. @joshtriplett does that resolve the use case question?

withoutboats commented 6 years ago

I agree with @nrc and @joshtriplett & I also want to raise a process concern here: we tentatively accepted this RFC with an explicit caveat that stabilization was blocked on revisiting the questions that @nrc and @joshtriplett have raised, but @Centril's merge proposal doesn't mention this blocking concern at all, and treats this as a very standard "feature has baked" merge. I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

It was concerning to me in terms of our entire processes to see that this went 2 and a half days without the blocking concern being brought up, and with most team members having already checked their box. Its conceivably, since we no longer require active consensus of all members, that this could have entered FCP without the blocker even being raised. This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

joshtriplett commented 6 years ago

@withoutboats Yes, exactly. This makes me rather less inclined, in the future, to accept things on a "we'll XYZ during the stabilization process" basis, until we have some process in place that makes it exceedingly unlikely that'll be missed.

Centril commented 6 years ago

@withoutboats

I'm not blaming @Centril for this, but a process breakdown: if we're going to accept features tentatively with unresolved blockers on stabilization, we need to track those blockers.

Apologies nonetheless; I was unaware of the caveat, but I should have checked for such things nonetheless when I created the tracking issue.

This feels like a subversion of the prior agreement which led me to consense with merging the RFC, and I think its entirely caused by insufficient tracking of information.

I should have made an unresolved question for it; I believe the process mistake happened at that point. As for how to improve the process; I think it is important that unresolved questions make it to the text of the RFC; it becomes much harder to find them otherwise ;)


As for the fcp-merge proposal; I personally do think it would be useful for reasons of uniformity and for use by macros. However, if you believe it is too early to propose stabilization; feel free to cancel the proposal :)

joshtriplett commented 6 years ago

@est31

I'd otherwise have to use a loop with an unconditional break statement,

Or restructure the code to avoid either.

I pointed out my use case here.

Why not replace the break 'pseudo_return with return Ok(vectors);?

SoniEx2 commented 6 years ago

as I've mentioned here, this is useful for hand-written parsers (even without labeled-propagate (try_foo() 'bar?)).

label-break-value allows the easy imperativeization of otherwise functional code. generally, imperative code tends to be more readable than functional code.

est31 commented 6 years ago

Or restructure the code to avoid either.

Of course, Rust is turing complete. But restructuring might be a bit difficult. Overall, you can reject almost every sugar feature (and this is sugar feature) on the basis "you can just use the existing ways".

Why not replace the break 'pseudo_return with return Ok(vectors);?

Actually in this instance you are right, one can replace the break with a return Ok. But in other instances you might want to do processing afterwards, etc. The borrow checker works poorly across function boundaries, you can't factor every single such block into a function.

Anyways, I've broken my silence about commenting on language features through official means, and tbh I'm regretting it. All the same points, rehashed over and over again. This shit is a waste of my time, sorry. So don't expect any further comments from me.

joshtriplett commented 6 years ago

@est31 I really appreciate you providing details; thank you.

SoniEx2 commented 6 years ago

There's an accessibility issue to testing and using these things, due to the requirement of nightly Rust.

I target stable. I hope this can be solved some day.