folkertdev opened 4 weeks ago
In the given example, `break 'foo Some(42);` could be replaced by `Some(42)`. If the goal is to be able to break from a more nested block, maybe a general `break 'NAME VALUE`, not specific to `match`, could be used on arbitrary statements or blocks – the `'NAME` would stay mandatory to preserve the semantics of `break VALUE`, which exits the innermost loop.
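For reference, a labeled block can already be exited with a value in today's Rust (stable since 1.65); a minimal sketch of that baseline, using a made-up `condition` parameter:

```rust
fn labeled_block_break(condition: bool) -> Option<u8> {
    'outer: {
        if condition {
            // `break` with a label and a value already works for labeled blocks.
            break 'outer Some(42);
        }
        None
    }
}
```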
Concerning the state machines implementation mentioned as a motivation, wouldn't that also be covered as part of the explicit tail call RFC? Both RFCs propose to change the language syntax, but the "explicit tail call RFC" seems more generic, as it covers any case of tail call optimization. However, it requires the state machine to be in a standalone function, or at the beginning of a function.
In the given example, `break 'foo Some(42);` could be replaced by `Some(42)`. If the goal is to be able to break from a more nested block, maybe a general `break 'NAME VALUE`, not specific to `match`, could be used on arbitrary statements or blocks – the `'NAME` would stay mandatory to preserve the semantics of `break VALUE`, which exits the innermost loop.
You're right, in this case the `break` is there because I needed some way to highlight that `break` can also target a labeled match.
Concerning the state machines implementation mentioned as a motivation, wouldn't that also be covered as part of the explicit tail call RFC? Both RFCs propose to change the language syntax, but the "explicit tail call RFC" seems more generic, as it covers any case of tail call optimization. However, it requires the state machine to be in a standalone function, or at the beginning of a function.
The section on guaranteed tail calls has some detail on why tail calls are useful (rust should have them!) but not always the best solution:
Also, in terms of ergonomics, labeled match and guaranteed tail calls make very different tradeoffs that work well in some cases and less well in others.
I'll be honest, I really don't like this. Having a match that can implicitly act like a loop is very confusing, although I understand the desire to simplify this kind of construct.
I would much prefer if this were some kind of dedicated construct like `loop match`, rather than just an ordinary match statement. `loop match` to me would not need explicit `continue`s, replacing all expressions with `continue`s until it hits a `break`. Having a match which defaults to `break`s but still allows `continue`s is just confusing.
And just adding onto this: my main issue is that it should be immediately obvious from any statement whether it loops. So, seeing a large `match`, label or not (note: there can be labeled blocks, and maybe eventually labeled statements, so the label is not a key giveaway), that can loop back on itself is very confusing. You should have to explicitly say that it can loop if you want looping.
I agree that "a match can be a loop now" is the most unintuitive part of this proposal. I think any proposal tackling this problem will not be entirely obvious at first though.
This feature is unlikely to show up early on in a beginner's journey (e.g. how often do we really see a labeled block in the wild?). The requirement of a label on both the `match` and the `break`/`continue` tries to make the logic, and the fact that something special is going on, as explicit as possible. In combination with the guide, I believe that should be sufficient to get folks unstuck.
The RFC specifically uses labeled loop/block syntax and intuition to make the feature as non-invasive as possible. Expanding the language further is something I'd like to avoid if possible, but I'm open to it if the lang team prefers that route.
See, the thing is that labelled blocks can't loop. The only things that loop are loops.
I get not wanting to "change the language too much", but I think that subtle syntax with massive changes in functionality is a much bigger change than obvious syntax. It requires people to expand their mental model substantially more when reading Rust code, whereas something like `loop match` does not.
Plus, it's very easy to imagine making `'loop: match x { ... }` simply a shorthand for `'loop: { match x { ... } }` (I believe there's even an autofix for this), and this version could not support `continue`s with values in the same way.
I agree with @clarfonthey that this syntax looks unnecessary, including that `loop match` must trash the result of the `match`.

Yet if folks want this, then the syntax should be `loop break match { /* match body using continue */ }`, because
let mut tmp = initial;
loop {
    break match tmp {
        // match body using assignments plus continue
    }
}
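As a complete sketch of that shape in today's Rust (the `u8` states and the `Option<u8>` return type are only for illustration):

```rust
fn state_machine() -> Option<u8> {
    let mut tmp = 1u8;
    loop {
        // The match either produces a value (which `break` carries out of the
        // loop) or diverges via `continue` to run the match again.
        break match tmp {
            1 => { tmp = 2; continue; }
            2 => { tmp = 3; continue; }
            3 => Some(42),
            _ => None,
        };
    }
}
```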
I am not so sure this actually proposes to just add labeled matches.
I think it proposes much more: to ascribe a very specific evaluation order to patterns. Otherwise I don't think it works, because without the order being very strict, I don't think the jump-to-block-like control flow works correctly. And currently we don't have a well-specified order: https://github.com/rust-lang/reference/issues/1665
@folkertdev "This makes the language more complex" is "free space" in terms of drawbacks. You may wish to include, rather, that it is possible that adopting this proposal may require us to forfeit possible ways to evaluate patterns that may be more optimal in some cases. Especially ones that are not zlib's use-case.
In other words, there is not yet evidence that this cannot negatively impact the optimization of programs that don't use this feature.
- a `match` can be labeled: `'label: match x { ... }`
I continue to think that adding labels in more places with special behaviour like this is overall bad. Rust syntax is best when there's a "here's what's coming" warning up-front, which is why it's good that `loop` is there at the beginning of code that's about to loop. A label doesn't do that, especially since today it's impossible for a label to by itself allow code to run multiple times that couldn't already -- a label allows leaving a block early, but only loops allow re-running it.
If the goal is
should make it possible to write efficient state machines
then my suggestion is still that we should have a dedicated state machine construct, not try to force it into a `match`, especially since
Complex state machines are niche
and thus there's no need to have particularly-terse notation for it.
Spitballing: what about some builtin macro like
finite_state_machine! {
    'start: {
        continue 'two;
    }
    'two: {
        continue 'three;
    }
    'three: {
        break Some(42);
    }
}
where the blocks can, like `let`-`else`, insist on being `!`-typed so you have to end with `continue`/`break`/`return`/etc?
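One way the hypothetical macro could expand into today's Rust, assuming it generates a hidden state enum and a plain loop (all names here are illustrative):

```rust
fn expanded() -> Option<u8> {
    // Possible expansion target for the macro sketched above.
    enum State { Start, Two, Three }
    let mut state = State::Start;
    loop {
        match state {
            State::Start => state = State::Two,   // `continue 'two;`
            State::Two => state = State::Three,   // `continue 'three;`
            State::Three => break Some(42),       // `break Some(42);`
        }
    }
}
```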
EDIT: Aha, finally found one of the recent zulip threads about this, https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Fallthrough.20in.20Match.20Statements/near/474070351
(That example is an uninteresting use of the feature, but it's the same as the one in the RFC summary.)
Then it's clear up-front that this is weird stuff that you have to read differently from normal blocks, there's an obvious wall we can use to say which labels you're allowed to use for this, etc.
And there's no `_ => None` arm like in the summary example from the RFC, which was completely unreachable, so I didn't understand what it was there for.
I think I'm basically echoing what @clarfonthey said above, so I'll repeat it for emphasis:

I think that subtle syntax with massive changes in functionality is a much bigger change than obvious syntax
New syntax is fine. We have identifier space reserved to do it non-breakingly in current editions, then change to something else later if needed. But if it's rare, we don't necessarily even need to do that.
So I completely disagree with the RFC's phrasing of
no new keywords or syntactic constructions are needed
as an advantage. It's much easier to learn something with a `macro.finite_state_machine.html` page in the docs, say, than something that you don't even know what to search for since you don't know what it's called.
EDIT: For clarity, I do think something in this space would be a good thing for Rust to have. I just think that `'foo: match` is the wrong way to spell it.
@workingjubilee
I think it proposes much more: to ascribe a very specific evaluation order to patterns. Otherwise I don't think it works, because without the order being very strict, I don't think the jump-to-block-like control flow works correctly. And currently we don't have a well-specified order: rust-lang/reference#1665
The idea is that we do at compile time what would be done at runtime anyway: compute the next location to jump to. The process is short-circuited, but not changed. What suggests to you that evaluation order would change?
Also strictly speaking the specifics of the codegen approach are not part of the RFC itself, but they seemed important to talk about anyway because it's at least half of the justification for the feature (the other half being improved ergonomics for state machines and translating C-style code)
@folkertdev "This makes the language more complex" is "free space" in terms of drawbacks. You may wish to include, rather, that it is possible that adopting this proposal may require us to forfeit possible ways to evaluate patterns that may be more optimal in some cases. Especially ones that are not zlib's use-case.
Again, I don't think the evaluation order is changed at all. The only change is that a double jump between blocks `a -> b -> c` is short-circuited to `a -> c` when possible.
In other words, there is not yet evidence that this cannot negatively impact the optimization of programs that don't use this feature.
Any code not using labeled match would be unaffected. When you use labeled match, you get the codegen it has. Also I can't think of any cases where an unconditional jump is worse for performance (though there might be cases where other optimizations get missed? Seems unlikely to me but theoretically possible).
`loop match` is cute, I like it.
I think it proposes much more: to ascribe a very specific evaluation order to patterns. Otherwise I don't think it works, because without the order being very strict, I don't think the jump-to-block-like control flow works correctly.
Hm, kinda. If the new state is fully specified, then jumping to the right block is a straightforward optimization on top of the obvious desugaring. If the scrutinee is partially known then optimizing the desugaring may indeed not work reliably:
loop match (&0, 0) {
    (0, 1..) => continue (foo(), 0),
    (_, 0) => break,
    _ => unreachable!()
}

// desugars+compiles to:
let mut state = (&0, 0);
loop {
    if *state.0 == 0 {
        if state.1 >= 1 {
            state = (foo(), 0);
            continue;
        }
    }
    if state.1 == 0 {
        break;
    }
    unreachable!()
}
we can't optimize the `continue` to `break` directly here, because there's an intermediate dereference that (afaik) we can't optimize away.
But we don't need to compile it to a desugaring: we could say "if the `continue` expression is sufficiently precise to match only one arm of the `loop match`, we will compile it to a direct jump to that arm. If not, all bets are off". Then it's "just" down to clever MIR lowering.
What I don't know is how to specify "the `continue` expression is statically known to have a given value". Do we allow `continue CONSTANT`? `continue if CONSTANT { Some(1) } else { Some(2) }`? `let x = 4; foo(); continue Some(x)`? I don't know where to draw the line.
We discussed this a bit with @traviscross, and found a sensible line: `continue <expr>` could be guaranteed to jump directly to the right arm if `expr` is const promotable, i.e. the same expressions that can be hoisted into a static (those such that `&<expr>: &'static _`). This makes it into a kind of constant-time-computed goto.
Then we can lint jumps that aren't of this form, either because the expression is not const-promotable or because it doesn't suffice to know which arm will be taken (e.g. if there's an arm guard).
I didn't see it discussed, but if we implement this as a MIR lowering property the borrow-checker would understand these direct jumps. This makes the feature pull more weight imo.
I don't think that's good enough for applications; I would want `continue State1(x, y)` to also jump to the right arm.
`continue <expr>` could be guaranteed to jump directly to the right arm if `expr` is const promotable
so then if I were to try to use this to write an interpreter it would have to be something like:
macro_rules! make_interpreter {
    (
        @variants $variants:tt $label:lifetime: loop match ($next_insn:expr) {
            $(
                $Variant:path => $block:block
            )+
        }
    ) => {
        $label: loop match $next_insn {
            $(
                $Variant => {
                    $block
                    make_interpreter! {
                        @variants $variants continue $label $next_insn
                    }
                }
            )+
        }
    };
    (
        $label:lifetime: loop match ($next_insn:expr) {
            $(
                $Variant:path => $block:block
            )+
        }
    ) => {
        make_interpreter! {
            @variants($($Variant,)+) $label: loop match ($next_insn) {
                $(
                    $Variant => $block
                )+
            }
        }
    };
    (
        @variants($($Variant:path,)+) continue $label:lifetime $next_insn:expr
    ) => {
        match $next_insn {
            $($Variant => continue $label $Variant,)+
        }
    };
}

#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
#[repr(u8)]
pub enum Opcode {
    Add,
    Const,
    Ret,
    JumpIfNotEq,
}

#[repr(C)]
pub struct Insn {
    pub opcode: Opcode,
    pub args: [u8; 3],
}

pub unsafe fn interpreter(mut pc: *const Insn, regs: &mut [u64; 16]) {
    macro_rules! reg_a {
        () => {
            regs[((*pc).args[0] >> 4) as usize]
        };
    }
    macro_rules! reg_b {
        () => {
            regs[((*pc).args[0] & 0xF) as usize]
        };
    }
    macro_rules! reg_c {
        () => {
            regs[((*pc).args[1] >> 4) as usize]
        };
    }
    macro_rules! imm_i16 {
        () => {
            i16::from_ne_bytes([(*pc).args[1], (*pc).args[2]])
        };
    }
    unsafe {
        make_interpreter! {
            'a: loop match ((*pc).opcode) {
                Opcode::Add => {
                    reg_a!() = reg_b!().wrapping_add(reg_c!());
                    pc = pc.add(1);
                }
                Opcode::Const => {
                    reg_a!() = imm_i16!() as u64;
                    pc = pc.add(1);
                }
                Opcode::Ret => {
                    return;
                }
                Opcode::JumpIfNotEq => {
                    if reg_a!() == reg_b!() {
                        pc = pc.offset(imm_i16!() as isize);
                    } else {
                        pc = pc.add(1);
                    }
                }
            }
        }
    }
}
Just to clarify since I think it's worth pointing out: I think that a looping match construct in general would be useful, and not just some dedicated state machine syntax. Sure, state machines would be the most common usage, but I can think of others too, like simplifying expressions or traversing trees. I've had plenty of cases where pattern matching is the top level part of a loop and I think it's a useful tool to have.
Why doesn't a MIR optimization for the pattern suffice? This looks like something which could be guaranteed for any `loop { match {} }`.
For example:
loop {
    match op {
        Op::Op1 => {
            // do work
            op = Op::Next;
        }
        ...
    }
}
I think we could even guarantee that it compiles into a computed goto on supported platforms, via MIR optimizations.

So why does this need a new language construct instead? Why can't this be done as a MIR optimization?
If you want guarantees, we could also expose an allow-by-default lint which would fire if this optimization wasn't done on a loop containing a match and nothing else.
Concern re: expressed optimization improvements: how is a user supposed to know if their code is better in a `loop` + `match` as currently available, or this new labeled match syntax? I don't think this is something we could ever express clearly in the reference because the guidelines for when this performs better seem very specific. In other words, designing syntax around the way that the backend currently happens to lower code seems fragile.
If there are specific `match` usage patterns that optimize better in the proposed form, it seems like we have a much better chance of detecting this correctly in the compiler vs. a user correctly guessing whether to use `loop` + `match` or labeled match. Or provide something as an intrinsic / `core::hint` function / `#[cold]`-esque attribute that gives power users control over this, without requiring migration to a new syntax to see if it helps in your case.
On to the ergonomics part:
Another way to write the example today is `ControlFlow`, which actually gets pretty similar to the proposed syntax:
use std::ops::ControlFlow::{Break, Continue};

fn emulate_labeled_match() -> Option<u8> {
    let mut state = 1u8;
    'foo: loop {
        let next = match state {
            1 => Continue(2),
            2 => Continue(3),
            3 => Break(Some(42)),
            _ => unreachable!(),
        };
        match next {
            Continue(c) => { state = c; continue; }
            Break(b) => break 'foo b,
        }
    }
}
That achieves the ergonomics that this proposal brings: there is no update to the state needed in every arm, and choosing the control flow is terse. The second match statement could even be packaged as a macro `action!(next, state, 'foo)`.
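A hedged sketch of what such a helper could look like (the `action!` name and shape are just an illustration, not an existing macro); it would be used inside the labeled loop above as `action!(next, state, 'foo)`:

```rust
// Turns a `ControlFlow` value into real control flow. Must be expanded inside
// a loop labeled `$label`, with `$state` as the mutable state variable.
macro_rules! action {
    ($next:expr, $state:ident, $label:lifetime) => {
        match $next {
            std::ops::ControlFlow::Continue(c) => {
                $state = c;
                continue $label;
            }
            std::ops::ControlFlow::Break(b) => break $label b,
        }
    };
}
```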
The compiler also already knows about `Continue` and `Break` since they are lang items, so it should be feasible to "desugar" (resugar?) the above into your proposed syntax (assuming it is possible to figure out that `Continue(c)` only ever gets assigned to `state`).
I think that an alternative to this RFC could be adding an easy way to turn an instance of `ControlFlow` into real control flow (`break`/`continue`). I would much prefer to see this because the benefits become available and useful for any loops - not just those that contain a single `match` statement. `ControlFlow` is pretty underpowered as it is; it would generally be nice to make it more useful.
I would want continue State1(x, y) to also jump to the right arm.
Oh yeah. Hm. Then something like "we take into account any tuple/struct/enum constructors and literals syntactically present in the `continue` expression, and also constants". I think it's a similar enough logic to static promotion ("only constructors and literals", iirc); I'm trying not to introduce a whole new notion into the language, but it's tricky.
Why doesn't a MIR optimization for the pattern suffice? This looks like something which could be guaranteed for any `loop { match {} }`
For a bunch of reasons. One is that we don't have these sorts of guarantees in the language today. Also, as explained previously, this is harder (and sometimes wrong) to do as an optimization if we want reliable guarantees. And doing it as an optimization means we can't get borrow checker benefits, which is too bad imo.
Concern re: expressed optimization improvements: how is a user supposed to know if their code is better in a loop + match as currently available, or this new labeled match syntax?
That's why I think it stands more on its own as a specific control-flow construct (with borrowck benefits) rather than an optimization. People will use it when it better expresses their intent, just like they use `match` over `if` today. People who care about perf will initially do that, then they'll measure and adapt accordingly.
Another way to write the example today is ControlFlow
Having guarantees in your example seems even harder than in the original proposal. How would you phrase the guarantees we want in your view?
Another point in favor of `loop match` as the new language construct is that it's very search-engine-friendly. If you're a new user and you find it in code, just pasting "loop match" into google will lead you to the answer.
For a bunch of reasons. One is that we don't have these sorts of guarantees in the language today. Also, as explained previously, this is harder (and sometimes wrong) to do as an optimization if we want reliable guarantees. And doing it as an optimization means we can't get borrow checker benefits, which is too bad imo.
Note that this RFC does not get borrow checker benefits. (Nor does it claim to, based on my reading of the RFC.) This is actually one of the reasons I'm not a big fan of this particular proposal, although I very desperately want rust to gain some primitive for doing irreducible control flow. To get borrow checker benefits I think you really need a syntactic form that connects the match arm to the jump, and the proposed syntax of `continue 'a <expr>` is just not direct enough IMO to be able to support borrow checking magic when `<expr>` matches some specific forms. (It gets awfully close to guaranteed constant propagation (for non-constants) visible in the type system a la typescript...)
To get borrow checker benefits I think you really need a syntactic form that connects the match arm to the jump, and the proposed syntax of `continue 'a <expr>` is just not direct enough IMO to be able to support borrow checking magic when `<expr>` matches some specific forms. (It gets awfully close to guaranteed constant propagation (for non-constants) visible in the type system a la typescript...)
Yeah, we should probably have a different syntactic expression to get the "direct jump" versus "do the match normally" behaviors, because if it affects borrowck this needs to be visible from reading the code. `continue <expr>` is such a good syntax though, not sure what else we could use.
Note that this RFC does not get borrow checker benefits. (Nor does it claim to, based on my reading of the RFC.)
This is not something I considered (so it's not currently part of the RFC), but you get the borrow checker changes for free, because of how the MIR is generated. e.g. this errors:
enum State { Foo, Bar, Baz, }

fn main() {
    let owned = Box::new(1);
    let mut state = State::Foo;
    loop {
        match state {
            State::Foo => {
                state = State::Bar;
            }
            State::Bar => {
                // or any function that moves the value
                drop(owned); // ERROR use of moved value: `owned`
                state = State::Baz;
            }
            State::Baz => {
                break;
            }
        }
    }
}
While if written using labeled match (or `loop match`, whatever syntax we settle on), the CFG can never reach `State::Bar` a second time:
'label: match State::Foo {
    State::Foo => {
        continue 'label State::Bar;
    }
    State::Bar => {
        // or any function that moves the value
        drop(owned); // all good now!
        continue 'label State::Baz;
    }
    State::Baz => {
        break 'label;
    }
}
And hence this program would be accepted by the borrow checker: cool! But also whether the optimization fires can determine whether a program passes the borrow checker.
To get borrow checker benefits I think you really need a syntactic form that connects the match arm to the jump, and the proposed syntax of `continue 'a <expr>` is just not direct enough IMO to be able to support borrow checking magic when `<expr>` matches some specific forms. (It gets awfully close to guaranteed constant propagation (for non-constants) visible in the type system a la typescript...)

Yeah, we should probably have a different syntactic expression to get the "direct jump" versus "do the match normally" behaviors, because if it affects borrowck this needs to be visible from reading the code. `continue <expr>` is such a good syntax though, not sure what else we could use.
Other proposals in this space have suggested to use multiple labels, one per block, and I think that makes a lot more sense than marking a match expression and then having to describe an arm in it.
And hence this program would be accepted by the borrow checker: cool! But also whether the optimization fires can determine whether a program passes the borrow checker.
Right, I think this is a non-starter: optimizations cannot affect borrow checker behavior, or else they are not optimizations. So either the real implementation will have to go to some extra work specifically to prevent this behavior from leaking, or else the RFC will have to "own it" and define precisely when the optimization which isn't an optimization actually kicks in.
what if you had to write `static continue 'a expr` to get the jump-directly-to-a-match-arm semantics (which requires being able to figure out which match arm to jump to at compile time, otherwise you get a compile error) and the borrow checking stuff; otherwise it behaves as though jumping to the match (and may optimize to a jump to a match arm, but that's not observable from language semantics)?
Other proposals in this space have suggested to use multiple labels, one per block, and I think that makes a lot more sense than marking a match expression and then having to describe an arm in it.
I never like these proposals because labels feel like a last-resort kind of feature in rust, not a first-class one. Plus with a match you get to pass values around without having to define a bunch of mutable variables.
what if you had to write `static continue 'a expr`

Yes, I like this! Or `const continue`, given that it does something like compile-time computation.
I never like these proposals because labels feel like a last-resort kind of feature in rust
but can you express computed goto using `loop match` (e.g. with an enum where its repr is a pointer to the block addresses in a function) where for branch prediction reasons you want every block to have a separate computed goto instruction?
That is needed for interpreters because CPUs can predict branches much better if every interpreted instruction kind has a separate computed goto: branch predictors use the address of the branch instruction to differentiate which predictions to give, allowing the branch predictor to learn more complex patterns since it gets more information.
Other proposals in this space have suggested to use multiple labels, one per block, and I think that makes a lot more sense than marking a match expression and then having to describe an arm in it.
I never like these proposals because labels feel like a last-resort kind of feature in rust, not a first-class one. Plus with a match you get to pass values around without having to define a bunch of mutable variables.
I tend to agree with this; I think it would look better with function-call syntax rather than using labels. But to be clear, I did mean blocks with parameters here (something which would be its own RFC needing appropriate syntax). There are plenty of design proposals for this linked in the previous discussions.
but can you express computed goto using `loop match` (e.g. with an enum where its repr is a pointer to the block addresses in a function) where for branch prediction reasons you want every block to have a separate computed goto instruction?
Didn't a certain @programmerjake demonstrate above how to do this using a macro? It seems kind of orthogonal, given that there is no first class support for computed goto in this proposal.
Didn't a certain @programmerjake demonstrate above how to do this using a macro?
not really, because the macro I wrote would produce N separate jump tables for the N match arms and take O(N^2) space/time. computed goto can do all of that without any jump tables at all and doesn't need a crazy O(N^2) macro.
I never like these proposals because labels feel like a last-resort kind of feature in rust, not a first-class one.
I agree with the sentiments on macro + label solutions. Specifically, I think the connection between control flow and data flow is important. E.g. with the label solutions it is more laborious to store the current state (so that you can later resume there), because labels are not first-class.
what if you had to write `static continue 'a expr`

This is great! I think I slightly prefer `const continue 'label expr`, but `static` could work too.
I think that a looping match construct in general would be useful, and not just some dedicated state machine syntax.
I beg to differ. Being able to write `loop match {}` instead of `loop { match {} }` is not a significant improvement in ergonomics and readability. But the state machine optimization is the main motivation of this RFC, and it is impossible to achieve in current Rust.

Moreover, I believe that the vast majority of `loop { match {} }` constructs are actually state machines, and would benefit from a more explicit syntax for state machines.
When your intention is to write a state machine, a macro called `state_machine!` would be much more explicit and readable than `loop match`.
I've pushed a number of updates to the RFC text. The main changes are summarized below.

The syntax has now been changed to `loop match`, as discussed.
The standard `loop match` is like a `loop { match { ... } }` (though it can potentially be used for e.g. computed goto in the future). To convey more accurate CFG information, state transitions can be annotated with `const`, e.g. this is accepted:
let x: u64;
loop const match true {
    true => {
        x = 42;
        const continue false;
    }
    false => {
        dbg!(x)
    }
}
Because the compiler can now know that all paths that lead to the `dbg!(x)` expression have initialized `x`, this is valid. Removing the `const` from either the `loop match` or the `continue` would give an error that `x` may not be initialized.
`const`

We've opted for a restrictive but straightforward criterion for what state transitions can be marked as `const`: the expression must be eligible for static promotion as introduced in RFC 1414. Notably, that excludes any partial patterns (where e.g. the outer constructor is known but inner fields are only known at runtime). That is unfortunate, but the static promotion rule is simple and covers most current cases.
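For illustration, the criterion in today's terms is whether a reference to the expression can be promoted to `&'static _`; the `State` enum below is hypothetical:

```rust
#[allow(dead_code)]
enum State { Done, Byte(u8) }

fn promotion_examples(n: u8) {
    // Promotable: built only from constants, no Drop and no interior mutability,
    // so a transition to these values could be marked `const`.
    let _a: &'static State = &State::Done;
    let _b: &'static State = &State::Byte(7);

    // Not promotable: `n` is only known at runtime, so the line below would not
    // compile, and the corresponding transition could not be marked `const`.
    // let _c: &'static State = &State::Byte(n);
    let _ = n;
}
```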
The `const`-marked state transitions can introduce irreducible control flow. As far as we've been able to determine, that is not a problem for borrow checking. Of course, once introduced, irreducible control flow cannot be removed, so adding it should not be taken lightly.
Tweede Golf and Trifecta Tech Foundation are committed to working on improving state machine codegen, though the pace will depend on whether we can get some supplemental funding.
Some specific questions relevant to lang team review are:
I think there are compelling reasons to include it, but there is a path forward that provides just the codegen improvements without adding irreducible control flow to the surface language.
The triple-stacking of keywords in `loop const match` is not the prettiest. Yet all three keywords have their function. It could be `loop match const`, or `loop static match`, or whatever. Syntax suggestions are easy, and everyone can contribute to discussions on syntax.
Ultimately, I'm interested in the backend changes, and not committed to a particular syntax. To me there are two particular requirements:

- `match`/pattern matching is essential
The triple-stacking of keywords in `loop const match` is not the prettiest. Yet all three keywords have their function. It could be `loop match const`, or `loop static match`, or whatever. Syntax suggestions are easy, and everyone can contribute to discussions on syntax.
I don't think this is a serious problem. We already have `pub const unsafe extern fn`; the only problem would be if reordering the keyword soup changes semantics. If there's only one valid order we can have a helpful diagnostic, as we do for the keywords that come before `fn`.
I think the proposed syntax is confusing because the argument to `loop match` appears to be a constant, but can actually change.
If I see `match true { false => ..., true => ... }` then I would expect the `true` branch to always be taken... because it's written as a constant.
In other words, I think the syntax should hint at the fact that there is a hidden state variable being introduced.
For example:
loop match = true {
    ...
}
Indicates that some kind of state initialization is happening.
Rendered
This RFC adds `loop match`:

- `loop` and `match` can be fused into a `loop match <scrutinee> { /* ... */ }`
- `loop match` can be targeted by a `continue <value>`. The `value` is treated as a replacement operand to the `match` expression.
- The state transitions (going from one branch of the match to another) can be annotated with the `const` keyword, providing more accurate CFG information to the backend. That means:

The goal of `loop match` is improved ergonomics and codegen for state machines. Rust, being a systems language, should make it possible to write efficient state machines, and currently falls short. Complex state machines are niche, but foundational to many programs (parsers, interpreters, networking protocols).

This RFC follows in part from work on zlib-rs. Today, we simply cannot achieve the same codegen as C implementations. This limitation actively harms the adoption of Rust in high-performance areas like compression.
Basic example

A `loop match` is similar to a `match` inside of a `loop`, with a mutable variable being updated to move to the next state. For instance, these two functions are semantically equivalent:
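The RFC's own code is not reproduced in this thread; a minimal sketch of the kind of pair it describes, showing the proposed `loop match` syntax (not valid Rust today) next to a plain `loop` + `match` equivalent:

```rust
// Proposed syntax, for illustration only (does not compile today):
//
// fn with_loop_match() -> Option<u8> {
//     loop match 1u8 {
//         1 => continue 2,
//         2 => continue 3,
//         3 => break Some(42),
//         _ => break None,
//     }
// }

// Today's equivalent, with an explicit mutable state variable:
fn with_loop_and_match() -> Option<u8> {
    let mut state = 1u8;
    loop {
        match state {
            1 => state = 2,
            2 => state = 3,
            3 => break Some(42),
            _ => break None,
        }
    }
}
```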
Interesting example

The real power of `loop match` lies in giving the compiler more accurate information about the control flow of a program. Consider a state machine written with today's `loop` plus `match`, like the sketch below.
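The elided example is in all likelihood the same three-state program the RFC author posted earlier in this thread; repeating that sketch here for context:

```rust
enum State { Foo, Bar, Baz }

fn main() {
    let owned = Box::new(1);
    let mut state = State::Foo;
    loop {
        match state {
            State::Foo => {
                state = State::Bar;
            }
            State::Bar => {
                // or any function that moves the value
                drop(owned); // ERROR: use of moved value: `owned`
                state = State::Baz;
            }
            State::Baz => {
                break;
            }
        }
    }
}
```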
Reading the code, it is obvious that state moves from `Foo` to `Bar` to `Baz`: no other path is possible. Specifically, we cannot end up in `State::Bar` twice, and hence the generated "use of moved value" error is not a problem in practice. This program is valid, but nonetheless rejected by the rust compiler.
With `loop const match` and `const continue` the compiler now understands the control flow:
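A sketch of the same program in the updated syntax (proposed, not compiling today), adapting the labeled-match version from earlier in the thread to `loop const match` and `const continue`:

```rust
// Proposed syntax, for illustration only.
enum State { Foo, Bar, Baz }

fn main() {
    let owned = Box::new(1);
    loop const match State::Foo {
        State::Foo => {
            const continue State::Bar;
        }
        State::Bar => {
            // or any function that moves the value
            drop(owned); // fine: this arm is provably reached only once
            const continue State::Baz;
        }
        State::Baz => {
            break;
        }
    }
}
```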
Rendered
Tracking: