rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.72k stars 12.5k forks source link

tracking issue for default binding modes in match (RFC 2005, match_default_bindings) #42640

Closed nikomatsakis closed 5 years ago

nikomatsakis commented 7 years ago

This is a tracking issue for the "match ergonomics using default bindings mode" RFC (rust-lang/rfcs#2005).

Status: Awaiting stabilization PR and docs PR! Mentoring instructions here.

Steps:

Unresolved questions:

nox commented 6 years ago

I hope this RFC isn’t going to be stabilised any time soon, given no large project used it yet in anger, and given @petrochenkov mentioned earlier that he felt it was making ownership of values way less obvious.

Could someone write a tool to patch existing code to see what impact it would have on Servo?

I don’t think any of these changes should land without being tried out in Servo, after all this is why Servo doesn’t use stable.

pcwalton commented 6 years ago

This is similar to how &/&mut is inferred in closures. We used to have capture clauses and it turned out to not matter at all when they were eliminated. I don't really share the concerns.

nox commented 6 years ago

I do not see how this is similar, a closure will capture as &mut only if the value was already mutable or a &mut, it doesn't make bindings that are not mutable suddenly mutable.

pcwalton commented 6 years ago

Though I do agree that we shouldn't stabilize until big projects have used this.

nikomatsakis commented 6 years ago

Some minor annoying detail: if the binding mode is inferred to be by-reference, then "small" values like integers are bound by reference as well and you have to add dereferences.

Yes, this was an anticipated outcome. Our planned way to address this was through https://github.com/rust-lang/rust/issues/44619 -- basically, by making it so that it matters less if you have a &u32 vs a u32 (that distinction, in general, is kind of annoying, and arises in a number of other contexts -- e.g., for i in &slice { .. }, or slice.iter().filter(|i| ...)).

nikomatsakis commented 6 years ago

@nox

It seems that the RFC says that rustc may end up using a ref mut binding mode for values on which &mut methods are called, am I reading this wrong?

I believe you are. The binding mode does not depend on how the value is used: only on the type of the value being matched, and the pattern used to match it.

It is certainly possible though to get a ref mut binding mode:

fn foo(x: &mut Option<Vec<u32>>) {
    if let Some(v) = x {
        v.push(22); // modifies caller's vec
    }
}

Similar things are possible today. For example, if I swap the type of the input to Option<&mut Vec<u32>>:

fn foo(x: Option<&mut Vec<u32>>) {
    if let Some(v) = x {
        v.push(22); // modifies caller's vec
    }
}

In general, the operation of match is meant to be analogous to how into_iter works today. So indeed I can write the following function today, which uses the original argument type &mut Option<Vec<u32>>) but paired with a for loop, and also mutate the caller's vector in an analogous way:

fn foo(x: &mut Option<Vec<u32>>) {
    for v in x {
        v.push(22); // modifies caller's vec
    }
}

given @petrochenkov mentioned earlier that he felt it was making ownership of values way less obvious

I believe that @petrochenkov's exact words were that they "found the result slightly harder to read and to figure out what happens with ownership" (emphasis mine).

In general, it is true that you get somewhat less information from the patterns in terms of ownership -- but already that information could be misleading. For example, we saw above that a Some(v) pattern might mean that you own v, but it might also be that v is itself a borrowed reference (in which case, yes, you own the reference, but not the data).

I've found that this feature actually generally makes it easier for me to follow what's going on, because I can think more about "borrowed vs owned" and worry less about the particulars of the type. I quite like that Option<&mut Vec<u32>> and &mut Option<Vec<u32>> interact with matches in the same, analogous way, for example -- and I like having match work more like for.

In terms of being used with large projects, usage in the wild is always good! The feature is certainly being used in rustc, I'd like to hear about other people's experiences with it of course. (Often though waiting for the biggest projects to adopt can be tricky, since people wait to use nightly features until they are on stable.)

sgrif commented 6 years ago

Though I do agree that we shouldn't stabilize until big projects have used this.

Given that 100% of code written using this feature can be written without using this feature, I think it's a stretch to assume many large projects are going to use this.

I tried enabling this on Diesel's codebase, and replaced as much code as I could find (rg "match \*" is what I looked for) to take advantage of this feature. You can find the results here: https://github.com/diesel-rs/diesel/pull/1578

I encountered the issue that @petrochenkov described in https://github.com/rust-lang/rust/issues/42640#issuecomment-368163763 3 times. In all 3 cases, I opted out of this feature by continuing to match *expr. I did find it surprising that this occurred.

I would have thought that this feature would never infer ref for types which are Copy. While it's stumbling block that is smaller than our current situation, I do still think it's a stumbling block. While it's less common and less intrusive, I also think it's less clear why it occurred than the existing rules. Perhaps stabilizing this should at least wait until #44619 has been explored, and we're confident that we will be able to remove this gotcha backwards compatibly in the future.

That said, overall this enabled me to remove meaningless punctuation and refs from the codebase. Other than the cases where I wanted a copy, this feature let me say what I mean. I disagree that this somehow obscures ownership semantics. match *foo { Bar { ref baz } => qux(&**baz) } is not exactly clear to begin with.

scottmcm commented 6 years ago

It seems like we'd always have other explicit options (brainstorming: match foo { Bar { move baz } => or match foo { Bar { baz: i32 } =>), so I don't think we need to wait on #44619 here.

nikomatsakis commented 6 years ago

@sgrif Thanks for that exploration! Very helpful.

Perhaps stabilizing this should at least wait until #44619 has been explored, and we're confident that we will be able to remove this gotcha backwards compatibly in the future.

I was wondering about this. It might make sense -- that said, I don't particularly like the idea of having special rules around Copy types, particularly since just because something is Copy doesn't mean that you want to copy it (it could be large, or be something like a closure or copy type that is Copy). If we went that route, I think we would really need to have a distinct between "pod" (can be memcopied, but has its own identity and shouldn't be) and "copy" (no identity at all). We decided some time ago not to do that, for better or worse, and I think it'd be hard to back out from it now... though I can't precisely remember why we felt like this was a 1.0 issue that we couldn't change later... I guess because it would alter the trait hierarchy in ways that are not trivially backwards compat.

rfcbot commented 6 years ago

The final comment period is now complete.

leoyvens commented 6 years ago

Also afaik implementing Copy (and traits in general) is backwards compatible, we probably need autoref to be able to do stop inferring ref without breaking that property.

nikomatsakis commented 6 years ago

OK, there were a few concerns raised in the FCP. We discussed in the lang-team meeting and decided to go forward with the stabilization, but I wanted to record the concerns and the responses to them for posterity:


@nox pointed out that this feature allows ref mut bindings when matching against &mut values (by design). This means that -- compared to today's matches -- there is "one less use of mut" to create a mutable reference that affects other variables. However, as I described here, there is precedent for this in for loops and match statements in other, similar scenarios.

Similarly, @petrochenkov noted that using this feature sometimes makes it less clear where borrowing is happening, but others reported opposite experiences.


@nox raised the concern that the feature has not been used "in the wild" enough, but there are a number of experience reports, including an "at-scale" deployment on Diesel, so we feel comfortable going forward.


@petrochenkov pointed out that the implied ref bindings are annoying around copy types. This is certainly true, and was known in advance as a shortcoming of the RFC. However, that problem too arises in other contexts (e.g., foo.iter()), and the consensus from the RFC discussion was that the right way to handle this is by tackling the &i32 -> i32 problem head on. Moreover, as @leodasvacas pointed out here, having rules that depend on whether the matched value is Copy would interact poorly with semver.


Thanks all!

nikomatsakis commented 6 years ago

I'm adding the E-mentor tag... we are in need of a stabilization PR here! Instructions can be found on forge, and I'd be happy to help with any other concerns.

In addition, we will need to update documentation: cc @rust-lang/docs, any recommendations on what to do there?

steveklabnik commented 6 years ago

In addition, we will need to update documentation: cc @rust-lang/docs, any recommendations on what to do there?

I'll be opening a pre-rfc in the next few hours; tl;dr don't worry about it. I'll make sure it won't get lost.

Dylan-DPC-zz commented 6 years ago

@nikomatsakis HI. Can work on this :+1:

nikomatsakis commented 6 years ago

@Dylan-DPC great! You basically want to search through for the feature flag match_default_bindings, as described here.

kennytm commented 6 years ago

Note that there is still an ICE involving match_default_bindings: #46197.

nikomatsakis commented 6 years ago

Oh yeah, forgot about that. We should fix it.

cramertj commented 6 years ago

@Dylan-DPC Hey! Have you had a chance to look at this? 1.26 is supposed to branch soon, so if we want to get match_default_bindings as part of this release we need to have a PR open very soon. No pressure on you if you're too busy-- I just wanted to check in.

ErichDonGubler commented 6 years ago

@steveklabnik: Did you still want help with documenting this? I might be interested in contributing. :)

Dylan-DPC-zz commented 6 years ago

@cramertj haven't started it yet. will do this week

leoyvens commented 6 years ago

@cramertj Isn't it better to stabilize language features right after a new beta rather than right before? This way there is more time to fix potentially unnoticed flaws by testing it in the wider nightly ecosystem.

cramertj commented 6 years ago

@leodasvacas It is the case that the more time we have to spend with an unstable feature, the more time we have to find bugs in it. However, this approach is in opposition to our other goal of shipping features. This is maybe a bit of a blunt way to put it, but there's long been consensus that this feature is ready for stabilization, and I'd like to move forward on that basis.

@Dylan-DPC Thanks! However, I've gone ahead and made a PR for this myself, since we need to get it in today in order to ship as part of 1.26. If you're looking for other opportunities to contribute, there's lots of good issues under the E-mentor tag!

Dylan-DPC-zz commented 6 years ago

@cramertj No issues. Yah the labelling helps a lot :)

steveklabnik commented 6 years ago

Did https://github.com/rust-lang/rust/pull/49394 close this issue?

ErichDonGubler commented 6 years ago

@steveklabnik: Looks like this was stabilized, but the as-yet-unchecked items in "Unresolved questions" were punted to be fixed after shipping binding modes first?

Source: https://github.com/rust-lang/rust/pull/49394#issuecomment-376326024 and the response.

Boscop commented 6 years ago

Before we stabilize match_default_bindings, can we talk about this? https://github.com/rust-lang/rust/issues/50008

nikomatsakis commented 6 years ago

@Boscop

Before we stabilize match_default_bindings, can we talk about this? #50008

This is the case of copy data -- and yes, the current plan, as discussed in this summary -- is to improve this case via a coercion from &usize to usize.

Your suggestion, I believe, is to make &P patterns, when machting a non-reference type and in a "default binding mode" of ref, go back to a default binding mode of move. This way for (&a, b) in &vec would "copy out" from a but take b by ref; the fully explicit pattern would be for &(a, ref b) in &vec. I confess I have often tried to edit patterns in just the way you describe, due I think to habit in writing things like for &i in &vec. Still, it seems confusing (to me) to have & "reset' the default binding mode, rather than pairing off with a &T type -- and there is also the common belief among newcomers that & should mean what "ref" means to contend with (that is, when first reading your example, I thought you were attempting to do the opposite of what in fact you are attempting). It was also considered to add a move binding mode, which would allow one to write for (move a, b) in &vec, but that was confusing too.

So in short I still think that the coercion is the best approach.

rpjohnst commented 6 years ago

I am absolutely floored by the fact that default binding modes don't already work like @Boscop expects. That (apparently incorrect) understanding was part of why I came around to liking them.

Further, the &usize -> usize coercion was fairly controversial. It was kind of swept up in a larger umbrella issue https://github.com/rust-lang/rust/issues/44619, which notably does not include any work on implementing or experimenting with such a coercion. (It does include the less-controversial auto(de)ref on operators only, which already applies to methods, and the opposite coersion of autoref.)

So given that several of us keep running into the desire for & to "reset" the binding mode, I don't think the difference between that and "pairing with a &T" is all that confusing. You can easily think of default binding modes as transforming the types you're matching on, such that you are "paring with a &T." After all, a bare name does wind up having &T as its type.

Both "& resets the binding mode" and "auto(de)ref for operators" feel very economical- like they're filling in a hole and fulfilling a promise made by the language. Adding a more general &usize -> usize coercion doesn't. I would really like to see at least some experimentation with "& resets the binding mode."

Boscop commented 6 years ago

Both "& resets the binding mode" and "auto(de)ref for operators" feel very economical- like they're filling in a hole and fulfilling a promise made by the language. Adding a more general &usize -> usize coercion doesn't. I would really like to see at least some experimentation with "& resets the binding mode."

100% agree with this. Also, if &T constituents are basically coerced into T if T: Copy, that would lead to inconsistencies when it's a &mut T. That one should not be auto coerced to T (because one might want to mutate it) but (when not mutating) it should still be possible to move it out explicitly by writing &x (instead of the more verbose &mut x. That would be the most economic and consistent / intuitive solution imo. (Using the move keyword here would feel very inconsistent and counter intuitive.)

Boscop commented 6 years ago

default match bindings are already magic and auto coercion would make it even more confusing for newbies than explicitly moving out.

I confess I have often tried to edit patterns in just the way you describe, due I think to habit in writing things like for &i in &vec. Still, it seems confusing (to me) to have & "reset' the default binding mode, rather than pairing off with a &T type

It feels natural to try &x to move it out for experienced Rust users.

Btw, in my experience, any kind of magic that makes code more concise for experienced users obfuscates the code more for newbies, because magic is by its nature a special-cased introduced inconsistency that has to be learned in addition to the already complex normal set of rules. This is very apparent when math professors use a math notation that uses a lot of syntactic sugar to be more concise, rather than explicit. When learning, it's better to see all the details spelled out to internalize them, before moving on to hide them under syntactic sugar. Otherwise the understanding will only be cloudy.

As someone who knows a complex subject (like Rust) it's very difficult to imagine how someone who doesn't yet know it processes the information because it's become so second nature..

But I remember from math classes in uni that people seemed to understand the concepts better without syntactic sugar.

So while I think there is some merit to magic like default match bindings, I would not recommend anyone to use this feature who hasn't yet 100% understood the whole borrow checker and the difference between ref and &. Imo it's mostly useful for experienced users who want to write more concise code, but it doesn't make (this part of) Rust easier to learn for newbies.

And I wish there was more A/B testing and experiments by supervising people who are learning Rust before making decisions to change the language with the goal of making it easier to learn to confirm that those changes actually make it easier to learn...

So I think we should 1. recommend newbies not to use default match bindings but being explicit, 2. when using default match bindings, we should use &x to move out constituents because for experienced users it makes the most sense.

eddyb commented 6 years ago

I really like &Ctor(...field_pattern) and Ctor(...&field_pattern) being equivalent, which makes &Ctor(...ref field_var) and Ctor(...&ref field_var) (i.e. Ctor(...field_var)) equivalent. Mixing subpatterns with and without & chooses between "by-value" and "by-ref". cc @nikomatsakis

cramertj commented 6 years ago

Yeah, in fact this is the behavior I initially thought @nikomatsakis implemented here but then realized that the PR had taken a more conservative approach. I'm totally in favor of this change.

Boscop commented 6 years ago

@eddyb Yes, that consistency is why &x would make a lot more sense than the move keyword, or not even allowing this. And when I first heard of "default match bindings" for ergonomics, I assumed it would exhibit this symmetry. It would make "default match bindings" scalable to even deeply nested patterns by allowing fine grained control of which constituents one wants to move out.

@cramertj It seems in that thread you meant the same thing I was expressing above, which is another indication that this would be more intuitive / expected behavior (for experienced Rust users).

joshtriplett commented 6 years ago

My apologies if this message is redundant; I'm trying to catch up on the current state of things and I think I've missed some context across a few different threads, so I'm seeking some clarification.

If I understand the current state correctly, matching on an expression that has a reference type will allow you to write a pattern that matches out what looks like a non-reference but automatically becomes a reference rather than giving a type error? For instance, you can match on &Foo(expr) with a pattern like Foo(name) and get out a name that's a reference rather than a value?

But matching an expression that has a reference type will never let you extract a non-reference from it, and vice versa, matching an expression that has a non-reference type will never let you extract a reference from it (without writing ref)?

Is that accurate? And is there an intention to change that, or are we currently committing to not changing that? In particular, I think I'm confused by the same comment from @nikomatsakis that @cramertj referenced.

If that's accurate, then while I still wish we didn't have this at all, at least that's only breaking my internal typechecker but not actually making it easier to write code that has unexpected costs or correctness concerns.

petrochenkov commented 6 years ago

Looks like match_default_bindings was released as is on stable channel just 3 hours ago (https://github.com/rust-lang/rust/pull/50510).

withoutboats commented 6 years ago

But matching an expression that has a reference type will never let you extract a non-reference from it

This is not correct. It works as it always has - you destructure the reference with an &Foo pattern. Other people are talking about inserting a reference character where there isn't actually a reference type.

joshtriplett commented 6 years ago

@withoutboats Sorry, I should have phrased that more carefully. I meant "without specifying the &. So, for instance, you could match &x with a pattern of &y and end up with a non-reference y, but you couldn't match &x with a pattern of y and end up with a non-reference y. Right?

Boscop commented 6 years ago

@joshtriplett

If I understand the current state correctly, matching on an expression that has a reference type will allow you to write a pattern that matches out what looks like a non-reference but automatically becomes a reference rather than giving a type error? For instance, you can match on &Foo(expr) with a pattern like Foo(name) and get out a name that's a reference rather than a value?

Yes, that's with the match_default_bindings feature that was just stabilized.

But matching an expression that has a reference type will never let you extract a non-reference from it, and vice versa, matching an expression that has a non-reference type will never let you extract a reference from it (without writing ref)? Is that accurate?

Yes.

And is there an intention to change that, or are we currently committing to not changing that? In particular, I think I'm confused by the same comment from @nikomatsakis that @cramertj referenced.

AIUI, cramertj's, eddyb's and my intention is: When matching a &Foo(a, b, c, d) with a pattern like Foo(a, b, c, d) (where the constituents will be implicitly ref) to allow switching certain constituents to move out instead, by writing e.g. Foo(a, b, &c, d), which would still make a, b and d references but move c out instead (copy it). This would be analogous to e.g. writing for (i, &x) in v.iter().enumerate() { to cancel out the reference. Currently it's not possible to cancel out the "implicit ref" for certain constituents (which leads to confusion when trying it, because experienced Rust users would expect it to work). Currently, when matching a &Foo(a, b, c, d) with a pattern like Foo(a, b, c, d), all constituents can only be bound by "implicit ref". The proposed change would make match_default_bindings behavior more intuitive and more scalable to larger/nested patterns.

bstrie commented 6 years ago

Does the stabilization in the recent 1.26 release represent a complete implementation of RFC 2005? If not, what remains? If so, then shall we move discussion of further extensions into a new RFC?

nikomatsakis commented 6 years ago

@bstrie I believe this is all "new RFC" material, yes.

nikomatsakis commented 6 years ago

(And I think it is a backwards compatible extension)

eddyb commented 6 years ago

I'm not entirely sure this is backwards-compatible, because you can use Some(&x) currently to match &Option<&T> and it results in x: T, whereas the proposal would make x: &T.

rpjohnst commented 6 years ago

That's deeply unfortunate, how did we wind up with that? 😱

cramertj commented 6 years ago

I think we can avoid that by only performing the binding mode switch back to by-move when matching a non-reference with a reference pattern.

cramertj commented 6 years ago

Basically, just like what we did with the first iteration of default-match-binding-modes, we only change the case that doesn't compile at all today.

joshtriplett commented 6 years ago

@cramertj I'd like to suggest that "changing a case that doesn't compile to now compile" is not a heuristic we can or should apply universally. Sometimes it's the compiler's job to refuse to compile something and complain, so that code gets fixed.

In this particular case, if it works in some cases but not in others, and the demarcation between the two isn't extremely clear, or the behavior isn't consistent (which the case @eddyb mentioned suggests it wouldn't be), then that seems worse than simply giving an error about mismatched types. Just from @eddyb's description, I already feel like I can no longer successfully predict how matching will work in cases that involve references.

leoyvens commented 6 years ago

@rpjohnst It's the exact behaviour specified on the RFC.

rpjohnst commented 6 years ago

@leodasvacas I mean, we already had a fairly large misunderstanding about what exactly the RFC specified here. In fact I'm still not at all sure why it behaves this way. That is, here's my interpretation of the RFC, in which the second case doesn't match the implementation:

match &Some(&0) {
    // starts out as move; looking at &Some(&0)
    Some( // non-reference pattern; deref + switch to ref; now looking at &0
         x // reference pattern; mode stays ref; was looking at &0 so x: &&i32
          ) => ...

    // starts out as move; looking at &Some(&0)
    Some( // non-reference pattern, deref + switch to ref; now looking at &0
         & // reference pattern; mode stays ref; now looking at 0
          x // reference pattern; mode stays ref; was looking at 0 so x: &i32
           ) => ...
}

Or the desugaring:

match &Some(&0) {
    &Some(ref x) => /* x: &&i32 */
    &Some(&ref x) => /* x: &i32 */
}

This desugaring for the second case here (i.e. @eddyb's example) does give x: &i32, so that's clearly not what the implementation is actually doing. Where does the implementation's interpretation of the RFC differ from mine, and why? This looks like a bug to me right now.

sgrif commented 6 years ago

Is there any way to turn this off within a codebase? For libraries that support <1.26, this is going to be a pain point. CI ensures that we don't accidentally rely on it, but it'd be great if there were a way to make sure that builds fail locally for new contributors regardless of which Rust version they're using. Otherwise we'll just continue to get an increasing number of PRs which depend on this feature only to fail on CI