Destructuring assignment

glaebhoerl commented 10 years ago

Given

struct Point { x: int, y: int }
fn returns_point(...) -> Point { ... }
fn returns_tuple(...) -> (int, int) { ... }

it would be nice to be able to do things like

let a; let b;
(a, b) = returns_tuple(...);
let c; let d;
Point { x: c, y: d } = returns_point(...);

and not just in lets, as we currently allow.

Perhaps even:

let my_point: Point;
(my_point.x, my_point.y) = returns_tuple(...);

(Most use cases would likely involve mut variables; but those examples would be longer.)

sbditto85 commented 9 years ago

Not sure the best way to indicate a vote in the affirmative, but :+1: +1 for this

MattWindsor91 commented 9 years ago

Not sure how rust's RFC process goes, but I assume this needs to be written up in the appropriate RFC format first? I like it, mind.

arthurprs commented 9 years ago

EDIT: not so fond of it anymore

bstrie commented 9 years ago

@glaebhoerl, how do you expect this to be done? It seems to me that it would require the ability for patterns to appear in arbitrary positions, which strikes me as completely infeasible.

glaebhoerl commented 9 years ago

@bstrie I don't have any plans myself. There was some discussion of this elsewhere, possibly on the rust repository issues - I think the idea might've been that we could take the intersection of the pattern and expression grammars?

bstrie commented 9 years ago

Assuming that we took the easy route and made this apply only to assignments, we'd also need to take our grammar from LL(k) to LL(infinity). I also don't think that an arbitrarily restricted pattern grammar will make the language easier to read and understand. Finally, the only time when this feature would be useful is when you can't use a new let binding because of scope, in which case the current workaround is to use a temporary. I'm not currently convinced that the gain is worth the cost.
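For reference, the temporary-based workaround mentioned above can be sketched roughly as follows. This is only an illustration: `returns_tuple` here is a hypothetical stand-in for the function in the original post, and `reassign_with_temporaries` is a made-up name.

```rust
// Hypothetical stand-in for the `returns_tuple` function from the original post.
fn returns_tuple(seed: i32) -> (i32, i32) {
    (seed, seed * 2)
}

// Reassigns two existing variables from a tuple-returning call via temporaries.
fn reassign_with_temporaries() -> (i32, i32) {
    let mut a = 0;
    let mut b = 0;
    for seed in 1..4 {
        // A `let (a, b) = ...` here would only shadow `a` and `b` inside the
        // loop body, so the result is bound to temporaries and copied out.
        let (tmp_a, tmp_b) = returns_tuple(seed);
        a = tmp_a;
        b = tmp_b;
    }
    (a, b)
}
```

The cost is one extra binding per component plus one plain assignment each, which is the repetition the proposal would remove.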

DavidJFelix commented 9 years ago

:+1: I've found myself wanting this from time to time, especially in reducing repetition in match statements or normal assignment. Right now I'm using small purpose-built functions instead of this. I haven't considered if it would be possible to abuse a feature like this easily or not.

tstorch commented 9 years ago

I would be thrilled if this would be implemented! Here is a small example why:

Currently in libcore/str/mod.rs the function maximal_suffix looks like this:

fn maximal_suffix(arr: &[u8], reversed: bool) -> (uint, uint) {
    let mut left = -1; // Corresponds to i in the paper
    let mut right = 0; // Corresponds to j in the paper
    let mut offset = 1; // Corresponds to k in the paper
    let mut period = 1; // Corresponds to p in the paper

    while right + offset < arr.len() {
        let a;
        let b;
        if reversed {
            a = arr[left + offset];
            b = arr[right + offset];
        } else {
            a = arr[right + offset];
            b = arr[left + offset];
        }
        if a < b {
            // Suffix is smaller, period is entire prefix so far.
            right += offset;
            offset = 1;
            period = right - left;
        } else if a == b {
            // Advance through repetition of the current period.
            if offset == period {
                right += offset;
                offset = 1;
            } else {
                offset += 1;
            }
        } else {
            // Suffix is larger, start over from current location.
            left = right;
            right += 1;
            offset = 1;
            period = 1;
        }
    }
    (left + 1, period)
}

This could easily look like this:

fn maximal_suffix(arr: &[u8], reversed: bool) -> (uint, uint) {
    let mut left = -1; // Corresponds to i in the paper
    let mut right = 0; // Corresponds to j in the paper
    let mut offset = 1; // Corresponds to k in the paper
    let mut period = 1; // Corresponds to p in the paper

    while right + offset < arr.len() {
        let a;
        let b;
        if reversed {
            a = arr[left + offset];
            b = arr[right + offset];
        } else {
            a = arr[right + offset];
            b = arr[left + offset];
        };
        // Here is the interesting part
        (left, right, offset, period) =
            if a < b {
                // Suffix is smaller, period is entire prefix so far.
                (left, right + offset, 1, right + offset - left)
            } else if a == b {
                // Advance through repetition of the current period.
                if offset == period {
                    (left, right + offset, 1, period)
                } else {
                    (left, right, offset + 1, period)
                }
            } else {
                // Suffix is larger, start over from current location.
                (right, right + 1, 1, 1)
            };
        // End of the interesting part
    }
    (left + 1, period)
}

If we also apply what is currently possible, this would be the result:

fn maximal_suffix(arr: &[u8], reversed: bool) -> (uint, uint) {
    // Corresponds to (i, j, k, p) in the paper
    let (mut left, mut right, mut offset, mut period) = (-1, 0, 1, 1);

    while right + offset < arr.len() {
        let (a, b) =
            if reversed {
                (arr[left + offset], arr[right + offset])
            } else {
                (arr[right + offset], arr[left + offset])
            };
        (left, right, offset, period) =
            if a < b {
                // Suffix is smaller, period is entire prefix so far.
                (left, right + offset, 1, right + offset - left)
            } else if a == b {
                // Advance through repetition of the current period.
                if offset == period {
                    (left, right + offset, 1, period)
                } else {
                    (left, right, offset + 1, period)
                }
            } else {
                // Suffix is larger, start over from current location.
                (right, right + 1, 1, 1)
            };
    }
    (left + 1, period)
}

This is clearly more readable, and I guess readability of code is a major contribution to code safety and attracts more people to the language and to projects written in that language.
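As a point of comparison, here is a sketch of the same loop written so that it compiles without destructuring assignment, by binding the tuple to temporaries first. It is an adaptation rather than the libcore code: it assumes `usize` with explicit wrapping arithmetic in place of `uint` and the `-1` start value, and the temporaries `l`, `r`, `o`, `p` are names introduced here. The period in the first branch is computed from the updated `right`, as in the imperative version.

```rust
fn maximal_suffix(arr: &[u8], reversed: bool) -> (usize, usize) {
    // Corresponds to (i, j, k, p) in the paper; usize::MAX plays the role of -1.
    let (mut left, mut right, mut offset, mut period) = (usize::MAX, 0usize, 1usize, 1usize);

    while right + offset < arr.len() {
        let (a, b) = if reversed {
            (arr[left.wrapping_add(offset)], arr[right + offset])
        } else {
            (arr[right + offset], arr[left.wrapping_add(offset)])
        };
        // Compute the next state as one tuple, then copy it out by hand.
        let (l, r, o, p) = if a < b {
            // Suffix is smaller, period is entire prefix so far.
            (left, right + offset, 1, (right + offset).wrapping_sub(left))
        } else if a == b {
            // Advance through repetition of the current period.
            if offset == period {
                (left, right + offset, 1, period)
            } else {
                (left, right, offset + 1, period)
            }
        } else {
            // Suffix is larger, start over from current location.
            (right, right + 1, 1, 1)
        };
        left = l;
        right = r;
        offset = o;
        period = p;
    }
    (left.wrapping_add(1), period)
}
```

The four trailing assignments are exactly what a destructuring assignment would fold into the tuple expression.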

bombless commented 9 years ago

It doesn't feel right... If you insist, I think this looks better:

introduce a, b;
let (a, b) = returns_tuple(...);
introduce c, d;
let Point { x: c, y: d } = returns_point(...);

Still doesn't feel right, but looks more reasonable.

DavidJFelix commented 9 years ago

So already @bombless this clashes for me as introduce would then become the longest word in rust.

bombless commented 9 years ago

@DavidJFelix I don't know, I'd say -1 for this assignment idea. And maybe changing introduce to intro will make you feel better.

DavidJFelix commented 9 years ago

@bombless, a bit but not much. The point of "let" isn't to offer assignment, it's to introduce the variable. Assignment is done with an assignment operator, "=". If we use both the "=" and let for assignment, it becomes redundant. This is why you see:

let mut x: uint;
...
x = 123456789;

The point of this issue is that "let" allows us to unravel tuple-packed variables as we declare them and set their values in one assignment, rather than in multiple assignments; but later in the program, the assignment operator no longer does this unraveling, so it must be done separately for each variable.

taralx commented 9 years ago

So there are two ways to do this: with a desugaring pass (easier), or by actually extending the implementation of ExprAssign in the typechecker and translation. The former works, but I suspect it doesn't produce as nice a set of error messages when types don't match.

Thoughts?
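To make the desugaring option concrete, here is a hand-written sketch of what such a pass might expand a struct destructuring assignment into. This is an assumption about the shape of the expansion, not compiler output; `fresh_c` and `fresh_d` stand for compiler-generated names, and `returns_point` is a hypothetical stand-in with fixed values.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical stand-in for `returns_point` from the original post.
fn returns_point() -> Point {
    Point { x: 3, y: 4 }
}

fn desugared_assignment() -> (i32, i32) {
    let c;
    let d;
    // Proposed surface syntax:
    //     Point { x: c, y: d } = returns_point();
    // Possible expansion: destructure into fresh bindings in an inner scope,
    // then perform one ordinary assignment per binding.
    {
        let Point { x: fresh_c, y: fresh_d } = returns_point();
        c = fresh_c;
        d = fresh_d;
    }
    (c, d)
}
```

With this shape, a type mismatch would presumably be reported against the generated bindings rather than the left-hand side the user wrote, which is the error-message concern raised above.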

carllerche commented 9 years ago

I am :+1: for this too

sharpjs commented 9 years ago

:+1: Ran into this today. I'm surprised that it's not implemented already. A function can return a tuple. If I can bind that tuple via a destructuring let, it's perfectly reasonable also to assign that tuple to some bindings I already have.

let (mut kind, mut ch) = input.classify();
// ... later ...
(kind, ch) = another_input.classify();

yongqli commented 8 years ago

:+1: I would love to see this implemented.

Manishearth commented 8 years ago

Note that this means that in the grammar an assignment statement can take both an expression and a pattern on the lhs. I'm not too fond of that.

taralx commented 8 years ago

It's not just any expression -- only expressions that result in lvalues, which is probably unifiable with the irrefutable pattern grammar.

yongqli commented 8 years ago

In the future this could also prevent excessive mem::replaces.

For example, right now I have code like:

let (xs, ys) = f(mem::replace(&mut self.xs, vec![]), mem::replace(&mut self.ys, vec![]));
self.xs = xs;
self.ys = ys;

If the compiler understood the concept of a "multi-assignment", in the future this might be written as:

(self.xs, self.ys) = f(self.xs, self.ys);

Edit: Now, of course, we can re-write f to take &muts instead. However, the semantics are a little bit different and won't always be applicable.
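A rough sketch of the two shapes being compared, assuming a hypothetical `Buffers` struct and trivial bodies for `f` and its `&mut` counterpart `f_in_place` (none of these are from real code):

```rust
use std::mem;

struct Buffers {
    xs: Vec<i32>,
    ys: Vec<i32>,
}

// By-value version: consumes both vectors and returns their replacements.
fn f(mut xs: Vec<i32>, mut ys: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    xs.push(1);
    ys.push(2);
    (xs, ys)
}

// `&mut` version: same effect, but it can never hand back a different vector.
fn f_in_place(xs: &mut Vec<i32>, ys: &mut Vec<i32>) {
    xs.push(1);
    ys.push(2);
}

impl Buffers {
    fn update_by_value(&mut self) {
        // The fields must be moved out through `mem::replace`, because a field
        // cannot be moved out of `&mut self` directly.
        let (xs, ys) = f(
            mem::replace(&mut self.xs, vec![]),
            mem::replace(&mut self.ys, vec![]),
        );
        self.xs = xs;
        self.ys = ys;
    }

    fn update_in_place(&mut self) {
        f_in_place(&mut self.xs, &mut self.ys);
    }
}
```

The semantic difference shows up if `f` panics partway through: the by-value version leaves `self` holding the empty placeholder vectors, while the in-place version leaves whatever partial state `f_in_place` had reached.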

arthurprs commented 8 years ago

@yongqli that's very interesting, thanks for sharing

flying-sheep commented 8 years ago

does this cover AddAssign and friends? would be cool to do:

let (mut total, mut skipped) = (0, 0);
for part in parts {
    (total, skipped) += process_part(part);
}

KalitaAlexey commented 8 years ago

@flying-sheep You will be able to do this once #953 lands.

flying-sheep commented 8 years ago

it’s already accepted, so what’s the harm in including a section about it in this RFC now?

KalitaAlexey commented 8 years ago

I mean you can do

for part in parts {
    (total, skipped) += process_part(part);
}

Edit: You cannot, because (total, skipped) creates a tuple. To change previously defined variables you would write

for part in parts {
    (&mut total, &mut skipped) += process_part(part);
}

ticki commented 8 years ago

This is impossible with context-free grammars. In context-sensitive grammars, it is entirely possible. It seems that after the ? RFC was accepted, the parser will introduce a context-sensitive keyword, catch (since it is not reserved). This makes the Rust grammar partially context sensitive (i.e. conditional context scanning). But there is one problem with doing that here: an assignment can appear in almost any position (with a few exceptions), making partial context scanning very hard.

I doubt it is possible without making the parser full-blown context sensitive. I could be wrong, though.

flying-sheep commented 8 years ago

yeah, the &mut thing doesn’t work:

binary assignment operation += cannot be applied to type (&mut _, &mut _)
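In the meantime, the accumulation can be written with a destructuring let followed by per-variable compound assignments. A small sketch, where `process_part` is a hypothetical stand-in that returns (characters seen, whitespace characters skipped) and `summarize` is a made-up wrapper:

```rust
// Hypothetical stand-in for `process_part`:
// returns (characters seen, whitespace characters skipped).
fn process_part(part: &str) -> (usize, usize) {
    let skipped = part.chars().filter(|c| c.is_whitespace()).count();
    (part.chars().count(), skipped)
}

fn summarize(parts: &[&str]) -> (usize, usize) {
    let (mut total, mut skipped) = (0, 0);
    for part in parts {
        // Destructure into fresh bindings, then apply `+=` to each variable.
        let (t, s) = process_part(part);
        total += t;
        skipped += s;
    }
    (total, skipped)
}
```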

ldpl commented 8 years ago

How about adding or reusing a keyword to avoid context-sensitive grammar? For example, "mut" seems to fit well (also reflects let syntax):

let a; let b;
mut (a, b) = returns_tuple(...);
let c;
mut Point {x: c, .. } = returns_point(...);
let Point {y: d, .. } = returns_point(...);

KalitaAlexey commented 8 years ago

I don't like it.

I like

let (mut a, mut b) = get_tuple();

let SomeStruct(mut value) = get_some_struct();

let Point {x: mut x, .. } = get_point();

I don't like

let mut a;
let mut b;
(a, b) = get_tuple();

I don't like

let my_point: Point;
(my_point.x, my_point.y) = returns_tuple(...);

I'd like to write

let (x, y) = returns_tuple(...);
let my_point = Point {x: x, y: y};

I just think that code must be easy to read.

ticki commented 8 years ago

@KalitaAlexey, you can already destructure with let.

KalitaAlexey commented 8 years ago

@Ticki Can I do it like this?

let SomeStruct(mut value) = get_some_struct();

let Point {x: mut x, .. } = get_point();

flying-sheep commented 8 years ago

sure. this RFC is about assignment without binding.

KalitaAlexey commented 8 years ago

@flying-sheep I don't truly understand.

ticki commented 8 years ago

@KalitaAlexey You can declare variables in a destructuring manner, but you cannot assign variables in a destructuring manner.

KalitaAlexey commented 8 years ago

@Ticki thanks. Yeah I like that.

BlacklightShining commented 8 years ago

@KalitaAlexey …why? How is (bar, baz) = foo(); less readable than let (bar, baz) = foo();? (Or, really, any different besides the former not being a declaration?)

ticki commented 8 years ago

@BlacklightShining, declaration and assignment are very different. But the main argument here is that the grammar of Rust is LL(k), which you cannot preserve with this change.

flying-sheep commented 8 years ago

yep.

Point {
    foo: bar,
    baz: ex,
    ...
}.do_thing();

is, until the last line, indistinguishable from

Point {
    foo: bar,
    baz: ex,
    ...
} = return_thing();

and means something very different. the former grabs variables from the scope and creates a struct from them, on which it then calls a function. the latter calls a function and then assigns parts of its return value to variables from the scope.

Kimundi commented 8 years ago

Not sure if this argument has been made elsewhere already, but this could probably still be made LL(k):

Still parse the LHS of EXPR = EXPR as an expression, keeping the LL(k) property.
Instead of bailing out early with an error: invalid left-hand side expression, keep the Expr AST/HIR/MIR/etc node around (see http://is.gd/MoTsk5 for examples where this would error today)
For each such node, at the earliest point in the compiler passes where it becomes possible:
- check that the EXPR is of a valid "destructuring assignment" form and otherwise emit a useful error message like the current "invalid left-hand side" one.
- check that the bindings/variables mentioned in it have the right types to assign to
- translate it to code that destructures and assigns each value regularly.

This would mean not using any of the actual pattern matching parser/compiler parts, at least not inherently, but that seems fine since it wouldn't really need most of it, since it would be restricted as if it were an irrefutable pattern with only by_value bindings. And if needed it could still masquerade as pattern matching through error messages and docs.
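The steps above can be sketched on a toy AST. Everything here is hypothetical and far simpler than the real compiler: `Expr`, `Assign`, `lower_lhs` and `walk` are invented for illustration, only variables and tuples are accepted as left-hand sides, and type checking is left out.

```rust
// Toy expression AST: the LHS is parsed as an ordinary expression.
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Lit(i64),
    Tuple(Vec<Expr>),
    Call(String),
}

// One lowered plain assignment: `target = <rhs>.path[0].path[1]...`
#[derive(Debug, PartialEq)]
struct Assign {
    target: String,
    path: Vec<usize>,
}

// Checks that `lhs` has a valid destructuring-assignment form and lowers it
// to a list of simple assignments, or reports the offending sub-expression.
fn lower_lhs(lhs: &Expr) -> Result<Vec<Assign>, String> {
    let mut out = Vec::new();
    walk(lhs, &mut Vec::new(), &mut out)?;
    Ok(out)
}

fn walk(e: &Expr, path: &mut Vec<usize>, out: &mut Vec<Assign>) -> Result<(), String> {
    match e {
        // A variable is a leaf: assign the projected component to it.
        Expr::Var(name) => {
            out.push(Assign { target: name.clone(), path: path.clone() });
            Ok(())
        }
        // A tuple expression is reinterpreted as a tuple pattern.
        Expr::Tuple(items) => {
            for (i, item) in items.iter().enumerate() {
                path.push(i);
                walk(item, path, out)?;
                path.pop();
            }
            Ok(())
        }
        // Anything else (literals, calls, ...) is not assignable.
        other => Err(format!("invalid left-hand side expression: {:?}", other)),
    }
}
```

For a left-hand side like `(a, (b, c))` this yields three assignments with paths `[0]`, `[1, 0]` and `[1, 1]`, while `(1, a)` or `f()` is rejected with the usual "invalid left-hand side" style of message. The parser never needs to know it is looking at a pattern.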

taralx commented 8 years ago

@Kimundi That was the approach I was working on at one point, but I decided to hold off until the MIR work was done because it simplifies things greatly in this space.

bombless commented 8 years ago

Maybe we can add a syntax match PATTERN = EXPR; And I think in this way we can finally explain why we use let PATTERN = EXPR; instead of let PATTERN = EXPR in ... (that is, to match match PATTERN = EXPR; syntax)

flying-sheep commented 8 years ago

perfect. that’s also very easy to parse: "match" <expr> = vs "match" <expr> {

ticki commented 8 years ago

Honestly, I find that syntax confusing. You don't "match" the pattern. It feels like abusing match.

flying-sheep commented 8 years ago

true. match isn’t the only destructuring we have. if let also does it. that’s why i liked “match”: it matches the variant and destructures. now i’m not so sure anymore.

ticki commented 8 years ago

Yeah, but if indicates that a block follows.

flying-sheep commented 8 years ago

the other destructurings are

if let <destructuring> = <expr> <block>
match <expr> { [ <destructuring> => <expr> ],* }
let <destructuring> = <expr>

but to the binding let name = expr, the assignment name = expr is analogous

so if possible the most logical thing would be to have no keyword.

bombless commented 8 years ago

or ref let PAT = EXPR; which means you reference bindings from somewhere else

bombless commented 8 years ago

or @PAT = EXPR; since we already use @ to start a sub-pattern

bombless commented 8 years ago

just to clarify, for-in, while-let and function parameter positions also do destructuring

ticki commented 8 years ago

> or ref let PAT = EXPR; which means you reference bindings from somewhere else

That seems very illogical. It has nothing to do with references.

> or @PAT = EXPR; since we already use @ to start a sub-pattern

This seems very noisy.

Why not just use @Kimundi's suggestion?

bombless commented 8 years ago

Kimundi's suggestion sounds a lot like the old-school way, which requires the compiler to analyze the types it has parsed before generating the AST. I understand it's very different from that, since we don't actually mix types and values here, but it still has a bad smell.

rust-lang / rfcs

Destructuring assignment #372