Closed glaebhoerl closed 4 years ago
and I doubt if it works when type projection involves
I'm thinking about a totally different syntax for this.
let x;
match (1, 3) {
(!x, y) => () // you can use `y` here, as well as `x`, which is from the outer scope
}
// at this site `y` is out of scope but `x` is available here
This way we can mark some of the bindings in a pattern as mixed bindings (that is, the bound variable actually comes from an ancestor scope).
@bombless: The technique I described would very much not mean analysing types before the AST; on the contrary, it would just parse the same way as today, with the only difference being that it interprets what it parsed differently in some cases where you always get an error message today.
If I understand correctly, it bails out to type-check the lvalue-like expression as a pattern, as if we treated a pattern as a "primary type" when we do type inference?
What I wanted to say is that a pattern is not like an lvalue at all, so it may make the AST ugly or inconvenient.
Anyway, I guess when MIR lands the design of the AST becomes less important.
Also, do you intend to make the patterns in destructuring assignment less powerful, just as patterns in `if let` and `while let` are second-class compared to `match`, @Kimundi?
Otherwise I think we have to treat all expressions as potential patterns in your solution when we parse.
No, @bombless, it requires no such thing.
@bombless: I'll repeat it one more time before giving up this particular thread of discussion. ;)
let ref mut x;
(x,) = (1,);
*x = 2;
Now I get it, so the pattern is actually in the `let`?
fn main() {
let x = ref 1;
}
<anon>:2:13: 2:16 error: expected identifier, found keyword `ref`
<anon>:2 let x = ref 1;
^~~
<anon>:2:17: 2:18 error: expected one of `!`, `.`, `::`, `;`, `{`, or an operator, found `1`
<anon>:2 let x = ref 1;
^
playpen: application terminated with error code 101
I just don't see why a pattern is already parsed as an expression.
What I described above involves not parsing a destructuring assignment as a pattern, but as an expression - like it is today already.
Can you help me figure it out, @ticki ?
@ticki @bombless is correct -- not every destructuring assignment uses a valid expression:
(ref l[0], _) = a
It will be necessary to unify the pattern and expression grammars.
@taralx `ref` is a binding modifier which binds a reference instead of the value; do you think it's actually useful in the LHS of an assignment?
Your example would mean `l[0] = &a.0;`, AFAICT.
I'm more worried about slice patterns TBH.
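For illustration, here is a minimal sketch of what that reading of `(ref l[0], _) = a` amounts to when written by hand in current Rust. The function name `ref_on_lhs` and the sample values are assumptions made up for this example; nothing here is proposed syntax.

```rust
// Hand-written equivalent of the hypothetical `(ref l[0], _) = a`.
// `ref_on_lhs` and the concrete values are illustrative only.
fn ref_on_lhs() -> i32 {
    let a = (10, 20);
    let zero = 0;
    // An array of references, so that `l[0]` can hold a borrow of `a.0`.
    let mut l = [&zero];
    // The first component is stored by reference; the second is ignored.
    l[0] = &a.0;
    *l[0]
}

fn main() {
    println!("{}", ref_on_lhs());
}
```

Note that the place `l[0]` has to have a reference type for this to type-check, which is part of why `ref` on an assignment LHS is a niche case.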
@eddyb Maybe not useful in most cases, but you never know. People will expect it to work.
@bombless' suggestion to just reuse normal `let` expressions but mark variables to refer to preexisting `mut` variables seems like a smaller change with fewer complications that's nonetheless more powerful.
For example, it means we don't have to deal with unifying expressions with patterns, it means things like `match` will work "for free" and it means we can do partial assignment.
My syntax suggestion would be to reuse `@` syntax with a different symbol (`=` is promising), so one could write
let (x = _, y = _) = { ... };
// same as the existing syntax
let (x @ _, y @ _) = { ... };
// but assigns to preexisting variables
It would have to be banned at top-level in patterns in `if let` and `while let`, but I can't imagine anyone using them there on purpose. It can do some nice things like
let Tree { left, right, head: minimum = _ } = ...;
match dispatch() {
last_op = Read => ...,
last_op = Write => ...,
Skip => continue,
}
The main problem is having to write `= _` more frequently than otherwise, but IMO that's actually a benefit when I tried it. @tstorch's example
let (left = _, right = _, offset = _, period = _) =
if a < b {
// Suffix is smaller, period is entire prefix so far.
(left, right + offset, 1, right - left)
} else if a == b {
// Advance through repetition of the current period.
if offset == period {
(left, right + offset, 1, period)
} else {
(left, right, offset + 1, period)
}
} else {
// Suffix is larger, start over from current location.
(right, right + 1, 1, 1)
};
still looks OK to me, and it's quite clear that there are four assignments happening, which can sometimes be less apparent with the `a, b, c, d =` syntax in Python (at least in my experience).
Thoughts?
I would really, really avoid reusing `let`. Declaration and assignment are two entirely different things. Reusing `let` would be a breaking change.
Why would it be a breaking change?
Because of scoping:
let mut a = 2;
{
let a = 3;
}
// a would be 2 using the current rustc.
// but by your proposal it would be 3.
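The current behaviour this objection relies on can be checked directly. Below is a minimal sketch (the function name `shadowing_demo` is made up for the example) showing that an inner `let` declares a fresh variable and leaves the outer one alone.

```rust
// `shadowing_demo` is an illustrative name, not something from the thread.
fn shadowing_demo() -> i32 {
    let a = 2;
    {
        // Declares a new, inner `a`; the outer binding is untouched.
        let a = 3;
        assert_eq!(a, 3);
    }
    // Back in the outer scope, the original `a` is visible again.
    a
}

fn main() {
    println!("{}", shadowing_demo());
}
```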
@Veedrac Simpler example to see if I have this syntax right.
let (foo, bar);
// Stuff happens here.
let (foo = _, bar = _) = get_some_two_tuple();
This seems pretty confusing to me. First, there's the `let`, even though it's assignment, not declaration. Or…does that also let you declare new variables (`let (foo = _, bar = _, quux) = get_some_three_tuple();`?) Second, the `= _`. I don't understand the motivation for this, and it looks like an assignment of `_` (which makes no sense). `@` might be a better choice, but then the `_` just seems unnecessary; there's no ambiguity as to what is being assigned.
Even so, I would prefer `(foo, bar) = get_some_two_tuple()`. I don't see anything wrong with that syntax, and Python and Swift already use it.
`let a = 2` will still bind, not assign. You'd have to do `let (a = _) = 2`, but that would be equivalent to `let (a @ _) = 2` which doesn't compile.
`let a @ _ = 2` does compile, but I explicitly banned top-level assignments in `let`, `if let` and `while let` (e.g. `let a = _ = 2`) because they're stupid.
With your example,
let (foo, bar);
// Stuff happens here.
let (foo = _, bar = _) = get_some_two_tuple();
this is just like
let (foo @ _, bar @ _) = get_some_two_tuple();
except that `foo` and `bar` are assigned to the preexisting variables. Ergo these two are both valid:
let (foo = _, bar = _, quux) = get_some_three_tuple();
let (foo @ _, bar @ _, quux) = get_some_three_tuple();
because the latter is valid. Conceptually, you take a `let (_, _)` pattern and you just add assignments inside: `let (foo = _, bar = _)`. `@` is already used, so using that would be a breaking change.
The reason I prefer this to `(foo, bar) = ...` is mostly because it avoids introducing a new pattern syntax which is nontrivially different to the current one. My proposal requires only a very minor addition to the grammar (mostly just `'@'` → `'@' | '='`, though you also need to exclude top-level bindings where it's problematic). You also don't have LL(infinity) problems.
But I also like the ability to both bind and assign, which looking back I think would be quite useful, and for it to work with refutable patterns. It'd be a pain if you couldn't use `if let` when you upgrade to an `Option` for whatever reason.
I think it's also worth noting that there's already precedent for using `let` without binding. Consider things like
while let None = next() { ... }
IMO `let` in Rust means "pattern match".
IMO let in Rust means "pattern match".
Not at all. Both `if let` and `while let` bind variables; they don't assign to them.
@Veedrac
Conceptually, you take a `let (_, _)` pattern and you just add assignments inside
This still doesn't make sense to me, particularly because nothing is actually being assigned there (`_` cannot be assigned!), and even if something was, it would immediately be clobbered by the overall assignment (and thus be optimized out). I don't see the value of `let (foo = _, bar = _)` over `let (foo, bar)`.
Also, I thought that the proposition wasn't to add a new pattern syntax, but rather to extend certain expressions only to be usable as lvalues. I think someone already pointed out that refutable patterns are useless here: what would `Some(foo) = bar` do? It works in `if` and `while` because branching happens based on whether it's refuted, but assignments aren't conditional like that.
@BlacklightShining
I think I may have misunderstood you before. To clarify, `let (foo = _, bar = _)` does not ever declare new variables. You wouldn't do `let (foo = _, bar = _);` on its own any more than you'd do `let (foo @ _, bar @ _);` on its own (although that is actually valid). You'd only do this when actually assigning, like `let (foo = _, bar = _) = (x, y);`.
If there isn't an assignment (so the `_` is actually unbound), either it can just be outright illegal (`error: variable not bound`) or it would "reset" the variable to the "uninitialized" state (aka. the same state it would be in if it was a fresh variable bound with `@`). The precise behaviour here is unimportant because nobody would ever actually do that and either suggestion is coherent.
If `foo` is not declared, you'd get the same `unresolved` name you currently do when you write `foo = bar;` without declaring it.
Refutable patterns are "useless" when one only has basic bindings, but my suggestion is an alternative to extending expressions. Extending expressions has been shown to be problematic for the grammar, and it introduces a sort-of-pattern pseudo-syntax. I'm suggesting that instead we extend patterns, which automatically means it works with `if let`, `while let` and `match`es and doesn't have these complications.
Fundamentally, I do understand the objection to writing `= _`. The expression form does look a little cleaner. But it's a little cleaner, not a lot, and IMO the other points I've made far outweigh that.
`(foo, bar) = (x, y);` is way way way more obvious than `let (foo = _, bar = _) = (x, y);`. If I saw the latter without foreknowledge, there's a good chance I'd have to ask on SO or IRC to figure out what the heck it means.
The other syntax proposed is very far away from the motivation, namely simple, obvious, and unverbose destructuring assignment.
@Veedrac `if let` and `while let` are declarations, though. They create a new scope. `let foo = ...; if let (foo, bar) = get_some_two_tuple() { ... }` already works in stable Rust, and will continue to work if this lands. Ditto for match arms. This issue is about assignment, without creating a new scope.
I think I get this now. Despite being written as `let (foo = _, ...) = ...;` rather than `let (foo @ _, ...) = ...;`, the `_` is meant not as an rvalue, but rather as an indication that whatever is being assigned to `foo` should come from the right side of the overall statement, right? Kind of like `match`?
@glaebhoerl
I'm tempted to say that Rust already acts this way; `@` in patterns is certainly not obvious on first sight for instance. And I think the trivial examples make it seem less clear than it is. Consider some very fake code:
let mut state = State::Begin;
while let Some(bytes, state = _) = self.receive(state) {
... // use bytes
}
match state {
...
}
I find this somewhat self-explanatory. But I do appreciate that `=`, like `@`, is less discoverable and sometimes that matters most.
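For comparison, here is a rough sketch of how that loop has to be spelled in current Rust, with a temporary binding that is copied into the outer variable by hand. Everything in it (`State`, `receive`, `run`, and the byte values) is hypothetical scaffolding invented for this example.

```rust
// Hypothetical stand-ins for the "very fake code" in the comment above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Begin,
    Middle,
    End,
}

// Returns the next payload and the follow-up state, or `None` when finished.
fn receive(state: State) -> Option<(u8, State)> {
    match state {
        State::Begin => Some((1, State::Middle)),
        State::Middle => Some((2, State::End)),
        State::End => None,
    }
}

fn run() -> (Vec<u8>, State) {
    let mut state = State::Begin;
    let mut seen = Vec::new();
    // `next` is a fresh binding; the outer `state` is updated by hand.
    while let Some((bytes, next)) = receive(state) {
        state = next;
        seen.push(bytes);
    }
    // `state` is still usable here, after the loop.
    (seen, state)
}

fn main() {
    let (seen, state) = run();
    println!("{:?} {:?}", seen, state);
}
```

The extra `next` binding and the `state = next;` line are exactly what the proposed `state = _` sub-pattern would fold into the pattern itself.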
@BlacklightShining
Obviously I'm worse at explaining things than I thought. Perhaps this simple translation procedure should help. Basically, replace something like
let (foo = _, bar = (_, _)) = ...;
with
let (temp1 @ _, temp2 @ (_, _)) = ...;
foo = temp1;
bar = temp2;
As such, `_` is indeed not an rvalue but part of the pattern, just like with `@`. I'm not suggesting changing what `if let (foo, bar) = get_some_two_tuple()` does.
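The translated form is already valid Rust, so the procedure can be tried out today. The following is a minimal sketch; `source`, `translated`, and the sample values are names and data assumed purely for illustration.

```rust
// `source` stands in for the elided right-hand side (`...`) in the example.
fn source() -> (i32, (i32, i32)) {
    (1, (2, 3))
}

// The initial values of `foo` and `bar` are overwritten without being read,
// which is the point of the example, so silence that lint.
#[allow(unused_assignments)]
fn translated() -> (i32, (i32, i32)) {
    // Preexisting variables that the proposed `foo = _` / `bar = (_, _)`
    // sub-patterns would assign to.
    let mut foo = 0;
    let mut bar = (0, 0);

    // Step 1: bind temporaries with ordinary `@` patterns.
    let (temp1 @ _, temp2 @ (_, _)) = source();
    // Step 2: assign the temporaries to the existing variables.
    foo = temp1;
    bar = temp2;

    (foo, bar)
}

fn main() {
    println!("{:?}", translated());
}
```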
Some more examples:
if let Ok(thing = Some(_)) = get() {
use(thing);
}
changes to
if let Ok(temp @ Some(_)) = get() {
thing = temp;
use(thing);
}
And
match get() {
Some(arg = 1) => { use(arg); },
Some(arg = 2) => { use(arg); },
Some(arg = _) => { use(arg); },
None => { continue; },
}
changes to
match get() {
Some(temp @ 1) => { arg = temp; use(arg); },
Some(temp @ 2) => { arg = temp; use(arg); },
Some(temp @ _) => { arg = temp; use(arg); },
None => { continue; },
}
I don't think this extended pattern syntax is any better. You still have to merge the expression and pattern syntaxes or limit the assignable lvalues. So it's an unfamiliar syntax instead of a familiar one.
@taralx Now that you mention it, I think so too. I forgot just how much can go on the LHS of an assignment. Nvm then, I retract my claim.
Again, `let` is for declaring, not assigning. Reusing it for assigning will lead to confusion.
Coming from Python, it was surprising to me that this isn't possible. However, my use case was to avoid a tmp variable, and I solved it with tuples instead:
let mut fib = (1,2);
while cond {
fib = (fib.1, fib.0+fib.1);
}
So it's a two part problem:
@Kimundi's implementation seems best if @taralx's concern is solved. I think it is worth it to abandon `ref` on the LHS for this to work, but discussion welcome.
If a keyword is needed, I like @ldpl's `mut` best. A bit tricky as "mutable" becomes "mutate", but it's still easy to understand.
If we insist an existing keyword is reused, `match` seems the closest to me. Involving `ref` in here risks confusion.
(I don't think `ref` on an assignment LHS even makes sense.)
`match` definitely seems to be a good keyword, making the destructuring more implicit, and also shorthanding another current possible workaround: `match f(a, b) { (x, y) => { a = x; b = y; } }`. IMO this is cleaner than `let (x, y) = f(a, b); a = x; b = y;` which contaminates the scope more.
An extra keyword is only needed if we want to treat the left-hand-side as a pattern instead of an expression. But I don't think there's a good reason to do that.
In fact, the last example in the original post doesn't even use a valid pattern on the left-hand-side:
let my_point: Point;
(my_point.x, my_point.y) = returns_tuple(...);
All that's really necessary to make tuple/struct unpacking work are two rules for syntax desugaring (without any parser changes at all!):
If the left-hand-side of an assignment is a tuple expression:
($expr1, $expr2) = $rhs;
then desugar the assignment to a block:
{
let (tmp1, tmp2) = $rhs;
$expr1 = tmp1;
$expr2 = tmp2;
}
If the left-hand-side of an assignment is a struct literal:
Struct { field1: $expr1, field2: $expr2 } = $rhs;
then desugar the assignment to a block:
{
let Struct { field1: tmp1, field2: tmp2 } = $rhs;
$expr1 = tmp1;
$expr2 = tmp2;
}
I believe this is the approach suggested by @Kimundi; but I thought it's worth mentioning that this is basically just simple syntax sugar that can be desugared prior to type analysis. In fact this seems like it might almost be doable with `macro_rules!`.
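As a rough check of that last remark, the two-element tuple rule can be sketched with `macro_rules!` roughly as follows. The macro name `assign_tuple!` and the helpers `Point`, `returns_tuple`, `fill_point` and `swap_step` are assumptions made up for this example; it covers only pairs, not the struct-literal rule or nesting.

```rust
// Hypothetical macro implementing the tuple rule for exactly two elements.
macro_rules! assign_tuple {
    (($lhs1:expr, $lhs2:expr) = $rhs:expr) => {{
        // Evaluate the right-hand side once into hygienic temporaries...
        let (tmp1, tmp2) = $rhs;
        // ...then assign each temporary to the matching place expression.
        $lhs1 = tmp1;
        $lhs2 = tmp2;
    }};
}

struct Point {
    x: i32,
    y: i32,
}

fn returns_tuple() -> (i32, i32) {
    (3, 4)
}

// The left-hand side is made of field accesses, which are not valid patterns.
fn fill_point() -> (i32, i32) {
    let mut my_point = Point { x: 0, y: 0 };
    assign_tuple!((my_point.x, my_point.y) = returns_tuple());
    (my_point.x, my_point.y)
}

// The right-hand side may read the variables being assigned.
fn swap_step(mut a: u32, mut b: u32) -> (u32, u32) {
    assign_tuple!((a, b) = (b, a + b));
    (a, b)
}

fn main() {
    println!("{:?} {:?}", fill_point(), swap_step(1, 1));
}
```

Because the whole right-hand side is evaluated into temporaries before any assignment happens, `(a, b) = (b, a + b)` sees the old values of both variables, matching the block desugaring described above.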
So there's really no reason to introduce new syntax unless you want to use different patterns in destructuring assignments than just tuples/struct literals. If so, which patterns do you want to use?
The only really useful pattern I can think of is `_`. This could be done by extending the expression grammar with a `_` expression, and rejecting any remaining use of `_` after desugaring.
In general, we could probably support most patterns by extending the expression grammar in this way; but I doubt many types of patterns are worth it. I imagine tuples alone cover the vast majority of the use cases of this feature.
I'd agree tuples and newtype structs cover almost all use cases, but you'll want fixed length arrays along with regular and tuple structs too, and nested types work fine. I think `ref` gets replaced by `*` which sounds fine. And no need for `@` bindings, guards, etc. No enum variants. I dunno the Rust grammar but this sort of lvalue expression looks context free at first blush.
What about arrays more generally? https://github.com/rust-lang/rfcs/pull/495
What about using something like `tie` from C++?
let (a, b) = (3, 4);
...
tie (a, b) = (5, 6);
that is just as easy to write and needs no complex extension of the parser.
@torpak I thought the parser problem is a solved one: don't try to have full pattern syntax on the LHS, handle only the intersection with expressions.
Since `tie` is not already a keyword, what you wrote parses. In fact...
tie(a, b).x = (5, 6);
could even pass all checks and run.
The issue here from what I can tell is people don't seem to like the "intersection of expressions and patterns" approach even when it could work well for most cases.
Is there a properly formatted RFC for this feature yet?
I am not aware of any.
So it's only because there is no RFC for this yet, and no committers have reviewed it? Can we call a few of them by mentioning
Coming from the Python world, it is very natural to calculate Fibonacci numbers by tuple assignment:
In [1]: a, b = 1, 1
In [2]: for i in range(10):
...: a, b = b, a+b
...:
In [3]: a, b
Out[3]: (89, 144)
JavaScript ES7 does not have tuples, but it has object construction and destructuring:
> let a = 1, b = 1;
undefined
> for (let i in Array.from({length: 10})) {
... ({ a, b } = { a: b, b: a+b });
... }
{ a: 89, b: 144 }
and JavaScript array destructuring assignment, which has been valid since ES6 (ES2015):
> for (let i = 0; i < 10; i++) [ a, b ] = [ b, a+b ];
[ 89, 144 ]
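For comparison, a minimal sketch of the same loop in Rust using a temporary tuple binding followed by plain assignments. The function name `fib_pair` is an assumption for this example.

```rust
// Mirrors the Python session: start from (1, 1) and apply
// `a, b = b, a + b` the given number of times.
fn fib_pair(steps: u32) -> (u64, u64) {
    let (mut a, mut b) = (1u64, 1u64);
    for _ in 0..steps {
        // Destructure into fresh temporaries, then assign back by hand.
        let (next_a, next_b) = (b, a + b);
        a = next_a;
        b = next_b;
    }
    (a, b)
}

fn main() {
    // Ten steps reproduce the values shown in the Python output.
    println!("{:?}", fib_pair(10));
}
```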
I dislike that this encourages unneeded mutability.
Instead of new assignment syntax for mutable variables, what about some assign/mutate/unlet syntax that locally altered the behavior of patterns from declaration to mutating assignment?
let mut field2;
let x = foo_mut();
loop {
...
let Struct {
field0, mut field1, // ordinary let declarations using destructuring with field puns
field3: assign!(*x), // assignment to mutate the referent of x
assign!(field2), // assignment to existing mutable variable field2 ala field puns
} = rhs();
...
}
It'd be obnoxious to write `(assign!(a), assign!(b), assign!(c)) = rhs();` of course, but you should never do that anyways since invariably some elements should always be new immutable declarations instead of assignments to mutable variables. Also, this approach might work inside `match` or `while`/`if let`, and in more complex destructurings.
I picked a macro syntax here because it's only sugar declaring another binding and assigning to the mutable variable. Also, the `()` help delineate a full lhs term when dereferencing. I think `unlet!()`, `mutate!()`, or even `mut!()` could all make reasonable names as well. Alternatively new names like `assign`, `unlet`, `mutate` could work as keywords too, ala `(a, unlet b, c) = rhs();`. I could imagine sigil based syntaxes or even using `=`, ala `(a, (b=_), c) = rhs();`, although field puns might be problematic.
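To make the "only sugar" point concrete, here is a sketch of what such a mixed declare-and-assign destructuring would expand to in current Rust: an ordinary `let` with temporaries for the assigned fields, followed by plain assignments. The names `Struct`, `rhs`, `slot`, `mixed` and the field values are hypothetical.

```rust
// Hypothetical struct and producer function, for illustration only.
struct Struct {
    field0: u32,
    field1: u32,
    field2: u32,
    field3: u32,
}

fn rhs() -> Struct {
    Struct { field0: 1, field1: 2, field2: 3, field3: 4 }
}

// The initial value of `field2` is overwritten without being read,
// which is intentional here.
#[allow(unused_assignments)]
fn mixed() -> (u32, u32, u32, u32) {
    let mut field2 = 0; // existing mutable variable
    let mut slot = 0;
    let x = &mut slot; // existing mutable reference

    // New declarations for `field0` and `field1`; temporaries for the rest.
    let Struct { field0, mut field1, field2: tmp2, field3: tmp3 } = rhs();
    field2 = tmp2; // what an `assign!(field2)` pun would do
    *x = tmp3; // what `field3: assign!(*x)` would do

    field1 += 10; // `mut field1` is an ordinary mutable binding
    (field0, field1, field2, slot)
}

fn main() {
    println!("{:?}", mixed());
}
```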
As a Rust newbie I'd just like to add my +1 that it's definitely confusing that you can do
let (a, b) = fn_that_returns_tuple();
but not
(a, b) = fn_that_returns_tuple();
I understand there are issues with implementation, but @dgrunwald seems to be saying there's a simple option that would work in the majority of cases that people actually care about - I think that would be worth doing!
@burdges In pure languages this is unnecessary at least partly because loops like @tstorch's can (or must) be easily turned into recursive helper functions. But in Rust, with tail call optimization not guaranteed, as well as its goal of attracting audience from dynamic languages, I think destructuring assignment is a reasonable compromise.
Would it be possible to allow tuples/structs as actual lvalues, so that we don't need to differentiate between pattern/expr at all?
@phaux I would hope so (syntactically), preferably in a way that `(*x, y.field) = (a, b)` works.
Hm, now that's an idea, though I'm not sure how backwards compatible it would be:
let a = 5;
let b = "hello".to_string();
let (ref x, ref y) = (a, b); // this would move `a` and `b` today, but just reference them with constructors becoming lvalues.
That said, I think it wouldn't work anyway, as you would need a single canonical address for a constructor lvalue - which means constructing it from other lvalues would not really work (unless we special case it in the language, such that you get only an error if you attempt to take the address of a constructor lvalue, or magically let it behave as an rvalue in that case)
@Kimundi I don't think it's plausible to have something like that, no. I interpreted @phaux's comment to refer to the alternative I like, which is to keep parsing `expr = expr` but interpret the LHS differently than the RHS, without involving patterns or creating some sort of "value" for the LHS.
@eddyb Exactly.
Ah, then I misunderstood. So basically one of the things which have been proposed already.
If `Foo` is `Sized` and big, then the API will handle `fn foo() -> Foo` by creating an uninitialized `Foo` and passing `foo` the pointer to it, yes? So `Foo { whatever } = foo();` requires creating a temporary, right? You presumably meant that semantically, but not sure I understand that either.
I'm still nervous about encouraging mutability, but could the syntactic issue be handled by "commuting" the `let` inside the "pattern"? So
let a; // uninitialized here
let mut b = bar(); // mutable
let c = baz() : &mut u64; // mutable reference
virtual Foo { a, b, *c, let d, let mut e } = foo(); // named field puns
// only a and d are immutable
You could replace `virtual` with another keyword, or remove it entirely, except people worried about the grammar complexity up thread. You could not however do `let r = virtual Foo { a, b, *c, let d, let mut e }; r = foo();` because `r` would be only partially initialized, which might be @eddyb's point.
Given
it would be nice to be able to do things like
and not just in `let`s, as we currently allow. Perhaps even:
(Most use cases would likely involve `mut` variables; but those examples would be longer.)
Related issues from the `rust` repo: https://github.com/rust-lang/rust/issues/10174 https://github.com/rust-lang/rust/issues/12138