Open leafpetersen opened 2 years ago
The failure to pronounce the word "copy" clearly is what all current proposals have in common.
I think this is a step in the wrong direction. `var` and `final` are existing, well understood markers that "here a new variable is being introduced". #1201 correctly uses these markers, which is helpful, but I'm concerned about its re-use of the expression syntax to name the bound variable. Hence this issue.
I agree my proposed syntax is pretty magical. This is definitely an interesting approach for us to play with. Right now, I find it more confusing, but that could be familiarity. Here's what tripped me up:
if (var x from points[0] is int) x.isEven;
Here, there's nothing to clearly indicate that `.x` is being called on the result of `points[0]`. It took a few reads of your proposal before I realized that was going on. At first, I just thought the `from` was essentially like `=` and `x` was initialized with `points[0]`, not `points[0].x`.
At least with:
if (var points[0].x is int) print(x.isEven);
A confused reader can probably correctly guess that `points[0].x` is being evaluated since that exact syntax is right there and does what they expect. Then they are probably stumped as to what the name of the resulting variable is since there's no clear answer. They might guess that it's `x`, but if not, they'll at least try to figure out what's going on and learn the new syntax. It would be hard for them to guess wrong.
With:
if (var x from points[0] is int) x.isEven;
I think it might be too easy for users to mistakenly guess that it means something like:
if (points[0] is int; var x = points[0]) x.isEven;
> They might guess that it's `x`, but if not, they'll at least try to figure out what's going on and learn the new syntax. It would be hard for them to guess wrong
I don't understand what you're trying to say here, can you elaborate? I don't understand why it's hard to "guess wrong" with the old syntax. An easy way to "guess wrong" is just to not notice the `var`, or to just shrug and ignore it?
> I think it might be too easy for users to mistakenly guess that it means something like:
> if (points[0] is int; var x = points[0]) x.isEven;
Maybe. Perhaps a different syntax than `from` might help? But I really do question your hypothesis here. With `var field from obj` it is 100% clear to the user that:
- A variable is being introduced
- What the name of the variable is
I don't believe either of those statements is true about the original proposal, so I question the statement that it's hard for the user to guess wrong.
In any case, I think we may want some user data on this. I continue to find code like `if (var obj is int) obj.isEven` and `if (var points[0].x.y.obj != null) obj.isEven` pretty inexplicable, and would like some empirical evidence that I'm the only one who is confused by it.
IMHO any prefix kind of if-variables will be kinda magical.
My first thoughts when reading these lines alone:
if (var points[0].x is int) print(x.isEven);
What is `x`? Where is it declared? Is it a global variable?
if (var obj from this is int) obj + 1;
Is `obj` an `int` that got promoted from the resolution of `this is int`?
Please, consider a suffix style.
C# has it:
if (foo.bar is int daz) {
print(daz + 1);
}
I think we can have a better version of it:
if (foo.bar is int var daz) {
print(daz + 1);
}
This line: `if (foo.bar is int var daz) {` can be easily read as "if `bar` is int, declare `daz`".
With this style we could have `var` and `final` types.
We could also omit the name of the variable:
if (foo.bar is int var) {
print(bar + 1);
}
This is nice but then it smells magic.
For that reason I think we could have a new keyword `use`.
`use` alone creates the property as a local final variable:
if (foo.bar is int use) {
print(bar + 1);
}
`use` with an argument adds a name:
if (foo.bar is int use daz) {
print(daz + 1);
}
> I think we can have a better version of it:
> if (foo.bar is int var daz) { print(daz + 1); }
Serious question: why is that better than this?
var daz = foo.bar;
if (daz is int) {
print(daz + 1);
}
This version has two extra characters, and is something you can do today, so why add a whole new syntax for it?
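For reference, the "something you can do today" version can be written as a small complete program. This is only a sketch: the class `Foo`, the type of its `bar` field, and the helper `incrementIfInt` are assumptions made for illustration and are not defined anywhere in the thread.

```dart
// Hypothetical class: a public, non-final field like `bar` is not
// promoted by an `is` test when accessed as `foo.bar`.
class Foo {
  Object? bar;
  Foo(this.bar);
}

/// Returns `bar + 1` when `foo.bar` holds an int, otherwise null.
int? incrementIfInt(Foo foo) {
  // Copy the field into a local variable first; the local can be promoted.
  var daz = foo.bar;
  if (daz is int) {
    return daz + 1; // `daz` is promoted to int in this branch
  }
  return null;
}

void main() {
  print(incrementIfInt(Foo(41))); // 42
  print(incrementIfInt(Foo('not an int'))); // null
}
```

The cost the thread is debating is visible here: the name `daz` has to be chosen and declared one statement before the test that uses it.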
> With `var field from obj` it is 100% clear to the user that:
> - A variable is being introduced
> - What the name of the variable is
Agreed. What's less clear (I believe) is what the value of that variable is. And that's a larger concern to me because it's a runtime property of the program. They may think it has some value, think they are reading the code correctly, and have it do something unexpected at runtime. With the if-var syntax, if the reader is confused, they are confused about the static semantics and will notice their confusion before they try to run it.
Again, though, I agree if-var syntax is fairly magical too. It's hard to do anything that's more terse than just explicitly declaring a local variable without introducing some significant level of magic.
> Agreed. What's less clear (I believe) is what the value of that variable is. And that's a larger concern to me because it's a runtime property of the program.
Scoping is pretty relevant to the runtime behavior of the program.
> With the if-var syntax, if the reader is confused, they are confused about the static semantics and will notice their confusion before they try to run it.
I just don't see this. I agree that they are confused about scoping, but how is that better?
Let's try an example. Here's some existing code:
class Foo {
  int? x;
  void test(Foo other) {
    if (var other.x != null) {
      // Stuff
    }
  }
}
Hmm, I think when `other.x` is not null, I'm going to change this code to set `x` to `other.x`. Easy, right?
class Foo {
  int? x;
  void test(Foo other) {
    if (var other.x != null) {
      // Stuff
      x = other.x;
    }
  }
}
Why doesn't this execute in the way that I expect? Because the scoping is very relevant to the execution of the program: the fact that this said `var other.x != null` instead of `other.x != null` changes the semantics.
I hear the critique of this alternative syntax - I agree that there is a potential confusion to make when reading it. I don't really understand the argument that it is a different category or severity of confusion - to the contrary, I believe that confusion about scoping is one of the most fundamental confusions one can have about a program. Correct resolution of variables is probably the most fundamental cognitive task in programming - without it, you cannot even build a correct syntactic model of the program in your head, much less a correct semantic model.
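The scoping hazard described above can be reproduced in today's Dart by writing the shadowing local out by hand. The following is a sketch under an assumption: the `var x = other.x;` line stands in for what `if (var other.x != null)` would bind, and the method names `shadowed` and `explicit` are made up for this illustration.

```dart
class Foo {
  int? x;
  Foo([this.x]);

  // Hand-written stand-in for `if (var other.x != null) { x = other.x; }`.
  void shadowed(Foo other) {
    var x = other.x; // local `x` now shadows the field `this.x`
    if (x != null) {
      x = other.x; // assigns the local, so the field never changes
    }
  }

  // Same shape, but the assignment names the field explicitly.
  void explicit(Foo other) {
    var x = other.x;
    if (x != null) {
      this.x = x; // `this.` reaches past the local to the field
    }
  }
}

void main() {
  final a = Foo()..shadowed(Foo(7));
  print(a.x); // null
  final b = Foo()..explicit(Foo(7));
  print(b.x); // 7
}
```

Both methods compile, and only the second updates the field, which is the sense in which the choice of binding syntax changes runtime behavior.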
But you can beat it in terms of locality - exactly like C++ and golang do it.
if (var daz = foo.bar; daz is int) { print(daz + 1); }
FWIW, if we're not going to try to solve the "having to choose and bind a new name" problem (which is what the if-variables proposal was intended to solve) but we do believe that "locality" of the binding is worth having syntax for, something in this direction is what I'd be in favor of. But I'm not yet convinced that moving the `var daz = foo.bar` inside the `if` really provides enough value here to warrant new syntax: especially if we end up wanting to support the negative case and binding the variable in the other continuation. I'm not convinced that someone who was unhappy with the lack of field promotion before would suddenly think everything is roses because they can bind a variable with scope restricted to the `if`. Maybe they would be? I think I'd want to see some UX data on it before I'd believe it though.
> I agree that they are confused about scoping, but how is that better?
I'm speculating about the mental state of users, but with the original proposed syntax, if they are confused, their confusion is more likely to force them to stop and learn the feature before they proceed. I worry that with the syntax you propose here, they may be confused, think they understand the syntax, and then proceed to believe their program means something it doesn't.
I could be wrong, but that's my gut feel from looking at the syntax.
I added another exploratory proposal to consider using pattern matching to avoid what I see as the orthogonal faults with these two proposals here.
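As a point of comparison only, and not the syntax of the exploratory proposal mentioned above: Dart 3 later shipped `if`-`case` patterns, which test a value and bind it to a new local in one step. The sketch below assumes a Dart 3 or later toolchain; `Point`, `describe`, and `lengthOf` are hypothetical names chosen for the example.

```dart
// Hypothetical class with a field that cannot be promoted in place.
class Point {
  Object? x;
  Point(this.x);
}

String describe(List<Point> points) {
  // `case int x` tests the value of `points[0].x` and, on a match,
  // binds it to a new local `x` that is in scope in the then-branch.
  if (points[0].x case int x) {
    return x.isEven ? 'even' : 'odd';
  }
  return 'not an int';
}

int? lengthOf(String? maybe) {
  // `var s?` is a null-check pattern: it matches only non-null values.
  if (maybe case var s?) {
    return s.length;
  }
  return null;
}

void main() {
  print(describe([Point(4)])); // even
  print(describe([Point('a')])); // not an int
  print(lengthOf('abc')); // 3
  print(lengthOf(null)); // null
}
```

Here the tested expression stays fully visible in the source and the new name is written out, which addresses the two readability concerns raised against `var points[0].x` and `var x from points[0]` respectively.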
I'll admit that I find `var x from this` completely impenetrable.
As @munificent says, it's clear that `x` is a variable, but it's completely not clear what the value is. The actual expression being evaluated does not occur in the source. It also only works on getter accesses, likely only instance getter accesses (unless `var parse from int` is a valid way to extract a static method into a variable).
If it was `var x from this.x` it'd be clearer, but then we might as well just do `var x = this.x` as an expression (which I have proposed, and I do want in general as a kind of (better) `let` construct).
What `if`-variables achieve over this more general feature is to avoid repeating the name (and, possibly, avoid some parentheses). It costs the ability to bind any expression which doesn't end in an identifier. You can't do `var elements[0] is int` as a test because it has no name (but if we allowed `(var first = elements[0]) as int` you'd have a fallback).
If we want to highlight the name in the declaration, which is what I think the `from` notation is trying to do, and still (mostly) maintain the actual expression, I'd prefer `this.var x`, where we introduce the `var` just before the naming selector. Still only works for named getters.
If the only expression that can follow `var` in an expression context is a selector chain, and we take the name of the last selector, then moving the `var` into the chain, just before that name, should always be possible.
It's not awesome syntax, `foo.bar.var baz` is broken up by the space, making it less readable.
The syntax might allow `foo.var bar.var baz.qux` for binding sub-expressions too, even if they are not immediately tested.
(In general, I don't think allowing binding expressions only for testing is optimal, we might as well allow them everywhere.)
About scoping, I'd very much like the variable to be in scope any place inside the same block/construct that is also dominated by the introduction. (Basically, where an uninitialized local variable in the same block/structure would be definitely assigned by an assignment in the same place as the `var` expression.) Tying it to a test, and only some of the branches, seems unnecessarily restrictive, and something people will be encouraged to work around, not with.
I'd love if I could do:
if (var o.someFutureOr is Future<T>) {
// someFutureOr is Future<T>
} else {
// someFutureOr is T
}
> I think we can have a better version of it:
> if (foo.bar is int var daz) { print(daz + 1); }
>
> Serious question: why is that better than this?
> var daz = foo.bar; if (daz is int) { print(daz + 1); }
> This version has two extra characters, and is something you can do today, so why add a whole new syntax for it?
IMO there are a couple of reasons... One is boredom.
Anything that makes us go back to "fix" something is seen badly by our brain.
How boring is it when you have to go back in your code to change some already established logic because a new tool you are developing doesn't fit the current process flow? Or a simple modification, like having to change `final` to `var` when you need to update the variable?
Simply put: we don't like to go backwards. Even in the smallest things.
So a suffix is easier for our brains to catch up with, i.e.: "I am checking if this tool is nice. Hmm, now I will use it to build this block" - moving forward, yay!
With this pattern, post-declaring the variables within `if`s will go smoothly and naturally.
"Now I am checking if this tool is nice. Let me use it. Damn, I need to say that I will use it before I check if it is nice, let me go back and declare it."
I am not sure if I was able to explain, but the summary is that the actions, the verbs, need to go forward so our brain can automatically add the keywords that bundle a piece of logic.
But I do agree with you that the examples you quoted aren't the most productive; that is the reason that I would rather use the `use` keyword:
Map foo(Bar bar, [Foo? foo]) => {
if (bar.xyz is Foo use)
'name': xyz.name,
if (foo?.xyz != null use)
'secondName': xyz.name
}
I actually don't think it's very hard to read the `from` syntax, and I like the fact that it keeps `var` and the variable name together. However, as @lrhn mentioned, we did have other proposals with the same property:
@lrhn wrote:
> I'd prefer `this.var x`, where we introduce the var just before the naming selector. Still only works for named getters.
`this.var:x` is the way that's done in the binding expressions proposal. That isn't strictly limited to named getters: I included the case where an instance member invocation passes arguments, using the method name as the default name of the new variable, and allowing for a new name if needed.
void main() {
var s = 'This is a long string of text';
print('${s.var:substring(8)} ${substring.var s1:substring(7)} $s1 $s1 $s1.');
// Prints 'a long string of text string of text string of text string of text.'
}
It is of course a rather unprincipled approach to use the method invocation as the source of the value of the new variable (rather than tearing off the method), but I suspect that it's much more usable in practice.
> IMO there are a couple of reasons... One is boredom. Damn I need to say that I will use it before I check if it is nice, let me go back and declare it.
"Boring" is exactly the right word here. I'd say it's even worse: it's boredom mixed with exasperation. That's what makes the 2-clause "if" statement so good: it doesn't feel boring. The point is that whenever you use "if (expr ...)", you most often will need the same `expr` in the "then" clause, so it will soon become your second nature to think about introducing a local variable: `if (var s=expr; cond)`. It's good not only for non-null promotion - it helps to avoid repetition, thus making things less boring, not more.
I really wish I didn't have to type a full declaration inside an `if` like that.
@tatumizer wrote:
> But WHY? What is the goal?
The goal is to allow us as developers to capture the value of an expression used as a part of a bigger construct (say, a bigger expression or an `if` statement, etc.), introducing a new variable with a suitable scope. For instance:
class C {
  num n;
  ...
  void foo() {
    if (var:n is int) {
      n.isEven; // The new variable is in scope and promoted.
    } else {
      foo(n); // The new variable is in scope.
    }
    // The new variable is out of scope here.
  }
}
`var:n` introduces a new variable named `n` and initializes it with the value of `n` in the enclosing scope (which is the body scope of the function `foo`, because every `if` statement is enclosed by a new scope; so that's the instance variable). This is concise, rather general, and, I think, reasonably readable.
So why not? ;-)
what about:
if (obj.@prop != null)
print(prop); // prop local and not null
if (obj.@prop is Foo)
print(prop); // prop is Foo
for array index checks:
if (obj.elements@{i}[0] != null) {
print(i); // i is local and not null
}
Property checks like `obj.@prop` are probably the default, so I don't think it is a big deal to add `{}` when naming index checks.
"Readable" is a very subjective word :)
I actually like `var s = expr; expr` better than most of the alternatives. It follows an existing pattern from `for`: `(initializer; test`.
I'm slightly worried that if we allow it in `while` statements too, `while (var x = this.x; x != null)` will suggest (because of the `for` precedent) that the declaration is only executed once, and I'd probably want to execute it on each iteration. Then again, the `for` has the increment part for later iterations, this one doesn't.
Doesn't read that well inside an actual `for`: `for (;(var x = this.x; x != null);) { this.x = x.next; }`.
(Personally, I like the in-place `var`-declaration: `if ((var x = this.x) != null) { ... x ... }`. Readability is subjective.)
We could also compare it with the following:
if (var r: rechtsschutzversicherungsgesellschaften is int) {
r.isEven; // The new variable is in scope and promoted.
} else {
foo(r); // The new variable is in scope.
}
Nobody says you have to use the existing name, that's probably not required for any of these many proposals. ;-)
> shadowing should be frowned upon rather than welcomed
Who knows? Shadowing should be frowned upon when it occurs by accident, such that one declared entity is far too easily believed to be another one, typically with no other connection between the two entities than the unfortunate choice of the same identifier. That's a footgun.
However, in the case where the two entities are closely related it could be a useful style rule to encourage shadowing. For example, we may need to access a given object using a local variable such that we can use promotion, and we'll initialize that variable by evaluating a non-local variable. In that case the two variables are conceptually "the same thing". This is definitely a very different situation than the footgun mentioned earlier. If the two variables do end up having different names (for whatever reason) then we might actually want to pair up the two names by making them similar (e.g., `foo` and `localFoo`).
In other cases we're not initializing the new variable from an existing one (e.g., we could initialize it from any expression, and we could take a default name for the variable from part of that expression; for instance, getter invocations like `a.b` have been mentioned many times as a device that yields the default name `b`). Of course, in these cases there is no shadowing.
> none of these proposals lead to a shorter or more readable or more intuitive or more familiar syntax than plain 2-clause if
Well, you can't really deny that `a.var:b is T` is shorter than `var b = a.b; b is T`.
Familiarity is really difficult to prioritize. It may help you for about 5 seconds, and it's important for the community that as few as possible are turning away from the language immediately because it is full of unfamiliar constructs. But I do think that long term qualities are more important than familiarity.
Being intuitive basically means having properties (affordances) that are quickly and effortlessly available to the mind of a reader of the source code: You don't have to think hard and in terms of explicit reasoning in order to understand how to use it. I think that may occur because of familiarity, but, more importantly, I think it may occur in a more profound sense because of semantic consistency: It may well be based on an implicit and approximate kind of thinking (that's what makes it "intuitive" rather than just "logical and consistent"). But if the language can be understood in terms of a minimal number of semantic rules that are applied consistently across many different language constructs and concepts, then a reader of the source code can rely on those rules to understand systematically and precisely what a given snippet of code does. That's basically the notion of orthogonality in language design.
On top of this comes the need for abstraction: If we eliminate nearly all language features (think: BASIC 1977) then the code is immediately easy to understand, line for line, but it may be a lot harder to understand a larger software artifact like a whole program, because we're drowning in a large amount of repetitive code, unable to spot the big picture.
So familiarity is always a good starting point, but if we need to be a little bit unfamiliar in order to have a consistent language design then I honestly do believe that we should keep the long term qualities of the language in mind.
When it comes to `a.var:b is T` vs. `var b = a.b; b is T`, the former is more concise (it doesn't repeat `b` three times), but the latter reuses a complete local variable declaration, and it may seem to be immediately readable. However, the reader of the latter form would need to understand some new rules about the scoping, and that could be considered a consistency issue.
The trade-off is not simple. That's probably the reason why we're having so much fun. ;-)
The two-clause `if` can definitely work. It doesn't necessarily mean that it also works for `while`, but it really should, and for the conditional expression (`?`/`:`) too.
They have the same issues, and should be provided with the same solutions. I don't see why `if` is special, other than it's the most common use-case.
If anything, it's more important for `while` and the conditional expression. For `if`, the difference between:
if (var x = this.x; x != null) ...
and
var x = this.x;
if (x != null) ...
is tiny, you can always have a declaration just before the `if`. For `while`, the variable declaration is evaluated on each iteration:
Node node = ...;
while (var next = node.next; next != null) {
node = next;
}
You can't do that by moving the variable outside. Or rather, you can, because it's nullable, but then it's:
Node node = ...;
Node? next;
while ((next = node.next) != null && next != null) {
node = next;
}
(The `&& next != null` is needed because `(next = node.next) != null` doesn't promote `next`. It should. See #1420.)
For the conditional expression, you can't have a declaration just before the expression. So, `if` is the least interesting case.
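To make the `while` case concrete, here is one way the traversal can be written in current Dart when the declaration has to live outside the loop. It is a sketch, not the proposed syntax: the `Node` class and the function name `last` are assumptions for illustration.

```dart
// Hypothetical singly linked list node.
class Node {
  Node? next;
}

/// Walks to the last node of the list that starts at [node].
Node last(Node node) {
  // The declaration sits outside the loop, so the variable has to be
  // re-assigned by hand at the end of every iteration.
  var next = node.next;
  while (next != null) {
    node = next; // `next` is promoted to Node by the loop test
    next = node.next;
  }
  return node;
}

void main() {
  final a = Node(), b = Node(), c = Node();
  a.next = b;
  b.next = c;
  print(identical(last(a), c)); // true
}
```

The initializer `node.next` appears twice, once before the loop and once at the end of the body, which is the repetition a per-iteration declaration in the `while` header would remove.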
We also have to define what the scope of the variable is, because we do want it to extend to the body of the `if` (`while` and `?`/`:`), and to both branches. Probably not anything outside of that.
If we only allow the variable inside tests and extend the scope to the following branches, then it's well-defined.
HOWEVER (ran out of emphasis there), if we allow it for two-clause tests in the conditional expression, we have effectively introduced an ugly `let` expression:
(var x = anything; false)?_: somethingUsing(x, x)
where `_` is a helper getter with type `Never`.
That's an expression local declaration. It's ugly, but it's there. If we have the functionality anyway, we should embrace it and let you have a non-ugly version too.
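For comparison, an expression-local binding can already be approximated in today's Dart with an immediately invoked closure, at the cost of extra parentheses and a function call. This is a sketch of that workaround, not of the proposal; `somethingUsing`, `expensive`, `calls`, and `viaClosure` are hypothetical names used only here.

```dart
int calls = 0;

// Hypothetical stand-in for `anything`; counts how often it is evaluated.
int expensive() {
  calls++;
  return 6;
}

int somethingUsing(int a, int b) => a * b;

// The closure parameter `x` plays the role of the expression-local
// variable; the argument `expensive()` is evaluated exactly once.
int viaClosure() => ((int x) => somethingUsing(x, x))(expensive());

void main() {
  print(viaClosure()); // 36
  print(calls); // 1
}
```

Unlike a real declaration expression, the closure introduces a function boundary, so this form is mainly useful for seeing what semantics a non-ugly `let` would need to match.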
So, as @mit-mit suggested to me, what if `declaration;expression` is an expression everywhere (except a choice few places where we disallow it unparenthesized for syntactic reasons, like directly inside a `for(;;)` header)?
expression ::= declaration_expression | expression_no_decl
declaration_expression ::= var_declaration `;' expression_no_decl
expression_no_decl ::= ... current expression ...
and we define that the scope of the declaration is the following expression, and any construct can say that variables definitely declared in one part also apply in another:
- `?`/`:`, if the test is a declaration expression, its variables apply to the branches as well.
- `??`, `&&` and `||`, if the first expression is a declaration expression, its variables apply in the second expression too.
- `if`, `while` and `for(;;)` statements, if the test is a declaration expression, its variables apply to the branches/body too.
(Maybe "is a declaration expression, possibly wrapped in any number of parentheses").
> IMO there are a couple of reasons... One is boredom. Anything that makes us go back to "fix" something is seen badly by our brain.
@jodinathan Thanks for walking through your reasoning. I agree there's a possibly non-rational but nonetheless very real difference there. Another way of phrasing it in my mind is that since you end up tying the variable declaration to the `if`, it feels ad hoc to have to juxtapose two otherwise unrelated statements in order to achieve what you want.
@lrhn
> So, as @mit-mit suggested to me, what if `declaration;expression` is an expression everywhere (except a choice few places where we disallow it unparenthesized for syntactic reasons, like directly inside a `for(;;)` header).
This is, of course, just a syntax for a `let` expression, which you have been arguing against elsewhere... :)
> and we define that the scope of the declaration is the following expression and any construct can say that variables definitely declared in one part also applies in another:
> - `?`/`:`, if the test is a declaration expression, its variables apply to the branches as well.
> - `??`, `&&` and `||`, if the first expression is a declaration expression, its variables apply in the second expression too.
> - `if`, `while` and `for(;;)` statements, if the test is a declaration expression, its variables apply to the branches/body too.
This starts to feel kind of ad hoc, but maybe I could learn to live with it. I think you probably want the variable to apply in the failure continuation of an `if` regardless of whether that continuation is an `else` branch or just the subsequent statement though.
I would like to point out that if we are not going for a suffix style, then the @eernstg proposal seems more robust in my opinion. It is a different syntax; however, it leaves fewer doubts about what is happening:
class C {
  num n;
  ...
  void foo() {
    if (var:n is int) {
      n.isEven; // The new variable is in scope and promoted.
    } else {
      foo(n); // The new variable is in scope.
    }
    // The new variable is out of scope here.
  }
}
I also think we could think about how to make it shorter. Maybe using something like `@`:
if (obj.@prop != null)
print(prop); // prop local and not null
if (obj.@prop is Foo)
print(prop); // prop is Foo
// naming fits nicely because it is very close to string concatenation pattern:
if (obj.elements@{i}[0] != null) {
print(i); // i is local and not null
}
One of the primary concerns with the original if-vars issue was that the code can be surprising to those that try and read it without knowing about the feature. I think that is a fair concern. I think this issue is caused by using a familiar keyword (`var`), and giving it new semantics in `if` statements. So, inspired by the let discussion above, how about using an actual `let` keyword? That makes it clear that this is a different construct from `var`, and if a reader doesn't know what it means, they are aware of it being different, and can google something like "dart let if" and hopefully be taken to a page explaining it.
It would look like this in the case with renaming:
if (let m = myNullableField; m != null) {
m.method1();
}
It would be nice to also support not having to come up with a new variable name if you don't want one. Based on earlier discussions of other places where we'd like to refer to something that hasn't been named (e.g. https://github.com/dart-lang/language/issues/265), how about using `it` for that? It would look like this:
if (let myNullableField != null) {
it.method1();
}
What about Zig's `if`s?
if (obj.prop is Foo) |prop| print(prop);
if (first != null && second != null) |one, two| {
print(one + two);
}
if (first != null && second != null) |_, it| {
it.call();
}
if (first != null || second != null) |it| {
fn(it);
}
or
if (obj.prop is Foo) : obj.prop {
print(prop);
}
if (first != null && second != null) : one, two {
print(one + two);
}
if (first != null && second != null) : _, it {
it.call();
}
if (first != null || second != null) : it {
fn(it);
}
@mit-mit wrote, about in-expression declarations using `var`:
> the code can be surprising to those that try and read it without knowing about the feature
Indeed, but I don't think it's realistic to expect any language mechanism to be precisely "guessable" at first sight; it's reasonable to require the language mechanism to be introduced to every developer who's going to use it, at least briefly. But it would be helpful if we can rely on existing knowledge to provide most of the information.
Note that it was part of the design of the binding expressions (#1210) that they use `var` with `:`, such that it's visibly different from a regular local variable declaration (reminding the reader that it behaves differently as well), and still contains `var` as a reminder that it is a declaration.
I've split out the proposal for if-scoped variables into an issue here. I think all of the other basic proposals are covered by some issue somewhere.
I'll leave this issue open for a bit longer in case there's any follow on discussion folks want to have, but I think it's clear there's not much buy in on the team for the approach described here.
In this issue, @munificent proposes a way of re-using a property name as a local variable in a specific scope to enable promotion on properties. The core idea is appealing, but I find the syntax surprising and unapproachable. I find it hard to believe that a developer not already familiar with the feature would understand what was happening with (for example) this code:
This issue is to explore an alternative syntax for the same feature. That is, I don't propose any changes to the underlying semantic notion - merely a change in syntax. The above proposal re-uses the existing property access syntax to produce a new variable binding syntax. I propose instead to start with the existing variable binding syntax and add a new way to access a property. Specifically, I propose to replace the general syntactic form
- `var e.x is T` with `var x from e is T`;
- `var e.x == null` with `var x from e == null`;
and similarly for the negative forms. When `e` is an implicit `this` access, I propose to require that the `this` be explicitly specified (more on that below).
Simple Example
Continuing by example, using the examples from the above proposal.
Here, I require the implicit `this` to be made explicit. We could choose not to do this for the sake of brevity, but I believe that making this explicit makes the code much more readable. I find the `if (var obj is int)` syntax particularly impenetrable, since it is so very close to the existing variable binding syntax. I also believe that the vast majority of cases of interest are not accesses on `this`, but rather accesses on other objects.
Promoting on null checks
Promoting on getters
Negative if-vars
Worked example
New syntax for the worked example from here:
New syntax from the worked example from here.
cc @munificent @lrhn @eernstg @jakemac53 @natebosch @stereotype441 @mit-mit