dart-lang / language

Design of the Dart language

If-variables #1201

Open munificent opened 4 years ago

munificent commented 4 years ago

This is a proposal for how to handle the lack of field promotion with null safety (though it covers more than just that).

The big hammer in the language for making nullable types usable is flow analysis and type promotion. This lets the imperative code that users naturally write also seamlessly and soundly move nullable variables over to the non-nullable type where the value can be used.

Unfortunately, this analysis isn't sound for fields and getters, so those do not promote:

class C {
  Object obj;
  test() {
    if (obj is int) obj + 1; // Error. :(
  }
}
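
To make the soundness problem concrete, here is a small sketch that runs in today's Dart. The subclass Flipping and the method describe are hypothetical names invented for this illustration. The point is only that a getter override can return a different type on every read, so a type test on one read says nothing about the next:

```dart
class C {
  final Object obj = 0;

  // Reads `obj` twice, the way code relying on field promotion implicitly would.
  String describe() {
    if (obj is int) {
      final second = obj; // A second, independent read of the getter.
      return second is int
          ? 'still an int'
          : 'changed to ${second.runtimeType}';
    }
    return 'not an int';
  }
}

// Hypothetical subclass: overrides the field with a getter whose result
// alternates between an int and a String.
class Flipping extends C {
  int _reads = 0;

  @override
  Object get obj => (_reads++).isEven ? 42 : 'forty-two';
}

void main() {
  print(C().describe()); // still an int
  print(Flipping().describe()); // changed to String
}
```

Since the compiler cannot rule out a subclass like Flipping when it analyzes C, it has to reject obj + 1 in the example above.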

One option is to enable promotion in the cases where using the field is sound, but the boundary there is subtle, it's easy to move a variable across it, and it may be too narrow to cover most cases. Another option is to automatically promote fields when failure to do so would cause a static error. That trades static errors, which let users know their code is unsound, for runtime errors that could cause their program to crash.

Given that the trend in Dart is away from code that may silently fail at runtime, I'm not enthusiastic about the latter approach. This proposal describes a feature called "if-variables" that is local, sound, efficient, explicitly opted in (while being concise), cannot fail at runtime, and covers a larger set of painful cases than any of the other proposals.

It looks like this:

class C {
  Object obj;
  test() {
    if (var obj is int) obj + 1; // OK!
  }
}

Basically, take the code you would write today that doesn't promote and stick a var (or final) in front of the if condition. That keyword means "if the type test succeeds, bind a new local variable with the same name but with the tested type". In other words, the above example is roughly syntactic sugar for:

class C {
  Object obj;
  test() {
    if (obj is int) {
      var obj = this.obj as int;
      obj + 1; // OK!
    }
  }
}

This binds a new local variable. That means reading it later does not read the original backing field and assigning to it does not assign to the field, only to the local variable. This is what makes the proposal efficient and sound. The var keyword should hopefully make it clear enough that there is a new local variable in play.
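
As a rough illustration of those semantics, the following is valid in current Dart and mimics the behavior by hand. It is only a sketch: the constructor and the int? return type are added so the example can run, and the local is declared before the test rather than inside the then branch as in the desugaring above.

```dart
class C {
  Object obj;
  C(this.obj);

  int? test() {
    // Hand-written stand-in for `if (var obj is int)`: copy the field into
    // a local once, then test and promote the local.
    var obj = this.obj;
    if (obj is int) {
      obj = obj + 1; // Writes the local only; the field is untouched.
      return obj;
    }
    return null;
  }
}

void main() {
  final c = C(41);
  print(c.test()); // 42, the value of the local
  print(c.obj); // 41, the field is unchanged
}
```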

Promoting on null checks

You can also use if-var with nullability checks:

class C {
  int? n;
  test() {
    if (var n != null) n + 1; // OK!
  }
}

Promoting getters

The tested value can be any expression as long as it ends in a named getter:

class C {
  List<Point<num>> points = [Point(1, 2)];
  test() {
    if (var points[0].x is int) x.isEven; // OK!
  }
}

In this case, the last identifier in the selector chain is the one whose name is used for the newly bound variable. The expression is only evaluated once, eagerly, and the result is stored in the new variable.

So not only does this let you promote a getter, it gives you a very nice shorthand to access the value repeatedly.
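
The single evaluation can be imitated in today's Dart with an explicit local. In this sketch the reads counter and the private _points list are assumptions added purely to make the number of evaluations observable:

```dart
import 'dart:math';

class C {
  int reads = 0;
  final List<Point<num>> _points = [Point(1, 2)];

  // Counts every access so the number of evaluations is observable.
  List<Point<num>> get points {
    reads++;
    return _points;
  }

  bool test() {
    // Hand-written stand-in for `if (var points[0].x is int)`.
    final x = points[0].x; // The whole chain is evaluated exactly once.
    if (x is int) {
      return x.isOdd && x > 0; // Two uses of x, no further reads of `points`.
    }
    return false;
  }
}

void main() {
  final c = C();
  print(c.test()); // true
  print(c.reads); // 1
}
```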

Negative if-vars

The above examples all test that some value has a promotable type. You can also test that the variable does not have the type and then exit:

class C {
  Object obj;
  test() {
    if (var obj is! int) return;
    obj + 1; // OK!
  }
}

When using is! or == null, the then branch of the if statement must exit by return, throw, etc. The newly bound variable goes into the block scope surrounding the if statement and continues to the end of the block. In other words, the desugaring is something like:

class C {
  Object obj;
  test() {
    int obj;
    if (this.obj is! int) return;
    obj = this.obj as int;
    obj + 1; // OK!
  }
}

Proposal

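There are basically two separate statements here: a positive if-var, which tests with is or != null and scopes the new variable to the then branch, and a negative if-var, which tests with is! or == null and scopes the new variable to the rest of the surrounding block.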

Here is a somewhat more precise description. We change the grammar like so:

ifStatement         ::= "if" "(" expression ")" statement ( "else" statement )?
                      | positiveIfVariable
                      | negativeIfVariable

positiveIfVariable  ::= "if" "(" ifVariable positiveTest ")" statement ( "else" statement )?
negativeIfVariable  ::= "if" "(" ifVariable negativeTest ")" statement

ifVariable          ::= ( "var" | "final" ) ifValue
ifValue             ::= ( ( primary selector* | "super" ) ( "." | "?." ) )? identifier
positiveTest        ::= "is" typeNotVoid | "!=" "null"
negativeTest        ::= "is" "!" typeNotVoid | "==" "null"

As far as I know, this is unambiguous and compatible with the existing grammar.

Positive if variables

It is a compile time error if the then statement is a block that declares a local variable whose name is the same as the identifier in ifValue. In other words, the new variable goes in the same block scope as the then block and you can't have a collision.

To execute a positiveIfVariable:

  1. Evaluate the expression ifValue to a value v.
  2. Use that value to perform the appropriate type or null test in the positiveTest. If the result is true:
    1. Create a new scope and bind the identifier from ifValue to v.
    2. Execute the then statement in that scope.
    3. Discard the scope.
  3. Else, if there is an else branch, execute it.

Negative if variables

It is a compile time error if the end of the then statement is reachable according to flow analysis.

It is a compile time error if the block containing the if-var statement declares a local variable whose name is the same as the identifier in ifValue. The scope of the declared variable begins before the if-var statement and ends at the end of the surrounding block. The variable is considered definitely unassigned inside the then branch of the if-var statement and definitely assigned afterwards.

To execute a negativeIfVariable:

  1. In the current scope, declare a new variable named with the identifier from ifValue.
  2. Evaluate the expression ifValue to a value v.
  3. Use that value to perform the appropriate type or null test in the negativeTest. If the result is true:
    1. Execute the then statement.
  4. Else:
    1. Assign v to the variable.
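
The steps above can be sketched in today's Dart using a late-assigned final local, which flow analysis already treats as definitely unassigned until the assignment. The function negativeIfVar and its readValue parameter are hypothetical, standing in for whatever expression ifValue names:

```dart
int? negativeIfVar(Object Function() readValue) {
  final int obj; // Step 1: declared, but definitely unassigned so far.
  final v = readValue(); // Step 2: evaluate the value once.
  if (v is! int) {
    // Step 3: the then branch must exit. Reading `obj` here would be a
    // compile-time error because it is definitely unassigned.
    return null;
  }
  obj = v; // Step 4: `v` is promoted to int here, so this assignment is safe.
  return obj + 1; // `obj` is definitely assigned from here on.
}

void main() {
  print(negativeIfVar(() => 1)); // 2
  print(negativeIfVar(() => 'text')); // null
}
```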

Questions

Compatibility?

Since this claims new, currently unused syntax, it is backwards compatible and non-breaking. We can add it before or after shipping null safety.

Is the local variable's type declared to be the promoted type or promoted to it?

In other words, is the desugaring like:

class C {
  Object obj;
  test() {
    if (obj is int) {
      int obj = this.obj as int;
    }
  }
}

Or:

class C {
  Object obj;
  test() {
    if (obj is int) {
      Object obj = this.obj as int;
    }
  }
}

I suggest the former, mainly because this prevents assigning a value of an unexpectedly wide type to the local variable. Attempting to do so likely means the user thinks they are assigning to the original field and not the shadowing local variable. Making that a static error can help them catch that mistake.
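
To show what the second desugaring would permit, here is a runnable sketch in today's Dart. The constructor and the String return type are assumptions added so the example can run:

```dart
class C {
  Object obj;
  C(this.obj);

  String test() {
    if (obj is int) {
      // Second desugaring: the local is declared as Object, so any Object
      // may be assigned to it. With the first desugaring
      // (`int obj = this.obj as int;`) the assignment below would be a
      // compile-time error instead.
      Object obj = this.obj as int;
      obj = 'not an int anymore'; // Compiles, and only changes the local.
      return obj.toString();
    }
    return 'no match';
  }
}

void main() {
  final c = C(1);
  print(c.test()); // not an int anymore
  print(c.obj); // 1, the field never changed
}
```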

What about pattern matching?

You can think of this feature as a special pattern matching construct optimized for the common case where the value being matched and the name being bound are the same. I think it's unlikely that this syntax will clash with a future syntax for pattern matching, even if we allow patterns in if statements. The var foo is Type syntax is pretty distinct because it mixes both a little bit of an expression and a bit of a pattern.

What about other control flow combinations?

The positive and negative forms allowed here don't cover every possible valid combination of control flow, scoping, and unreachable code. In particular, we could also allow:

class A {
  Object obj;
  test() {
    if (var obj is! int) {
      ...
    } else {
      obj; // If-variable in scope here.
    }
    // And not here.
  }
}

This isn't particularly useful. You can always swap the then and else cases and turn it into a positive conditional variable.

Also:

class B {
  test() {
    if (var obj is! int) {
      return;
    } else {
      obj; // If-variable in scope here.
    }
    obj; // And also here.
  }
}

There's no real value in allowing an else clause when the then always exits. You can just move the code out of the else to after the if.

Finally:

class C {
  test() {
    if (var obj is int) {
      obj; // If-variable in scope here.
    } else {
      obj; // Definitely unassigned here?
      return;
    }
    obj; // If-variable in scope here too.
  }
}

This one is particularly confusing, since there's a region in the middle where you really shouldn't use the variable.

I don't propose we support these forms. I want it to be clear to users when the conditional variable is scoped to the if statement's then branch and when it goes to the end of the surrounding block. The fewer forms we support, the easier it is for users to understand that.

jefflim-google commented 3 years ago

+1 to the idea of if-variables.

One thought: would 'final' be more aligned given its existing meaning, or would 'var' and 'final' both be usable, where 'var' would allow writing to the local variable of the restricted type? Having 'a' be assigned inside the statement seems like it could lead to confusing code.

if (final a != null) {
  // Use a as non-null, but assignment to 'a' is not allowed.
}

munificent commented 3 years ago

would 'final' be more aligned given its existing meaning, or would 'var' and 'final' both be usable

The proposal allows both var and final, yes.

jefflim-google commented 3 years ago

If 'var' is used, does writing to it affect the underlying value, or a shadow copy of it? From the proposal, it seems like a local copy.

if (var someObject.a != null) {
  a = 1; // Seems like this just writes to a local 'a'.
}

In this case, any thoughts on how private variable names should work?

if (final _fieldValue != null) {
  // `_fieldValue` or `fieldValue` here?
}

Using a local fieldValue (rather than _fieldValue) here would help reinforce that it is a local variable that doesn't affect the underlying variable, and help to remain aligned with effective dart's "DON’T use a leading underscore for identifiers that aren’t private."

lrhn commented 3 years ago

You introduce a new variable with the chosen name. That means it shadows the original thing you read. If you want to write back to the original, you need to write someObject.a = 1;.

The new variable has the same name as the variable (or getter) that was read to create it. That's the variable that you now have the value of, and therefore don't need to read again. If we changed the name from _fieldValue to fieldValue, we'd now potentially shadow another variable named fieldValue, and we'd then need a way to avoid that. It's much simpler to just use the same name. (Also, we never treat name and _name as being related in any other place, they are different names, as different as name1 and name2.)

jefflim-google commented 3 years ago

(Also, we never treat name and _name as being related in any other place, they are different names, as different as name1 and name2.)

Perhaps going off on a tangent, but might this. constructor forms also potentially have this kind of relationship in the future for private initialization?

class MyClass {
  MyClass.positional(this.value);
  MyClass.optionalPositional([this.value]);
  MyClass.named({required this.value});
  MyClass.optionalNamed({this.value});

  final int value;
}

But for privates:

class MyClass {
  MyClass.positional(this._value);
  MyClass.optionalPositional([this._value]);

  // Not possible: MyClass.named({required this._value});
  // Not possible: MyClass.named({required this.value});
  MyClass.named({required int value}) : _value = value;

  // Not possible: MyClass.optionalNamed({this._value});
  // Not possible: MyClass.optionalNamed({this.value});
  MyClass.optionalNamed({int value}) : _value = value;

  final int _value;
}

Note: Field parameters shadowing other variables can happen, even in the current proposal.

var a = 0;
if (var otherObject.a != null) {
  // No way of accessing var a above here.
}

Though I readily admit keeping underscore for the local is probably easier conceptually and implementation wise.... just writing some random thoughts.

leafpetersen commented 3 years ago

I do wonder whether we want to either make these variables always final by default, or only allow the final form. It feels to me like there's a nasty footgun here when you're dealing with fields from your own class. It feels extremely easy to forget that you've shadowed a field with a variable and to try to write back to it:

class C {
  Object obj;
  test() {
    if (var obj is int) {
       obj = obj + 1; // PROBABLY NOT WHAT YOU WANTED TO DO!
    }
  }
}

Levi-Lesches commented 3 years ago

That sounds a lot like the shadow proposal, #1514. As an example:

class C {
  Object obj;

  void test() {
    shadow obj;  // references to obj are local, but are synced to the field
    if (obj is int) obj++;  // increments both the local variable AND the field
  }
}

jodinathan commented 3 years ago

I do wonder whether we want to either make these variables always final by default, or only allow the final form. It feels to me like there's a nasty footgun here when you're dealing with fields from your own class. It feels extremely easy to forget that you've shadowed a field with a variable and to try to write back to it:

class C {
  Object obj;
  test() {
    if (var obj is int) {
       obj = obj + 1; // PROBABLY NOT WHAT YOU WANTED TO DO!
    }
  }
}

At first glance, this seems like the exception rather than the common case.

jodinathan commented 3 years ago

I liked the idea but this is so weird to read:

if (final o=obj, o1=obj1; o is int && o1 is int) {
   // use o, o1
}

I think we should make it a suffix, because that is how our brain separates actions, i.e. do this, then do that.

if (obj is int use obj && arg != null use safeArg) {
   // the new local obj is final and an int
   // the new local safeArg is final and non null
}

Because the word use is a verb (it denotes an action) and is placed as a suffix, I think there is very little chance that the above misleads anyone about what is happening.

But it could be var or final:

if (obj is int var obj && arg != null final safeArg) {
   // the new local obj is variable and an int
   // the new local safeArg is final and non null
}

jodinathan commented 3 years ago

Used C/C++ a lot 15 years ago. I wouldn't choose all things from it blindly.

if (final o=obj, o1=obj1; o is int && o1 is int) {
   // use o, o1
}

Would you mind explaining the advantages here?

For me it is basically the same amount of work that we already do:

 final o=obj, o1=obj1;

 if (o is int && o1 is int) {
    // use o, o1
 }

In fact, reading again, I find the current Dart syntax better to read than the C++ version.

ykmnkmi commented 3 years ago

Similar to Python and Go walrus operator, but shorter, like labeled blocks:

if (final user: object is User && /* var */ data: user.data /* != null */) {
  data = callback(data);
  // ...
  send(user.email, messageFor(data));
}

jodinathan commented 3 years ago

Similar to Python and Go walrus operator, but shorter, like labeled blocks:

if (final user: object is User && /* var */ data: user.data /* != null */) {
  data = callback(data);

IMO it looks like it returns a bool from object is User

stereotype441 commented 3 years ago

C# apparently has a pretty tight syntax for this sort of thing that allows the user to specify a variable name:

if(a.b is int y) {
  doSomethingWithInt(y);
}

munificent commented 3 years ago

C# apparently has a pretty tight syntax for this sort of thing that allows the user to specify a variable name:

if(a.b is int y) {
  doSomethingWithInt(y);
}

Yeah, I considered that. Like @tatumizer notes, it doesn't extend gracefully to == null checks. Also, it does require you to come up with a name for the bound variable, which is both a pro (explicit) and con (redundant).

rubenferreira97 commented 1 year ago

With pattern matching coming out, could we also extend this proposal to allow when-variables?

Example:

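import 'dart:convert'; // jsonDecode
import 'package:collection/collection.dart'; // firstWhereOrNull

enum Difficulty {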
  easy(1, "Easy"),
  medium(2, "Medium"),
  hard(3, "Hard");

  static Difficulty? fromId(int id) => Difficulty.values.firstWhereOrNull((d) => d.id == id);

  const Difficulty(this.id, this.designation);
  final int id;
  final String designation;
}

class Game {
  final Difficulty difficulty;
  final int levels;
  final String name;
  Game(this.difficulty, this.levels, this.name);
}

void main() async {
  final json = jsonDecode('{"name": "name", "levels": 100, "difficultyId": 1}');
  if (json case {
    'name': final String name,
    'levels': final int levels,
    'difficultyId': final int difficultyId,
  } when (final difficulty = Difficulty.fromId(difficultyId) != null)) {
    final game = Game(difficulty, levels, name);
    // insert game and return 200 ok
  }
  // return 400 bad request
}

heralight commented 1 year ago

A more generic approach could be something like the C# 'out' parameter modifier: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/out-parameter-modifier

For example instead of:

final uri = Uri.tryParse(str);
if (uri != null) {
    ...
}

you will have something like:

if (Uri.tryParse(str, out uri)) {
   ... play with uri
}

and the tryParse signature:

bool tryParse(String str, out Uri uri)

best regards,

Alexandre

lrhn commented 1 year ago

The C# out parameter is just a reference parameter, with the added requirement that the method does assign to the variable. They allow you to declare the variable in-place.

Which means that it wouldn't work for tryParse which won't assign if it doesn't match. Or rather, the signature would be bool tryParse(String input, out Uri uri); and it will have to assign a dummy Uri to uri even when it doesn't match.

An out parameter makes sense in some APIs, but usually returning a record would be just as good. The requirement that it's assigned before the call exits means it's not viable for async functions or generators, only functions which return synchronously.

In the next Dart, I'd use patterns:

if (Uri.tryParse(source) case var uri?) {
  // Use uri
}
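
For reference, a self-contained sketch of that pattern, assuming Dart 3.0 or later. The function hostOf, the class C with its next method, and the sample inputs are hypothetical and only there to make the snippet runnable. The second half applies the same if-case form to a field type test like the one that motivated this issue:

```dart
// Null-check pattern: `var uri?` matches only when tryParse returns non-null
// and binds the result as a non-nullable local.
String hostOf(String source) {
  if (Uri.tryParse(source) case var uri?) {
    return uri.host;
  }
  return '<unparseable>';
}

class C {
  Object obj;
  C(this.obj);

  // Typed variable pattern: the field is read once and bound to `n` as an
  // int only when the runtime type matches.
  int? next() {
    if (obj case final int n) return n + 1;
    return null;
  }
}

void main() {
  print(hostOf('https://dart.dev/language')); // dart.dev
  print(hostOf('::Not valid URI::')); // <unparseable>
  print(C(1).next()); // 2
  print(C('one').next()); // null
}
```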
heralight commented 1 year ago

The C# out parameter is just a reference parameter, with the added requirement that the method does assign to the variable. They allow you to declare the variable in-place.

Which means that it wouldn't work for tryParse which won't assign if it doesn't match. Or rather, the signature would be bool tryParse(String input, out Uri uri); and it will have to assign a dummy Uri to uri even when it doesn't match.

An out parameter makes sense in some APIs, but usually returning a record would be just as good. The requirement that it's assigned before the call exits means it's not viable for async functions or generators, only functions which return synchronously.

In the next Dart, I'd use patterns:

if (Uri.tryParse(source) case var uri?) {
  // Use uri
}

@lrhn Thank you for pointing me to this pattern matching solution.