dart-lang / language

Design of the Dart language
Other
2.65k stars 202 forks source link

Allow Multi-Statement Bodies in Switch Expressions #3117

Open caseycrogers opened 1 year ago

caseycrogers commented 1 year ago

This is a spinoff issue from the following: https://github.com/dart-lang/language/issues/3061

I propose we create a new syntax to allow multi-statement bodies inside of switch expressions.

Why?

  1. Having a one statement expression often gets very hard to read. Especially when doing complicated and/or async processing:
    return switch (someInt) {
    0 => 'No message',
    // Could use `then`, also poor readability.
    // Gets much worse as more layers of processing are needed.
    int i => (await asyncGetRecord(i)).$2,
    };
  2. Having independent syntaxes for switch statements and switch expressions is inconvenient, bringing the switch expression into closer feature parity with the switch statement would allow it to be used in many places where only a switch statement is usable/practical today. Right now, every time I have to write a switch, I have to think carefully about which syntax meets the current scenario:
    • Do I want it to evaluate? Then expression syntax.
    • Oh wait, I need multi-statement bodies, better use statement syntax with return inside a helper function ^this is a really annoying refactor These tradeoffs are pretty non-obvious and especially confusing given that both types of switches use the same keyword which implies feature parity. If the expression syntax more or less has feature parity, then it becomes the safe default (and probably rapidly just becomes the approach developers use in all scenarios in practice).
      // This is a switch statement that could be easily converted to a switch
      // expression if multi statement bodies were supported.
      switch (someInt) {
      case 0:
      someSideEffect();
      someOtherSideEffect();
      default:
      defaultSideEffect();
      }
  3. Allowing a switch expression that also side effects. This one is pretty weak as many will (reasonably) consider this an anti pattern, but I figured I'd include it for completeness. In the simplest case, this could just be a print statement for debugging.
  4. This one is pretty controversial, but one I believe strongly in it and I think it's worth mentioning. Though I believe this proposal is valuable whether or not you believe in this point. With multi-statement expression bodies, we've largely eliminated the need for the statement syntax and it could be a lint against it for those with strong opinions or, if the community really takes to using the expression everywhere, we could consider deprecating the statement syntax to reduce language complexity.

Syntax

There are a lot of different ways to do this-I don't have the strongest opinions. Here are some options I've come up with and have seen suggested by others:

  1. Curly braces that evaluate to the last line in the block-as in expression based languages.
    switch (someInt) {
    0 {
    var record = await asyncGetRecord();
    record.$1;
    },
    _ => 'some string',
    }

    This has minimal verbosity and maps pretty well to single statement vs multi-statement anonymous functions. However, it's pretty implicit and may take a fair amount of adjustment for developers to get use to it. 1a. Same, but instead of implicitly evaluating to the last line, you use return. This has the most parity with functions which will make it more intuitive, but it shadows the function level return so you can't short circuit out of the switch-not great. 1b. Same, but yield instead of return. Yield is a bit confusing as it's pretty associated with generators. Also this will shadow the function level yield if you are using a switch in a generator. 2a-b. Same as 1a-b, but arrow and then curly braces _ => { ... } I don't love that this conflicts so heavily with anonymous function syntax-every time the developer goes to write a multi statement switch body they're going to forget to write the arrow because they have muscle memory from writing functions. But it does make it very clear that the following block evaluates.

  2. TBD? I can edit this original post to add more proposals if they come in.

All of the above have downsides, but I guess I prefer 1?

One advantage of all the above options is that you can use empty curly braces to signify the empty case. Using 1 as an example:

switch (someInt) {
  0 {
    var record = await asyncGetRecord();
    record.$1;
  },
  // Evaluates to void. Gives you the ability to explicitly opt out
  // of exhaustivity.
  _ {},
}

There's some debate already on how to handle switch elements (eg a switch statement inside a list literal). These do not need to be exhaustive but we now have a conundrum:

  1. Switch statements need not be exhaustive, but they don't evaluate to anything so we can't use them
  2. Switch expressions evaluate, but need to be exhaustive so we could only use them if we had special case non-exhaustivity I don't know if I love the following as it's awkward/verbose, but allowing {} for the empty case squares this circle:
    var myList = [
    // This is exhaustive, but only cases that evaluate to a non-void
    // value are included in the list.
    switch (someInt) {
    // Not included.
    0 {},
    // Same.
    1 => print('foo'),
    // Included. Note could also be an iterable preceeded by the spread operator.
    var i => MyObject(i),
    }
    ];

Cons

  1. Things that evaluate should not have side effects. The restrictive nature of the expression syntax (namely single statement bodies) makes it much harder to shoot yourself in the foot with side effects.

While this is a reasonable and common philosophy, I think it'd be uncharacteristically opinionated for Dart to enforce it here, especially given that even pure switch bodies without side effects stand to benefit from multi statement bodies (see Why #1).

lrhn commented 1 year ago

This is basically a request for "expressions statements" #132, which allows statements inside an expression, which still evaluates to a value. There is no reason to limit that to switch expression case bodies, it's useful in other places too.

I think the closest existing issue is https://github.com/dart-lang/language/issues/132. (There's a lot of similar requests, like https://github.com/dart-lang/language/issues/1211, https://github.com/dart-lang/language/issues/2025, https://github.com/dart-lang/sdk/issues/55).

I'm going to close this as a duplicate of #132, because I think it's more likely that we'll get the general feature, than a limited feature only for switch expressions. I do encourage following up on #132 with the insights from here.

munificent commented 1 year ago

This is basically a request for "expressions statements" #132, which allows statements inside an expression, which still evaluates to a value.

Note that this isn't entirely a request for block expressions. In the suggested syntax:

switch (someInt) {
  0 {
    var record = await asyncGetRecord();
    record.$1;
  },
  // Evaluates to void. Gives you the ability to explicitly opt out
  // of exhaustivity.
  _ {},
}

Note how there's no => before the {. That means this isn't just a regular switch expression whose case expressions happen to be block expressions.

I'm going to re-open this because I could potentially see us allow block bodied switch expression/statement things without going all the way to general block expressions.

caseycrogers commented 1 year ago

I definitely prefer the syntax here that mirrors functions: => for single statement, { ... } with no arrow is for multi statement. But I'll take what I can get so I'd take either!

If I were Dart czar and could make whatever I wanted happen WRT this constellation of interrelated ideas, this is what feels cleanest to me:

  1. if..else statements are made expressions that evaluate to the last line in the matched block (many of the "do" examples in #132 are examples that would be met by this)
  2. Switch expressions allow multi statement bodies using the curly brace syntax outlined here.
  3. do expressions (regardless of exact syntax) are not supported

My reasoning is as follows: 1 and 2 seem to cover most of the places where one would want to have a multi statement expression but can't today rendering do expressions low value and more added language complexity than they're worth. Though others more accustomed to do expressions may feel differently-I'd love to see more examples (though probably in #132 so as not to get too off topic here).

FWIW, I've started using IIFE's in my switch expressions. They feel a bit hacky/gross and they shadow return, continue etc, but I'm finding them to work pretty well so I'd be satisfied enough with an approach where IIFEs just become increasingly seen as idiomatic Dart (maybe even with IDE shortcuts) and the correct solution in this situation. Is it how one would design Dart if it were being done from scratch today? Probably not. Does it work without major disruptions to the language? Yeah.

nex3 commented 1 year ago
  • Oh wait, I need multi-statement bodies, better use statement syntax with return inside a helper function ^this is a really annoying refactor

This particular situation has come up a lot for me when converting Sass to Dart 3 style. I'll have a nice lovely switch expression going, and then I'll realize that one or two cases need intermediate variables or an assert() or something like that, and either have to do something kind of gross like an IIFE or refactor the whole thing to be a statement instead. This is particularly painful given that the formatter uses twice as many lines for a switch statement with terse return conditions as it does for a switch expression.

lrhn commented 1 year ago

I simply do not believe that a "sequence of statements with an expression at the end" as a single level exception inside switch expressions will be enough.

It will take approximately eight seconds from us releasing that to someone saying,

But my switch expression case's statement-expression-block ends with an if statement, where each branch does more statements and then gives different values. Can't we allow nesting these statement-expressions?

And sure, we can tell them to do:

switch (e) {
  case p {
    var result;
    stmt;
    stmt;
    if (condition) {
      stmt;
      result = e1;
    } else {
      stmt;
      result = e2;
    }
    result
 }

But they'll complain that that looks stupid and unnecessary. And they'd be right. And then they'll just use an IIFE with returns.

Or someone wants to break out early with a result, instead doing the rest of the computation if some value is null. And they'll, rightly, use an IIFE with returns too.

Statements inside expressions is a very useful feature in many different places, not just here. And I don't want to give people half of a real feature, or less, and lock us into a design that's not entirely thought through.

If we can't even reach feature parity with IIFE's, I'm not ready to call it good enough.

caseycrogers commented 1 year ago

Others in the hope for do... thread showed examples where the expression block evaluates to the last line, including potentially matched if/else cases.

eg:

switch (e) {
  // evaluates to `e1`.
  case a {
    e0;
    if (true) {
      e1;
    } else if (false) {
      e2;
    },
  // evaluates to `e2`.
  case b {
    e0;
    if (false) {
      e1;
    } else if (true) {
      e2;
    },
  // evaluates to `e0`.
  case b {
    e0;
    if (false) {
      e1;
    } else if (false) {
      e2;
    },
 };

That could be applied here for multi-statement switches. That said, I really don't love this as it's very subtle/implicit behavior.

Or someone wants to break out early with a result, instead doing the rest of the computation if some value is null. And they'll, rightly, use an IIFE with returns too.

There's an unavoidable tradeoff here. You can't let return short circuit the case without shadowing return to short circuit the function enclosing the switch. So the person requesting this could be rejected on the grounds that the tradeoff isn't worth it (which I believe even outside of my admittedly contrived hypo here).

Generally, I've come around to the idea that if Dart is going to have multi-statement bodies in switch expressions, it probably shouldn't do so only in switches as that just creates an inconsistent language. If a switch is multi statement why not an if {...} etc? My dream world is one where Dart 4.0 is expression oriented where every code block regardless of context is implicitly an expression, but I won't hold my breath on that one.

In the meantime, I'm reasonably sold on IIFE's and have been happily using them in my projects. There's nothing stopping anyone from using them today, but they feel very "icky" at first glance. This is drifting fairly far off topic so I'm happy to file a new issue about it if that makes sense, but do we want to/can we make them feel more idiomatic? Are there clear levers to pull to do so? Like linter changes that would encourage them in some contexts? Using them where applicable in prominent Dart repos (ie Flutter)? Including an example in the Dart docs for writing a multi-statement switch expression using an IIFE and recommending it over converting to a statement with return (possibly nested in a helper function)?

My one big gripe with IIFE's in this scenario is that the syntax is pretty subtle. You can't tell that a function expression is immediately invoked until you've finished reading it's body. In some contexts this could create a pretty nasty bug:

// Switch expression used as a statement.
return switch (e) {
  case a: print('foo'),
  case b: () {
    print('foobar');
    print('baz');
  }, // <--- forgot to invoke, prints are never executed
}

I guess you could say this is my fault for using a switch expression as a statement when a switch statement would work, but to that my answer is: https://github.com/dart-lang/language/issues/3061

lrhn commented 1 year ago

Here's a pitch (it's not amazing, but it's not completely stupid either):

A do expression block, which is an expression which allows statements, and then evaluates to a value.

<expression> ::= ...
   | <label>? 'do' <statementExpressionBlock>

<statementExpressionBlock> ::= '{' <statement>* <expression>? '}'

The <expression> is required if control flow can reach the end of the block and a value is actually needed.

That's combined with a "break-with-value", and making all control flow operators expressions:

<controlFlowExpression> ::=
    'return' <expression>?
  | 'continue' <identifier>?
  | 'break' <identifier>? ('(' <expression> ')')?
  | 'throw' <expression>           -- was already an expression, but still is.
  | 'rethrow'

If the break targets a do-block or another breakable expression, then the (expression) allows it to provide a value (which is needed if the expression needs a value).

Then you can use break(value) to exit the closest enclosing do { ... } expression block with a value, and break label(value) to break from a deeper nesting. It's like a local return from the do-expression. (Not great syntax. All the good characters are taken.)

The expression at the end of the do { ... } block can be omitted if:

Similarly a break of a do-block needs a value unless the do-block is in a void context.

That allows you to write:

  var lastIfAny = do { var tmp = this._tmp; if (tmp == null) break(null); while (tmp.next != null) tmp = tmp.next; tmp };

And then we can also make switch expression case bodies be:

  <switchExpressionCaseBody> ::= '=>' <expression> | <statementExpressionBlock>

and let the default target of the break-with-label inside that <statementExpressionBlock> (or <expression>) be the surrounding switch.

Alternative syntax for break-with-value could be ^value or ^label:value. Then I'd drop the final expression from the do-block and require you to write ^expression; at the end:

  var lastIfAny = do { var tmp = this._tmp; if (tmp == null) ^null; while (tmp.next != null) tmp = tmp.next; ^tmp; };
  var lastIfAny = switch (this._tmp) {
     null => null
     var cursor {
       while (cursor.next != null) cursor = cursor.next;
       ^cursor
     }
 };

At that point <statementExpressionBlock> is just <blockStatement>. But the <pattern> <blockStatement> sequence might not be that readable.

(And you still have to write ; after a }, which is going to be very, very easy to forget.

rubenferreira97 commented 1 year ago

If we rewrite the original do and switch examples as follows:

var lastIfAny = do {
    var tmp = this._tmp;
    if (tmp == null) break(null);
    while (tmp.next != null)
      tmp = tmp.next;
    tmp
  };

var lastIfAny = switch (this._tmp) {
     null => null,
     var cursor {
       while (cursor.next != null)
         cursor = cursor.next;
       cursor
     }
  };

In my opinion, it's quite readable and a neat idea. I understand that it's common for an expression-based language to evaluate the last expression as an exit/return. However, in Dart, I believe that using a reserved keyword (break, ^, or something else) would be more in harmony with the rest of the language.

I have a suggestion for discussion (which might be considered as a somewhat foolish idea): How about treating labels as names that can only be defined once? This change, although subtle, would be breaking.

outerLoop:
for (int i = 1; i <= 3; i++) {
  final outerLoop = 0; // The name 'outerLoop' is already defined.
  print('Iteration $i');
  if (i == 2) break outerLoop;
}

With this, I think we can replace break(value); with break value;, break label(value); with break label:value;, and still maintain the existing break label; syntax, thus avoiding the introduction of a new break syntax for developers (but maybe we want a new syntax for a more explicit meaning?). A downside I see is that the parser would need to distinguish between labels and values in break label|value;. Here's the revised version:

var lastIfAny = do {
    var tmp = this._tmp;
    if (tmp == null) break null;
    while (tmp.next != null)
      tmp = tmp.next;
    break tmp;
  };

var lastIfAny = switch (this._tmp) {
     null => null,
     var cursor {
       while (cursor.next != null)
         cursor = cursor.next;
       break cursor;
     }
  };

Independently of the syntax I really like @lrhn overall approach because we don't have to write do in the switch expression.

munificent commented 1 year ago

This particular situation has come up a lot for me when converting Sass to Dart 3 style. I'll have a nice lovely switch expression going, and then I'll realize that one or two cases need intermediate variables or an assert() or something like that, and either have to do something kind of gross like an IIFE or refactor the whole thing to be a statement instead.

I run into this all the time too.

This is particularly painful given that the formatter uses twice as many lines for a switch statement with terse return conditions as it does for a switch expression.

I did add support to the formatter to collapse switch cases to a single line if they could all be collapsed. I implemented that exactly because of this concern: I didn't want people to feel compelled to use switch expressions just because they're vertically shorter.

But it was a high-churn change with some performance issues. We had enough going on getting Dart 3.0 out the door that I backed out the change. I would like to revisit it again at some point.

lrhn commented 1 year ago

(Can I haz void doFoo() { singleStatementFitsOneLine; } without the newlines too, please? I really don't like using => with void, and fitting one one line is the only reason to use => for that!)

lrhn commented 11 months ago

How about treating labels as names that can only be defined once?

Yes, please. I've considered that before. It's definitely (potentially) breaking, but might be worth it.

If the labels exist in the same scope as variables, then there is no ambiguity in break foo;, we know, by identifier resolution, whether it refers to a label or a value. (Code can still be written so obscurely that it's hard to read, and a determined Fortran programmer can write Fortran in any language. As long as it's possible, preferably easy, to write readable code, that should solve itself.)

munificent commented 10 months ago

I really don't like using => with void, and fitting one one line is the only reason to use => for that!)

For what it's worth, my impression is that, whether or not we pack single statement blocks onto one line, many users just aesthetically like the way => ... looks for short void-bodied members.

lrhn commented 10 months ago

If we disallow labels in switches, maybe we could just allow you to omit case from statement switches too.

Might require some lookahead to see if something is followed by :, and is therefore a pattern, but it's precisely the kind of lookahead we're already good at (find matching closing delimiter).

switch (e) {
  int x:
  int():
  int? z : y; // damn!
}

Nope, the ?/: conditional expression gets in the way again. Let's remove that too, making it an if expression.

(Or say that an expression statement cannot start with a conditional expression or function literal. Because the latter prevents (var x, var y) => e; from being a case-less switch in a switch statement too. The more I think about this, the more I like it. Using expression forms where a statement is allowed is bad style, so you should never need to start an expression-statement with an unparenthesized function literal or conditional expression. Not even a (){body}();, which seems to be the usual IIFE pattern, because you can just do {body} as a statement.)