dart-lang / language

Design of the Dart language

loop expressions / collection-for alternative #4142

Open hydro63 opened 2 days ago

hydro63 commented 2 days ago

Motivation / Inspiration

This proposal is a sort of alternative / antithesis to the current collection-for. It would make collection-for more powerful and expressive, and it would address almost all the proposals asking to extend collection-for. The inspiration / motivation comes from #4139. I've also searched for similar proposals, because i was sure there had to be one already, but none was exactly to my liking. The most similar proposal i found is #1633. If there is a proposal i've overlooked, please point me to it.

Proposal

The proposal is straightforward. I propose allowing for / while to be used as expressions. The value of the loop expression would be a lazily evaluated Iterable<T> / Stream<T>. Values would be passed to the output iterable with the yield keyword. Basically, you can think of it as a modification of proposal #1633 that is, imo, easier and more compact to use.

// example from #4139
// iter -> Iterable<int> = inferred
var iter = for(final item in [1,2,3]){
  if(item < 2) continue;
  doSomething();
  doSomethingElse();
  yield item + 3;
}

// iter -> Iterable<num> = explicitly typed
var iter = for<num>(final item in [1,2,3]) yield item*2;

// stream -> Stream<int>
var stream = for(final item in [1,2,3]) async {
  await foo;
  yield item * 2; 
}

// embedded into list / list comprehension
final list = [
  firstElement,
  ... for(final item in [1,2,3]){
    if(item < 2) continue;
    doSomething();
    doSomethingElse();
    yield item + 3;
  },
  lastElement,
];

This allows the loop to be used outside collections, and makes it easier to express complex operations on a list.

collection-for and for as expressions interaction

These two constructs don't mix well: they use the same syntax, which makes it ambiguous which is which. That is a big problem, and i'm aware of it. This proposal is more of an antithesis to collection-for, and as such it is a different take on list comprehension. One way to fix the ambiguity is to treat every for as an expression and just spread the resulting iterable. This could be easily done with a migration script.

// before migration script
final list = [
  firstElement,
  for(final item in [1,2,3]) item + 3,
  lastElement,
];

// after migration script
final list = [
  firstElement,
  ... for(final item in [1,2,3]) yield item + 3,
  lastElement,
];

// possibly the yield keyword could be removed if there is only one statement
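
For comparison, the closest spelling of the "after migration" shape in today's Dart is to spread an ordinary Iterable. The sketch below is only an illustration under that assumption: the function names are hypothetical, and Iterable.map stands in for the proposed for-expression. Both forms build the same list.

```dart
// Today's collection-for, as in the "before migration" example.
List<int> withCollectionFor(int first, int last) => [
      first,
      for (final item in [1, 2, 3]) item + 3,
      last,
    ];

// Spreading an Iterable, which is what the migrated form would do.
// Iterable.map is used here because for-expressions do not exist yet.
List<int> withSpread(int first, int last) => [
      first,
      ...[1, 2, 3].map((item) => item + 3),
      last,
    ];

void main() {
  print(withCollectionFor(0, 9)); // [0, 4, 5, 6, 9]
  print(withSpread(0, 9)); // [0, 4, 5, 6, 9]
}
```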

Other

I personally think that for as an expression is a much better way to do list comprehension than the python-style collection-for we have today. My main argument is that for-expressions allow a lot more flexibility than collection-for without losing any of its usefulness (they can also be easily embedded into collections by spreading).

tatumizer commented 2 days ago

How do you differentiate between a for-expression and a for-statement inside a sync* function? I suggest that you have to add some marker for this. For uniformity, it could look like this:

var x = sync for(int i=0; i<10; i++) yield i*2;
var y = async for(int i=0; i<10; i++) yield (await foo())*2;

hydro63 commented 2 days ago

@tatumizer I don't think for-expressions and for-statements are ambiguous. A for would be parsed as a for-expression everywhere an expression is expected, similarly to a switch-expression. The yield in a for-expression would be scoped only to that for-expression and would not yield a value from the parent generator.

// (0, 1, 2, 3, 4, 2, 550)
Iterable<int> gen() sync* {
  for(int i=0; i<5; i++){
    yield i;
  }

  yield 2;
  yield 550;
}

// the for-expression yield doesn't yield to the enclosing gen function, only to the resulting for iterable
// (2, 550, 0)
Iterable<int> gen() sync* {
  var iter = for(int i=0; i<5; i++){
    yield i;
  };

  yield 2;
  yield 550;
  yield iter.first;
}
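
The scoping described above can be checked in current Dart by using an immediately invoked sync* closure as a stand-in for the proposed for-expression. This is only a sketch under that assumption, not the proposed syntax: the inner yield feeds iter, while the outer yields feed gen.

```dart
Iterable<int> gen() sync* {
  // Stand-in for the proposed for-expression: an immediately invoked
  // sync* closure. Its yields go to `iter`, not to `gen`.
  final iter = (() sync* {
    for (var i = 0; i < 5; i++) {
      yield i;
    }
  })();

  yield 2;
  yield 550;
  yield iter.first; // runs the inner generator up to its first yield
}

void main() {
  print(gen().toList()); // [2, 550, 0]
}
```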

If you are talking about making it more readable and accessible to users, it is also possible to add a new keyword, pass, that would replace the proposed yield in the for-expression. I think it is also a viable approach, since it would work nicely with block expressions, where we would be able to return a value from a block by passing it.

// pass keyword with for-expression
Iterable<int> gen() sync* {
  var iter = for(int i=0; i<5; i++){
    pass i;
  };

  yield 2;
  yield 550;
}

// pass keyword with block expression
print({
  var a = 5;
  a *= 10;
  pass a;
});

// pass keyword with multistatement switch expression
var value = switch(...){
  // other patterns
  ...

  int age => {
    if(age > 60) pass "Too old";
    if(age < 18) pass  "Too young";
    pass "Good age";
  }
  _ => "Undefined"
}

It's also possible that there is some syntax problem i'm not aware of, and if that's the case, please provide example so i can understand it.

mmcdon20 commented 2 days ago

Technically there is already a way to generate an Iterable or Stream in expression form. You just have to wrap your for loop in an anonymous generator function, and then immediately invoke the function.

So this example

var iter = for(final item in [1,2,3]){
  if(item < 2) continue;
  doSomething();
  doSomethingElse();
  yield item + 3;
}

can be written in current dart as

var iter = () sync* {
  for (final item in [1, 2, 3]) {
    if (item < 2) continue;
    doSomething();
    doSomethingElse();
    yield item + 3;
  }
}();
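
The generator workaround is also lazy, which matches the evaluation model the proposal asks for. A minimal sketch, assuming a hypothetical top-level log list and function name that exist only to observe when the loop body runs:

```dart
// Hypothetical log, used only to observe when the loop body executes.
final log = <String>[];

Iterable<int> doubledLazily(List<int> items) => (() sync* {
      for (final item in items) {
        log.add('visiting $item');
        yield item * 2;
      }
    })();

void main() {
  final iter = doubledLazily([1, 2, 3]);
  print(log); // [] (the loop body has not run yet)
  print(iter.first); // 2
  print(log); // [visiting 1]
}
```

Only the first element is computed when iter.first is read; the remaining iterations never run unless something asks for them.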

hydro63 commented 2 days ago

I'm aware that i can do that (it was mentioned in #1633, if i remember correctly), but i don't really like the syntax. It's sort of unintuitive to read, and also clunky (you have to wrap it inside a generator, and call it).

I understand that it's more or less just sugar, and that it isn't strictly needed since it doesn't provide any new functionality, but i'd argue it does. Currently, i'd say the reason why people never use the generator form, despite its benefits over collection-for, is that it is really clunky. I think that if this were implemented, it would be used at least somewhat frequently.

Still, that's just my biased opinion, but imo the existence of the generator way does not meaningfully justify disregarding this proposal.

tatumizer commented 2 days ago

@hydro63: interpretation of expression should not depend on whether its value is assigned to some variable or not. E.g. those are both valid expressions:

var x = a > b ? a : b;
a > b ? a: b;

Dart's "switch expression" is not a typical "expression" - it's basically a hack; the support of two totally different syntactic forms of switch is one of the most unfortunate parts of the language IMO.

hydro63 commented 1 day ago

@tatumizer that's why i've provided another way, where the yield keyword would be replaced by pass keyword so that it wouldn't cause any problems

I know that making a new keyword is a last resort, but that's why i've provided other uses for the new keyword, where it could be used in other proposals. Also, i've found another proposal where it could be used to make a feature less confusing - #4141 with a return pattern.

var x = foo case Foo(:String prop && pass prop) ||(String() && pass);

Still, i know that's not perfect, but since this feature is more or less syntactic sugar, i don't want to compromise on the ease of writing it, if possible. The sync and async keywords bloat the syntax, so i'm not a fan of them.

lrhn commented 1 day ago

This is different from Iterable literals, and probably more viable. Iterable literals suffer from being code that is implicitly asynchronous: it needs to be able to suspend at each yield, but it looks like normal code. (So does sync*, but it's delimited by a function body. Iterable expressions are in the middle of other expressions. They need serious syntactic help to be viable, to the point where (() sync* {...}()) doesn't feel like that bad an alternative.)

This proposal would run synchronously and emit all the elements while building the collection literal. But it's really just a way to embed the imperative way of building a collection in the middle of the declarative element notation. The "statements + yield" could be written as actual statements + collection.add. The only thing you gain by doing it inline is that you can write more normal elements afterwards and that you can write statements inside expressions.
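
To make the comparison concrete, here is a sketch of both styles in current Dart. The function names are hypothetical, and the inline version uses an immediately invoked sync* closure in place of the proposed syntax. Both produce the same list; the inline form only changes where the statements are written.

```dart
// Imperative: actual statements plus collection.add.
List<int> buildImperatively(int first, int last) {
  final list = <int>[first];
  for (final item in [1, 2, 3]) {
    if (item < 2) continue;
    list.add(item + 3);
  }
  list.add(last);
  return list;
}

// Inline: the same statements, with yield, embedded between normal elements.
List<int> buildInline(int first, int last) => [
      first,
      ...(() sync* {
        for (final item in [1, 2, 3]) {
          if (item < 2) continue;
          yield item + 3;
        }
      })(),
      last,
    ];

void main() {
  print(buildImperatively(0, 9)); // [0, 5, 6, 9]
  print(buildInline(0, 9)); // [0, 5, 6, 9]
}
```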

The latter is something you can't do today. It would effectively allow

([sync* {
  // any statements whatsoever, with no yield
  yield value;
}].last)

to execute any statements (except a yield for an outer function) inside an expression, and evaluate to a value.

I think we might want to consider the consequences of that. (What if it does return, break or continue, assuming we haven't already made those expressions?)

So let's look at it from that perspective instead: if we have a functionality to allow statements inside an expression which evaluates to a single value, should that be extended to allow statements inside an element that emits zero or more elements? I wouldn't do only the latter, and the two functionalities should be designed together.

hydro63 commented 1 day ago

> This proposal would run synchronously and emit all the elements while building the collection literal

That is possible, and probably the best way to do it, since i suppose it would see most use with flutter.

> (What if it does return, break or continue, assuming we haven't already made those expressions?)

In my mind, continue and break work the same as with any loop, and would not emit value. continue would just continue with next iteration, and the break would break out of the loop and end evaluating the iterable. return could either return from the parent function, or be disallowed. I personally am for disallowing return in the iterable literal altogether.
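
The continue and break behaviour described here matches what a sync* closure already does today, so it can be sketched without the proposed syntax. The function name below is hypothetical.

```dart
Iterable<int> smallEvens(List<int> input) => (() sync* {
      for (final n in input) {
        if (n.isOdd) continue; // skip: nothing is emitted for this item
        if (n > 6) break; // stop: the iterable ends here
        yield n;
      }
    })();

void main() {
  print(smallEvens([1, 2, 3, 4, 8, 6]).toList()); // [2, 4]
}
```

The trailing 6 is never visited because the break at 8 ends the iterable.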

> I wouldn't do only the latter, and the two functionalities should be designed together.

You are right that statements in expressions should be designed together with this, if it were to be implemented, but AFAIK that is the block expression proposal.

Also, i'm starting to realise the limitations of the pure for syntax that i proposed, and i'm starting to warm up to the sync* for syntax @tatumizer proposed.

tatumizer commented 1 day ago

> What if it does return, break or continue...

An interesting point about return. Inside a sync* function, return means "no more values". If we re-make some sync* function into a sync* for block, as proposed, return should preserve that meaning. This precludes interpreting "return" as "return from the containing function". This, in turn, means that in a hypothetical block-expression (discussed elsewhere) we must also preserve that meaning.

f() {
  var a = do {
     //...
     return 0; // return from the do-block, not from f
  }
}
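
The existing meaning of return inside sync* can be seen in current Dart with a generator closure. This is only a sketch of today's behaviour, not of the hypothetical do-block: the return ends the inner generator and the enclosing function keeps running.

```dart
String f() {
  final values = (() sync* {
    for (final n in [1, 2, 3]) {
      if (n == 2) return; // "no more values": ends the generator only
      yield n;
    }
  })();

  // f continues after the generator body has returned.
  return 'f saw ${values.toList()}';
}

void main() {
  print(f()); // f saw [1]
}
```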

As you remember, the controversy about the treatment of return has been a major stumbling block in the discussion of block expressions (pun not intended).

@lrhn: could you warm up to the idea of return by default meaning "return from the block", now that we have a strong argument for it? There's an option of using another form of "labelled return" like return f:0, break myLabel etc., but inside block-expressions of any kind (do, for, whatever), the label should be mandatory whenever we refer to the code outside the current block.

(Expression block can take several forms: do {...}, sync* for, async for, do for-else, do async for-else etc).