Abbreviated function literals

eernstg commented 5 years ago

This is a proposal to add support for abbreviated function literals to Dart.

It is based on discussions about anonymous methods, and may be used in combination with a pipe operator to achieve expressions of a form similar to an anonymous method. A similar mechanism exists in Kotlin.

[Edit April 24, 2019: Spelled out the design choice of naming the implicit parameter this. May 13th: Removed parts about the name this and introduced use of context type; see this comment for more details.]

Overview

Function literals are used frequently in Dart, and many of them declare exactly zero or one required parameter, and no optional parameters. The declaration of such a parameter can be omitted if a standard name is chosen for it, and if a suitable parameter type can be obtained from type inference.

Syntactically, a block function literal with no parameters already has the form () { ... }, so we cannot just omit the parameter declaration in order to obtain an abbreviated declaration. A plain <block> could be used (that is { ... } containing statements). This creates some parsing ambiguities, but they are not hard to resolve (as specified below).

It is already a compile-time error to have an expression statement starting with {, so there is no ambiguity for a block in a sequence of statements: That is just a regular block, that is, some more statements in a nested static scope.

For function literals of the form (someName) => e, we can use the abbreviation => e, provided that someName is the name which is used for a parameter which is declared implicitly. In this case there are no syntactic conflicts. Similarly, we can abbreviate () => e to => e. With both abbreviations we need to disambiguate the form with zero parameters and the one with one parameter, but we can do this in a way which is already used in many situations in Dart: Based on the context type.

Following Kotlin and using it as the name of implicitly declared parameters, here are some examples:

main() {
  var xs = [1, 2, 3];

  // Current form.
  xs.forEach((x) { print(x); });
  // Proposed abbreviation.
  xs.forEach({ print(it); });
  xs.forEach(=> print(it));

  // Current form.
  var callbacks1 = [() { doA(); }, () => doB()];
  // Proposed abbreviation.
  var callbacks2 = [{ doA(); }, => doB()];
}

Asynchronous and generator variants can be expressed by adding the relevant keyword in front, e.g., [2, 3].map(sync* { yield it; yield it; }), which will evaluate to an Iterable<Iterable<int>> that would print as ((2, 2), (3, 3)).
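
For comparison, here is a sketch of how that generator example can be written in current Dart, with the parameter the proposal would supply implicitly spelled out as (it). The helper name duplicateEach is hypothetical and only exists to make the snippet self-contained.

```dart
// Current-Dart equivalent of `[2, 3].map(sync* { yield it; yield it; })`.
// `duplicateEach` is a hypothetical helper name, not part of the proposal.
Iterable<Iterable<int>> duplicateEach(List<int> xs) => xs.map((it) sync* {
      // Each element is yielded twice by its own lazily evaluated iterable.
      yield it;
      yield it;
    });

void main() {
  print(duplicateEach([2, 3])); // ((2, 2), (3, 3))
}
```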

Syntax

The grammar is adjusted as follows in order to support abbreviated function literals:

<functionExpression> ::=
    <formalParameterPart>? <functionExpressionBody>

<functionExpressionWithoutCascade> ::=
    <formalParameterPart>? <functionExpressionWithoutCascadeBody>

<functionPrimary> ::=
    <formalParameterPart> <functionPrimaryBody>
  | ('sync' '*' | 'async' '*')? <nonEmptyBlock>

<nonEmptyBlock> ::=
    '{' <statement> <statements> '}'

We insist that a block which is used to specify a function literal cannot be empty. This resolves the ambiguity with set and map literals.

Static Analysis and Dynamic Semantics

Let e be an expression of the form <nonEmptyBlock> that occurs such that it has a context type of the form T Function(S) for some types T and S; e is then treated as (it) e. Similarly, a term B of the form <functionExpressionBody> or <functionExpressionWithoutCascadeBody> with such a context type is treated as (it) B.

Let e be an expression of the form <nonEmptyBlock> that occurs such that it does not have a context type of the form T Function(S) for any types T and S; e is then treated as () e. Similarly, a term B of the form <functionExpressionBody> or <functionExpressionWithoutCascadeBody> with such a context type is treated as () B.

This determines the static analysis and type inference, as well as the dynamic semantics.
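
As a rough illustration of the two rules, the following current-Dart sketch shows the explicit literals that an abbreviated block would be treated as under each kind of context type. The helpers applyToOne and callThunk are assumptions made up for this example; only their parameter types matter.

```dart
// Hypothetical helpers: each one provides a different context type.
String applyToOne(String Function(int) f) => f(1);
String callThunk(String Function() f) => f();

void main() {
  // Context type `String Function(int)`: an abbreviated block such as
  // `{ return 'got $it'; }` would be treated as the literal below.
  print(applyToOne((it) {
    return 'got $it';
  })); // got 1

  // Context type `String Function()`: an abbreviated block would be
  // treated as a zero-parameter literal.
  print(callThunk(() {
    return 'no argument';
  })); // no argument
}
```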

Discussion

This is a tiny piece of syntactic sugar, but it might be justified by (1) the expected widespread usage, and (2) the standardization effect (which allows developers to read a function at a glance because the chosen parameter name is immediately recognized).

An argument against having this abbreviation is that it creates yet another meaning for an already very frequently seen construct, the { ... } block. The usage of => e might be less confusing in this respect, because the token => already serves to indicate that "this is a function literal".

The proposal makes {...} mean () {...} in the case when there is no context type or only a loose one like dynamic, and it only means (it) {...} when the context type is a function type with one positional parameter. We could easily have chosen the opposite, but the given choice is motivated by the typing properties:

If we make the opposite choice (such that {...} "by default" means (it) {...}), and the context type is Function or any top type (e.g., dynamic) then the parameter it will get the type dynamic, and this is likely to introduce a large amount of dynamic typing in the body, silently.
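
The concern can be reproduced in current Dart with an explicit parameter. In this sketch the context type is Function, so the parameter it is inferred as dynamic and member accesses on it are only checked at run time. The helper failsDynamically is a hypothetical name introduced for the example.

```dart
// Returns true if calling [f] with [arg] fails with a NoSuchMethodError.
// `failsDynamically` is a hypothetical helper, not part of the proposal.
bool failsDynamically(Function f, Object arg) {
  try {
    f(arg); // Dynamic invocation: nothing about `f` is checked statically.
    return false;
  } on NoSuchMethodError {
    return true;
  }
}

void main() {
  // The context type `Function` gives no parameter type, so `it` is dynamic
  // and `it.length` compiles without any static check.
  Function f = (it) => it.length;
  print(f('abc')); // 3
  print(failsDynamically(f, 42)); // true: int has no `length` getter
}
```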

When the context type doesn't match any of the cases mentioned above we will have a compile-time error. In that case the error message should explicitly say something like "the value expected here cannot be an abbreviated function literal". We may be able to say that the desugaring is undefined in this case, but it seems more practical to decide that {...} is desugared to a specific term with a specific type, such that we can emit the normal "isn't assignable to" error message as well.

We could support some other argument list shapes. For example, {...} could mean ({int a, double b}) {...} when the context type is T Function({int a, double b}), and we could in general handle named parameters, plus zero or one positional parameter (named it). However, it probably wouldn't be very easy to understand such a function body, because the declarations of the named parameters are not shown anywhere locally. Hence, no such mechanisms are included in this proposal.

Another thing to keep in mind is that it might be somewhat tricky to see exactly where any given occurrence of the identifier which is the name of the implicitly declared formal parameter is declared:

import 'lib.dart'; // Assume that `lib.dart` declares something named `it`.

main() {
  print({ it }); // Set literal containing `it` declared in 'lib.dart'.
  print({ it; }); // Function literal containing a no-op evaluation of its argument.
}

We believe that this will not be a serious problem in practice, because the widespread use of the implicit parameter name it will make it obvious that it is a really bad idea to declare the same name globally.
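
The set-literal half of that example already behaves this way in current Dart. In the sketch below, a top-level variable named it stands in for the declaration that the example assumes is imported from lib.dart; that variable and the helper singleton are assumptions of this example.

```dart
// Stand-in for a declaration named `it` imported from another library.
var it = 'top-level it';

// `{ it }` with no semicolon is a set literal with one element.
Set<String> singleton() => {it};

void main() {
  print(singleton()); // {top-level it}
  print(singleton().length); // 1
}
```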

It may be considered confusing to have modifiers like async as specified. This is a completely optional part of the proposal, and we could just take it out. Developers would then have to write an explicit parameter part in the case where they want to use an asynchronous function or a generator function.

Revisiting the request in #259 and the examples (version 1, 2, and 3) in there, we could express a solution similar to version 4, #260, using abbreviated function literals and the pipe operator (#43) as follows:

// Variant of version 4, #260, using abbreviated function literals and the pipe operator.

void beginFrame(Duration timeStamp) {
  // ...
  ui.ParagraphBuilder(
    ui.ParagraphStyle(textDirection: ui.TextDirection.ltr),
  ) -> {
    it.addText('Hello, world.');
    it.build() -> {
      it.layout(ui.ParagraphConstraints(width: logicalSize.width));
      canvas.drawParagraph(it, ui.Offset(...));
    };
  };
  ui.SceneBuilder() -> {
    it.pushClipRect(physicalBounds);
    it.addPicture(ui.Offset.zero, picture);
    it.pop();
    ui.window.render(it.build());
  };
}

eernstg commented 5 years ago

@tatumizer wrote:

One unfortunate thing about dart blocks right now

Right, it is a potential source of confusion that return in a plain block returns from the enclosing function, and return in a function literal is a return from that nested function. But I would actually expect those plain blocks to be rather rare.

There is no clash in the technical sense, because it is well-defined when { <statements> } is a plain static block, and when it is a function literal: The former can only occur as a statement, and the latter can never occur as a statement, only as an expression (and it cannot occur as an <expressionStatement> because they can't start with {).
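
The difference in the meaning of return can be seen in current Dart. This is a small sketch with hypothetical function names: a return inside a plain block leaves the enclosing function, while a return inside a function literal only leaves that literal.

```dart
String plainBlock() {
  {
    // A plain block is just a nested scope: this returns from plainBlock.
    return 'block';
  }
}

String functionLiteral() {
  var inner = () {
    // This returns from the function literal only.
    return 'inner';
  };
  inner(); // The result is discarded and control continues below.
  return 'outer';
}

void main() {
  print(plainBlock()); // block
  print(functionLiteral()); // outer
}
```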

eernstg commented 5 years ago

Negotiating with @lrhn is an ongoing effort. ;-)

eernstg commented 5 years ago

After a lot of discussions about the use of the name this for the implicit parameter, I've concluded that it is likely to reduce the comprehensibility and readability of the code, so I've removed the parts of the proposal where this idea was discussed.

The most obvious choice for the name of the implicit parameter is then it, so that's what the proposal uses now.

On the other hand, @lrhn suggested that we could use the context type to offer two variants: When the context type is a function that takes no arguments a <nonEmptyBlock> b could desugar to () b, and otherwise it desugars to (it) b. I've included that idea as well; it seems to be pretty well aligned with other mechanisms in Dart, and useful.

lrhn commented 3 years ago

Here's a new idea: Any undeclared identifier of the regexp-form \$+\d+ is implicitly a parameter of a surrounding no-params-specified function. I still want you to write => to specify that it's a function (and where it's delimited) because Dart syntax is very hard to parse without guides. If you nest functions without parameters, you need to write $$ and $$$ as prefix, depending on the nesting.

Then list.sort(=>$1-$2) would be a two-argument function because it contains $1 and $2 which are not otherwise declared, and list.fold(=>$2) is ($1, $2) => $2. (We can also allow inference to add more parameters than you specify, if it's needed by the context, so list.fold(=>$1) is ($1, _) => $1.)

For nested functions, you need to add extra $s: =>=>$1+$$1 is ($1) => ($$1) => $1 + $$1.

We can allow you to omit the digit if there is only one parameter, so the above can be =>=>$ + $$. We can allow you to write $foo to designate a named parameter named foo, so =>$foo is ({foo}) => foo. (We remove the $s from the name when you call the function, but internally we ensure disambiguation, so =>=>$foo+$$foo is ({foo}) => let $foo = foo in ({foo}) => let $$foo = foo in $foo + $$foo.)

This makes simple things easy: =>print($), =>$1-$2, =>"${$}" (OK, that one sucks a little). It makes more complicated things possible: =>42, listOfLists.map(=>$.map(=>$$.toString()).join("")).join("").
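
Since $ is already a legal identifier character in Dart, the desugared forms can be written out today with explicit parameter lists. The following sketch does that for a few of the examples; the names sortedAscending, adder and flatten are hypothetical, and the explicit parameter lists are exactly what the idea would let you leave out.

```dart
// `list.sort(=>$1-$2)` with the parameter list spelled out.
List<int> sortedAscending(List<int> xs) =>
    [...xs]..sort(($1, $2) => $1 - $2);

// `=>=>$1+$$1` spelled out: the nested function uses the `$$` prefix.
int Function(int) Function(int) adder = ($1) => ($$1) => $1 + $$1;

// `listOfLists.map(=>$.map(=>$$.toString()).join("")).join("")` spelled out.
String flatten(List<List<Object>> listOfLists) =>
    listOfLists.map(($) => $.map(($$) => $$.toString()).join('')).join('');

void main() {
  print(sortedAscending([3, 1, 2])); // [1, 2, 3]
  print(adder(2)(3)); // 5
  print(flatten([
    [1, 2],
    [3]
  ])); // 123
}
```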

We can then still consider allowing trailing function literals: When a function invocation's last argument is a function literal, foo(a1, a2, (...) { ... }), you can write it as foo(a1, a2) { ... } instead. Maybe even allow you to drop the trailing semicolon if the expression has type void (or require it to have type void, since it looks like a loop construct which doesn't have a value - yet).

Then you can do:

  list.forEach() {
    print($1);
  }

eernstg commented 3 years ago

I like the $1 .. $k and $named idea! It allows us to rely on a larger set of context types, and it reads quite reasonably.

For the trailing function literals, we still have the issue that foo(a1, a2) { ... } is a local function declaration, and we would presumably have to parse some tokens following } in order to see whether it is actually a local function or it is an expression, e.g., foo(a1, a2) {...}; or foo(a1, a2) {...} - 1.

ykmnkmi commented 3 years ago

@eernstg what about .1 .. .2 and .named? Of course, this may conflict with enums in the future.

eernstg commented 3 years ago

Right, and it may also conflict with another kind of concise function literal: myList.forEach(.toString()) meaning myList.forEach((x) => x.toString()). (That idea might not work at all, nobody has spelled out the details as far as I know, but it's still worth keeping in mind that every new kind of syntax precludes some other syntactic extensions that would look the same).

lrhn commented 3 years ago

I don't think I'll ever accept a syntax for functions which does not have a delimiter. The myList.forEach(.toString()) is too ambiguous. Take myListOfLists.forEach(.map(.toString())), how will that be parsed - and why? So, not particularly worried about breaking that.

It's true that simply doing foo(a, b) { body; } is ambiguous grammar, and I definitely don't want a trailing semicolon to be the decider, that's just unreadable and easy to get wrong. We might need more punctuation. ("Punctuation, is? fun!")

As for .1, .2 ... those are currently valid double literals, and even if not, the syntax makes me think of member access, and there is no object to access (except some implicit "parameter object").
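
A quick check in current Dart that a leading-dot number is already a double literal, which is why .1 and .2 are not free to be reused as parameter names. The helper half is a hypothetical name.

```dart
// `.5` is an ordinary double literal in current Dart.
double half() => .5;

void main() {
  print(half() is double); // true
  print(half() + .5 == 1.0); // true
}
```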

Levi-Lesches commented 3 years ago

=>=>$1+$$1

=>=>$foo+$$foo

=>print($)

=>"${$}"

listOfLists.map(=>$.map(=>$$.toString()).join("")).join("")

Just to keep perspective here, this can get out of hand very quickly. Closures themselves are syntax sugar for regular functions, and now we're into sugar-for-sugar territory. I think as closures get more complicated (and nested), we should encourage devs to break them out into simple, separate functions.

Here's an example where the readability is about the same, but closures make it very clear what's happening:

void printNested(List<List> list) {
  list.forEach(=>=>print($$));
  // vs
  list.forEach(=>$.forEach(print));
  // vs
  list.forEach( (List sublist) => sublist.forEach(print) );
}

void main() {
  List<List> list = [[1, 2, 3], [4, 5, 6]];
  printNested(list);  // each number on its own line
}

And here's an example of a separate function being more readable than all the alternatives:

String groupNested1(List<List> list) => list
  .map(=>$.map(=>$$.toString()).join(""))
  .join(", ");

// vs

String groupNested2(List<List> list) => list.map(
  (List sublist) => sublist.map( 
    (value) => value.toString()
  ).join("")
).join(", ");

// vs

String groupNested3(List<List> list) => [
  for (final List sublist in list) [
    for (final int value in sublist)
      value.toString()
  ].join("")
].join(", ");

// vs

String toString(Object obj) => obj.toString();
String groupList(List list) => list.map(toString).join("");
String groupNested4(List<List> list) => list.map(groupList).join(", ");

// --------

void main() {
  var list = [[1, 2, 3], [4, 5, 6]];
  print(groupNested4(list));  // 123, 456
}

gosoccerboy5 commented 2 years ago

Is removing the calling parentheses when passing an abbreviated function literal being considered?

myList.forEach { it.foo(); };

lrhn commented 2 years ago

@gosoccerboy5 It's a feature of other languages that we are aware of. Whether it's a good fit for Dart depends on whether we can fit it into the existing syntax. If we can, that would be great.

eernstg commented 2 years ago

I think I'll give a tiny bit more context. This is about syntax like myWhile (someCondition) {/*myLambda, playing the role of a while body*/}, and all the other variants where the point is that we allow the last parameter to be passed outside the (...) (and we may allow the () to be omitted entirely if there is only that last parameter), when the last parameter is of the form {...}.

The problem with that syntax in Dart is that it does not use a consistent syntactic pattern where declarations start with a reserved word. (For instance, Scala does that, so they use var and val to introduce mutable/immutable variables, and it's always easy for the parser to recognize a variable declaration from the very first token).

So if we see myWhile(b) { doSomething(); } then it could be an invocation of a function named myWhile with the arguments b and { doSomething(); }, and it could also be a declaration of a local function (whose parameter doesn't have a declared type).

This means that it isn't something that just works.
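
To make the ambiguity concrete, here is a sketch of how current Dart reads that exact shape inside a function body: as a local function declaration, not as a call. The function localDeclarationDemo and the counter are hypothetical and only serve to observe what happens.

```dart
// Returns the call counts observed before and after an explicit call.
List<int> localDeclarationDemo() {
  var calls = 0;

  // Current Dart parses this as a local function named `myWhile` with an
  // untyped parameter `b`. Nothing is invoked here.
  myWhile(b) {
    calls++;
  }

  final before = calls; // Still 0: the line above was only a declaration.
  myWhile(true); // This is the actual invocation.
  return [before, calls];
}

void main() {
  print(localDeclarationDemo()); // [0, 1]
}
```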

gosoccerboy5 commented 2 years ago

.... This means that it isn't something that just works.

Oh.

Wdestroier commented 1 year ago

Could `expr` (an expression inside backticks) be a syntax sugar for (it) => expr? Example:

xs.forEach(`print(it)`);

eernstg commented 1 year ago

Could `expr` [mean] (it) => expr?

That could probably work (backticks are currently not used at all in Dart, so they are very easy for a parser to recognize). They wouldn't support nesting, but abbreviated function literals inside abbreviated function literals aren't going to be very useful anyway.

Backticks could be somewhat more concise than braces, because they wouldn't "naturally" imply spaces. Also, the <nonEmptyBlock> that I proposed would contain a list of statements, whereas the backtick form would contain an expression, and that would allow us to avoid return in a number of situations. Finally, we should compare these forms with the one that uses a plain =>:

xs.forEach(`print(it)`);
xs.forEach({ print(it); });
xs.forEach(=> print(it));

xs.map(`2 * it`);
xs.map({ return 2 * it; });
xs.map(=> 2 * it);

lrhn commented 1 year ago

I think nested abbreviated function literals can be useful, like listOfLists.forEach(=>it.forEach(=>print("[$it]"))).

(I'm leaning more towards the parameters of the function being derived from the context type, types, positions and names, and treating the parameter list like an implicit record type, so void foo(int Function(int x, {required bool increment}) act) ... would allow foo(=>$1 + (increment ? 1 : -1)). Then we can special case the one-positional-argument case and call it it. And positional record fields should start at $1 for this.)
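
As a sketch of what that would stand for, here is the explicit current-Dart function that =>$1 + (increment ? 1 : -1) would correspond to under that context type. The names applyAct and step are hypothetical, and applyAct returns a value (unlike the void foo above) so that the result can be observed.

```dart
// Hypothetical higher-order function with the context type from the comment.
int applyAct(int Function(int x, {required bool increment}) act, bool up) =>
    act(10, increment: up);

// Explicit form of `=>$1 + (increment ? 1 : -1)`: the positional parameter is
// named `$1` and the named parameter keeps the name from the context type.
int step(int $1, {required bool increment}) => $1 + (increment ? 1 : -1);

void main() {
  print(applyAct(step, true)); // 11
  print(applyAct(step, false)); // 9
}
```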
