Support concise function literals

eernstg commented 6 years ago

Motivation for concise function literals

The syntax for a function literal includes parentheses and => or parentheses and braces, such that we may specify both the formal parameters and a function body. However, given that inference will frequently obtain type annotations for the parameters from the context, the formal parameter specification often specifies the name only. This means that we may obtain a more concise syntax for function literals if we introduce some level of support for default parameter names.

This might be very convenient, e.g., for reducing xs.map((x) => x.toString()) to xs.map(#.toString()) or even xs.map(.toString()).

Note this thread on dart-language-discuss which is one of the many locations where this discussion has occurred.

Note also that an old language team issue presented several of these ideas (here), but the repository does not currently admit public access.

Just omit the parentheses

With this, (x) => e could be abbreviated to x => e, and possibly (x) { S } to x { S }. It would only eliminate two characters, but it would generally work everywhere, and hence it might be useful to do independently of the other proposals described below.

Using `#` as a default parameter name

We could let # denote an implicitly declared formal parameter, such that (x) => x.foo(x.bar) could be abbreviated as #.foo(#.bar). In general, an expression e containing some number of occurrences of # would stand for (x) => [x/#]e where x is a fresh variable name, and [x/#] is the textual substitution operation that replaces all occurrences of # by x in its argument, here: e. (There is no need to worry about variable capture because x is fresh.)

The main issue with this approach is that it is ambiguous: #.foo(#.bar) might mean (x) => x.foo((y) => y.bar) as well as (x) => x.foo(x.bar).

We could resolve the ambiguity in several ways:

Require that the parameter is used exactly once in the body of the function; that is, we can abbreviate (x) => x.foo(42) as #.foo(42), but (x) => x.foo(x.bar) cannot be abbreviated to #.foo(#.bar), because that would instead mean (x) => x.foo((y) => y.bar).
Include the braces in the abbreviation, that is, we can abbreviate (x) { f(x); x.g(42); } as { f(#); #.g(42); }. This could create ambiguities with a block of statements if used as an expressionStatement, where { f(#); #.g(42); } would mean { (x) => f(x); (y) => y.g(42); }, but both of these are rather useless (we just create some function objects and discard them), so we may simply be able to make all such nonsense an error.
Include the arrow in the abbreviation, that is, we can abbreviate (x) => x.foo(42) as => #.foo(42) and (x) => x.foo(x.bar) to => #.foo(#.bar).
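To make the ambiguity concrete, here is a minimal sketch in today's Dart that spells out both readings of #.foo(#.bar) by hand. The class Box and the function names are hypothetical, invented only for illustration, and foo accepts Object? purely so that both readings type-check.

```dart
// Hypothetical class, used only to make the two readings observable.
class Box {
  final int bar;
  Box(this.bar);

  // Accepts any object so that both readings type-check.
  String foo(Object? argument) =>
      argument is Function ? 'got a function' : 'got the value $argument';
}

// Reading 1: both occurrences of `#` name the same parameter.
String sameParameter(Box x) => x.foo(x.bar);

// Reading 2: the inner `#` starts a new, nested function literal.
String nestedLiteral(Box x) => x.foo((Box y) => y.bar);

void main() {
  final box = Box(7);
  print(sameParameter(box)); // got the value 7
  print(nestedLiteral(box)); // got a function
}
```

Both functions are valid expansions of the same abbreviated text, yet they behave differently, which is why one of the disambiguation rules above would be needed.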

It might be possible to take this approach with various other characters in addition to #, e.g., @, %, ? were suggested in the above-mentioned dart-language-discuss discussion. Each of them would of course have different implications for the possible choices of grammar, that is, for the syntactic forms of abbreviated function literals that we can allow.

Use a designated identifier as the default parameter name

We might use a regular identifier like _ or it rather than # as the default parameter name, in which case there is no need to change the grammar. The perceived readability of the resulting code might be better or worse. It probably doesn't make much difference when it comes to the implementation effort.

But it might make the change more breaking, because there may be existing code which is then reinterpreted to have a new meaning, e.g., we already have xs.map(_.foo) somewhere, and _ is in scope such that the code works and means the same thing as xs.map((x) => _.foo(x)). In that situation, it would be highly error-prone to give it the new meaning xs.map((x) => x.foo), and it would presumably be a nightmare to try to use rules like "_.foo is desugared to (x) => x.foo if and only if _ is undefined in the current scope".
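As a rough sketch of that hazard, the following uses it as the designated identifier (one of the candidates mentioned above), since it is an ordinary identifier in today's Dart. Item, Scope, and the function names are hypothetical and exist only to make both meanings type-check.

```dart
// Hypothetical element type with a getter named `foo`.
class Item {
  final String name;
  Item(this.name);
  String get foo => 'getter on $name';
}

// Hypothetical object that happens to be stored in a variable named `it`.
class Scope {
  String foo(Item item) => 'method called with ${item.name}';
}

final it = Scope();

// Today: `it.foo` is a tear-off, equivalent to (x) => it.foo(x).
List<String> currentMeaning(List<Item> xs) => xs.map(it.foo).toList();

// With `it` as a default parameter name, the same text `xs.map(it.foo)`
// would instead be read as the closure written out here.
List<String> proposedMeaning(List<Item> xs) =>
    xs.map((x) => x.foo).toList();

void main() {
  final xs = [Item('a')];
  print(currentMeaning(xs)); // [method called with a]
  print(proposedMeaning(xs)); // [getter on a]
}
```

The source text xs.map(it.foo) would be identical in both cases, so existing code could silently change behavior.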

Otherwise, the ambiguities mentioned for # would apply in this case as well, and the fixes could essentially be reused.

Use the empty string as the default parameter name

This would allow us to abbreviate (x) => x.foo(42) to .foo(42), which might be unambiguous in the grammar, but there are only a few expressions where this would work. For instance, we cannot abbreviate (x) => o.foo(x) to o.foo() because that already means something else, which might just as well be the intended meaning.

So this approach might look very attractive with certain examples, but it is unlikely to scale up.

Multiple default parameter names

The above-mentioned dart-language-discuss thread also had several ideas about how to enable functions receiving multiple (positional and required) parameters to be abbreviated.

For instance, $1, $2, ... could be used, or $, $$, ..., or _1, _2, ..., such that (x, y) => x + y could be abbreviated as $1 + $2.

In return for restricting each parameter to occur exactly once and in order, we could also use the same symbol for all parameters, such that (x, y) => x + y could be abbreviated as, for instance, _ + _ or $ + $.

The former may look somewhat busy, and the latter is certainly rather restrictive, but these ideas can essentially be piled on top of all the previous proposals in order to let them support the multi-argument case.

Use a designated form of identifier as the default parameter names

We could say that $foo is a default parameter name just because it starts with $, and so is $bar. If we thus reserve all identifiers of a specific form as default parameter names, then we can express the situation where different occurrences are the same or not the same parameter, and we can also communicate more clearly what each parameter is intended to mean: (x, y) => x + y could be abbreviated as $x + $y, (x) => x + x could be abbreviated as $ + $, and:

... myWidgets.map((widget) => widget.getColor).firstWhere((color) => color == Blue) ...

// could be abbreviated into:
... myWidgets.map($widget.getColor).firstWhere($color == Blue) ...

// which may be easier to read than this:
... myWidgets.map(_.getColor).firstWhere(_ == Blue) ...

Claus1 commented 5 years ago

I vote for _

... myWidgets.map(_.getColor).firstWhere(_ == Blue) ...
... myWidgets.map(_.size > 5 && _.width > 2)
numbers.filter(_ > 0 && _ < 10)

It is short and clear.

lrhn commented 5 years ago

I like _ as a placeholder for a single implicit function parameter. The implicitness does mean that we have to implicitly delimit the function body. (That problem is independent of what marker we use for the variable, it comes from not having a => to define the position).

Another option for solving that (apart from the three mentioned above) might be to delimit to the nearest enclosing production of a specific grammar rule. I don't think it will work though.

Something like x.forEach(y.add(_)) contains no syntactic way to distinguish the three options for function-insertion:

x.forEach(y.add((_) => _))
x.forEach((_) => y.add(_))
(_) => x.forEach(y.add(_))

This can only really be resolved using types, and we can't infer types before knowing the expression structure.

I'd propose x => x (allow omitting parentheses for a single parameter) and => _ (no parameters part means an implicit argument name of _, distinct from () => e which is a nullary function).

Claus1 commented 5 years ago

"contains no syntactic way to distinguish the three options": why? We expect a function in forEach(). Inside a simple expression, every _ means its argument. There is no other way.

mpfaff commented 3 years ago

I'd much prefer using it as the default parameter name. Coming from Kotlin and Rust, I often use _ or _foo to denote unused parameters.

In Kotlin, function literals use the syntax { [params ->] body }, where body can span multiple lines, and params -> may be omitted if the function accepts only a single parameter. If the params are omitted, the single parameter uses the name it.

// single parameter omitted.
foo({ it * 2 })

// single parameter specified.
foo({ x -> x * 2 })

// multiple parameters specified.
foo({ x, y -> x * y })

I find this syntax very concise and easy to read.

mateusfccp commented 3 years ago

@mpfaff said:

I'd much prefer using it as the default parameter name. Coming from Kotlin and Rust, I often use _ or _foo to denote unused parameters.

In Kotlin, function literals use the syntax { [params ->] body }, where body can span multiple lines, and params -> may be omitted if the function accepts only a single parameter. If the params are omitted, the single parameter uses the name it.
// single parameter omitted.
foo({ it * 2 })

// single parameter specified.
foo({ x -> x * 2 })

// multiple parameters specified.
foo({ x, y -> x * y })
I find this syntax very concise and easy to read.

I don't know Kotlin extensively, but I know that in Kotlin we don't have a proper syntax for collection literals. I.e., to make a set we do val numbers = setOf(1, 2, 3, 4). Dart, on the other hand, uses braces ({}) for both set literals and map literals. They can be disambiguated by the fact that maps MUST have a colon (:), so { a } is always a Set<T> with a inside, while { a: b } is always a Map<T, U> with an entry with a as key and b as value.

I think that it would not be trivial to disambiguate the example you gave. Is { it * 2 } a Set with the element it * 2, or a function literal that returns its argument times two?

Maybe the static analysis would be able to properly infer the type based on what the function foo is expecting (if a function, infer the literal as a function; if a set, infer the literal as a set), but I think it would also be confusing, especially for newcomers, to have an ambiguous syntax for two different kinds of literals.
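For reference, a small sketch of how today's Dart already reads that form when a variable named it is in scope. The function name bracesToday is hypothetical.

```dart
// In today's Dart, braces around a single expression form a set literal.
// The parameter is deliberately named `it` to mirror the Kotlin example.
Set<int> bracesToday(int it) {
  return {it * 2};
}

void main() {
  final literal = bracesToday(3);
  print(literal is Set<int>); // true
  print(literal); // {6}
}
```

So a Kotlin-style { it * 2 } would collide with an expression that is already valid whenever it happens to be declared.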

eernstg commented 3 years ago

Note that this proposal puts a lot of emphasis on avoiding any syntactic top-level structure that unambiguously implies that the term is a function literal. So we're basically just writing an expression containing some special subexpressions (like # or $2), and then it's implied that the expression as a whole is a function literal.

This means that we can have a very concise notation: myList.map((x) => x + 2) can be written as myList.map(# + 2), and myList.fold(0, (previousValue, element) => previousValue + element) could be written as myList.fold(0, $1 + $2).

The other side of the coin is that it is ambiguous how much of an expression we need to include: foo(bar($1 + $2)) could mean foo((a, b) => bar(a + b)) or it could mean foo(bar((a, b) => a + b)).

The proposal about 'abbreviated function literals', #265, is focused on abbreviated forms where this ambiguity does not exist. That proposal covers such cases as => it * 2 meaning (it) => it * 2.

So, @mpfaff and @mateusfccp, it's possible that #265 matches your preferences more directly.

munificent commented 3 years ago

Here's my old proposal from that closed repo:

We could add a new syntax to create a closure with an implicit parameter list. Scala has a neat placeholder syntax for this, but I think it's a little too magical. So how about we say:

You can define a closure using => without a leading parameter list. If you do, the parameter list is inferred based on the _ that appear in the expression. Each one becomes a new positional parameter, in the order it appears in the expression. (This also means you can omit the () when defining a zero-argument closure.)

Here's a comparison of the syntax when the receiver is the (possibly implicit) this:

// today                            placeholder params
() => getter                        => getter                        
(value) => setter = value           => setter = _           
() => -this                         => -this                         
(arg) => this + arg                 => this + _                 
(arg) => this[arg]                  => this[_]                  
(arg, value) => this[arg] = value   => this[_] = _   
() => this[expr]                    => this[expr]                    
(value) => this[expr] = value       => this[expr] = _

And when it's a given object:

// today                             placeholder params
() => obj.getter                    => obj.getter                        
(value) => obj.setter = value       => obj.setter = _           
() => -obj                          => -obj                         
(arg) => obj + arg                  => obj + _                 
(arg) => obj[arg]                   => obj[_]                  
(arg, value) => obj[arg] = value    => obj[_] = _   
() => obj[expr]                     => obj[expr]                    
(value) => obj[expr] = value        => obj[expr] = _

Here are some other examples where it would be nice to be able to make a little closure and where the existing tear-off syntax doesn't help because we aren't partially applying the receiver:

connectorRegions[connector].map((region) => merged[region]);
connectorRegions[connector].map(=> merged[_]);

return region.reduce((a, b) => a + b) ~/ region.length;
return region.reduce(=> _ + _) ~/ region.length;

conditions.forEach((condition) => condition.update(action));
conditions.forEach(=> _.update(action));

var openDirs = dirs.where((dir) => _isOpen(hero, dir));
var openDirs = dirs.where(=> _isOpen(hero, _));

return slots.where((item) => item != null).iterator;
return slots.where(=> _ != null).iterator;

game.hero.heroClass.commands.firstWhere((command) => command.canUse(game));
game.hero.heroClass.commands.firstWhere(=> _.canUse(game));

_effects = _effects.where((effect) => effect.update(game)).toList();
_effects = _effects.where(=> _.update(game)).toList();

rules = chunks
    .map((chunk) => chunk.rule)
    .where((rule) => rule != null)
    .toSet()
    .toList(growable: false),
rules = chunks
    .map(=> _.rule)
    .where(=> _ != null)
    .toSet()
    .toList(growable: false),

Looking at this today, I'm iffy about using _ for the magic identifier. Especially if we intend to eventually use it for the non-binding pattern in pattern matching.

lrhn commented 3 years ago

FWIW I have no problem using _ as special marker especially if it is no longer usable as an actual parameter name. (I may have other issues, but that ain't one).

mraleph commented 3 years ago

@munificent said:

You can define a closure using => without a leading parameter list.

Why require => and not something like: an expression containing _ is lifted to become a closure with induced parameter list if _ is an undefined symbol in the scope. (Alternatively we can do that with _0, etc).

connectorRegions[connector].map((region) => merged[region]);
connectorRegions[connector].map(merged[_]);

return region.reduce((a, b) => a + b) ~/ region.length;
return region.reduce(_0 + _1) ~/ region.length;

conditions.forEach((condition) => condition.update(action));
conditions.forEach(_.update(action));

var openDirs = dirs.where((dir) => _isOpen(hero, dir));
var openDirs = dirs.where(_isOpen(hero, _));

lrhn commented 3 years ago

See https://github.com/dart-lang/language/issues/8#issuecomment-522013489 for why undelimited "functions" are a problem.

Levi-Lesches commented 3 years ago

@munificent, how would you convert this using _?

(arg) => this [arg] = recompute(arg)

If each _ is assumed to be a different argument:

=> this [_] = recompute(_)
// translates to 
(index, value) => this [index] = recompute(value)
// which is closest to your example of:
(arg, value) => this[arg] = value   => this[_] = _

How about using _, then __, then ___... but that feels like it could get verbose and is so much worse than just using today's syntax.

munificent commented 3 years ago

@Levi-Lesches said:

@munificent, how would you convert this using _?
(arg) => this [arg] = recompute(arg)
If each _ is assumed to be a different argument:

My pitch was that in cases like this, you have to use an explicit => closure and name the parameters. Dart's existing closure syntax is already pretty concise so if you want to layer even more syntactic sugar on top, it's likely that some patterns simply won't fit that even nicer sugar. I'm personally OK with that.

So if you want to support an implicitly named parameter like it (Kotlin) or _, the question is: What does it mean if that implicit parameter appears multiple times? The two options I know of are:

1. The implicit closure always takes a single parameter and all uses of the implicit name refer to that. You're prevented from using this sugar for multiple-parameter lambdas.
2. Each use is treated as a new unique parameter, but you are prevented from using the same parameter twice.

My hunch (which I'd want to scrape a corpus to get real data on first) is that 2 is a lot more common than 1. I think it's pretty rare to use the same parameter multiple times, but higher-order functions like reduce(), fold(), and sort() with a comparison callback are pretty common. I'd love it if the sugar supported the latter.
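A rough illustration of which closures each reading could abbreviate, written in today's Dart with the proposed forms shown only in comments. The placeholder syntax in the comments is not valid Dart, and the function names are hypothetical.

```dart
// Covered by "each use is a new parameter": two parameters, each used once,
// in order. Proposed form (not valid Dart today): numbers.reduce(=> _ + _)
int sum(List<int> numbers) => numbers.reduce((a, b) => a + b);

// Covered only by "all uses refer to one parameter": one parameter used
// twice. Under the other reading, `=> _ * _` would mean (a, b) => a * b.
List<int> squares(List<int> numbers) => numbers.map((x) => x * x).toList();

void main() {
  print(sum([1, 2, 3])); // 6
  print(squares([2, 3])); // [4, 9]
}
```

Whichever reading is chosen, the closure that does not fit simply keeps its explicit parameter list.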

Levi-Lesches commented 3 years ago

I agree that 2 is probably much more common than 1, but I think it's way less intuitive to say "the same identifier can refer to different variables depending on how many times it appears in the code previously". Not to mention the bugs that can arise -- refactoring out one _ will change the meaning of all the other _s.

Personally, I think Dart's closure syntax is currently very clear, readable, and logical, and that should be prioritized. Sugar is nice, but let's keep perspective and make sure the logic stays absolutely clear. Worst case scenario: a closure has to be broken out into its own function -- not the end of the world.

lrhn commented 3 years ago

I'll just chime in with another option:

Closures still require =>, but parameters can be omitted. Either just all of them, or possibly also individual ones, like (x,,z)=>
An unbound variable starting with one or more _'s and followed by a number refers to an enclosing closure with omitted parameters. Nested such functions use more _s. Numbers start from 1. A plain _/__/___/etc. is allowed as an alternative if _1 would be the only parameter for that closure.

Examples:

map.forEach(=>reversed[_2]=_1)
listOfLists.expand(=>_1..sort(=>__2.compareTo(__1))) // The common flatten reversed sorted operation.

Can also use $ instead of _. If _1 or _ is already bound, and not by a nesting function definition with omitted parameters, where you expect to use it, you can't use implicit parameters. So, don't do that. Using _ ensures that you can't import such a declaration, it's in your own library.

Or #1, ##1 etc., which are currently not used. It reads adequately: =>#1 + #2 is "argument number one plus argument number two".

clragon commented 5 months ago

I would prefer not using _ since we are already using _ heavily in a lot of places to mean "not used" in the context of lambdas. This is additionally cemented by the use of the wildcard being _ in patterns.

$n could be more suited which is currently only used in records. #n could also be an idea though I believe that Symbols, which are a type available in the entire language, already use a # to declare themselves.

mateusfccp commented 5 months ago

@clragon said:

I would prefer not using _ since we are already using _ heavily in a lot of places to mean "not used" in the context of lambdas. This is additionally cemented by the use of the wildcard being _ in patterns.

Ironically, we are making wildcards first-class in the language, which is probably a requirement for this issue to be implemented with the _ approach.

Personally, I feel more like _ means "I don't care" than "not used", so this would fit perfectly for this case.

clragon commented 5 months ago

@mateusfccp said:

Ironically, we are making wildcards first-class in the language

Which I am in favour of! But I thought that would rather make this proposal impossible than enable it.

Have I missed something?

mateusfccp commented 5 months ago

@clragon said:

Ironically, we are making wildcards first-class in the language

Which I am in favour of! But I thought that would rather make this proposal impossible than enable it.

Have I missed something?

If I understand correctly, as we currently can assign a value to _ and use it, we would have more ambiguities.

(_) {
  final list = someList.map(_.doSomething()); // Does this _ refer to the value received as argument in the outer function or to the argument passed to map?
}

Although probably most of the cases could be disambiguated by the context type, it could be still confusing.

However, if we have first class wildcard support, in the code above _ could never mean the argument received in the outer function, so it would be unambiguous by default.

I may be wrong, tho, because I didn't read all the details and discussions about wildcards.

clragon commented 5 months ago

that seems to solve the problem of the analyzer/compiler figuring out what is meant, but I was thinking of readability for developers.

after some thought, I can see how "we omit this" can feel intuitive, when applied to this new syntax usage of "we omit this and use its property immediately, so we do not need to name it" but I always thought of underscore as "we omit this and do not use it, so we do not need to name it".

I have no data on how widespread either of these conceptions are, but I would like to propose considering that this syntax might potentially be confusing.

I do like the idea of using $ better, since the language has introduced $ in the context of denoting parameters of records, which we do want to use, and this could make this symbol more intuitive.

lrhn commented 5 months ago

The "invent a name" has only really been defined to work for a single parameter.

If we use the names of the type parameters in the context type, or use the "record field name" of the corresponding argument list as a record ($1, $2, ... for positional, actual names for named) then it can work for any parameter list structure.

That does mean using $1 in the simple one-positional-parameter case.
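For readers less familiar with records, here is a small sketch of the naming scheme being borrowed: positional record fields are read as $1, $2, ... and named fields by their names. The record shape and the function describe are hypothetical; only the record syntax itself is existing Dart 3.

```dart
// A record shaped like the argument list (int, int, {String label}).
// Positional fields are read as $1 and $2, the named field as `label`.
String describe((int, int, {String label}) args) =>
    '${args.label}: ${args.$1 + args.$2}';

void main() {
  print(describe((1, 2, label: 'sum'))); // sum: 3
}
```

An implicit parameter list would presumably expose the same names directly, without the args. prefix.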

tatumizer commented 5 months ago

What is the shortcut for 1-parameter function? Suppose we have (with the current syntax) foo((a)=>a+1). What is the new syntax?

lrhn commented 5 months ago

With my suggestion: foo(=> $1 + 1). It's not pretty, but it works for any parameter list and it's consistent with records. (Consistently ugly, but consistent!)

tatumizer commented 5 months ago

Unless $1 already means something in the context:

extension on (int, int) {
  foo() {
    print($1);
    var x = =>$1 + 1; // what $1 is it?
  }
}

($1 is much better than magical it anyway IMO)
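A minimal sketch of why $1 can already be in scope, using only today's Dart. The extension name PairOps and the method sum are hypothetical.

```dart
extension PairOps on (int, int) {
  // Inside the extension body, `$1` and `$2` resolve to the fields of
  // `this`, so these names are already taken in this scope.
  int sum() => $1 + $2;
}

void main() {
  print((3, 4).sum()); // 7
}
```

Any closure with an implicit parameter list written inside such an extension would have to decide between the record's $1 and the closure's own first parameter.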

Edit: after staring at the expression =>$1 + 1 for a couple of hours, I can't find much of a redeeming quality there. The $1 certainly rhymes with $1 in records, but the whole verse based on this rhyme... needs more work.

The reason the kotlin's syntax (kind of) works in kotlin is that there, the anonymous function has a form of { a , b -> a + b }, which, in case of 1 parameter, downsizes to { a -> a + 1 } and after the next downsize finally reaches { it + 1 } , with it falling from the sky, but the syntax of a function is still recognizable. In dart, the trick doesn't work as naturally.

lrhn commented 5 months ago

@tatumizer said:

var x = =>$1 + 1;

What is the function type of that expression even? There is no context type, so there is no hint what the implicit parameter list should look like.

That leaves three options:

defaulting to zero or one argument. Probably zero. In which case the outer $1 is still in scope.
guessing, based on content. Which probably means finding any identifiers that are not already declared, and introducing parameters for them. Very fragile.
reject the expression, only allow omitted parameters when the context type implies a function type.

Whether it's the first or last, it wouldn't introduce new names. An implicit parameter list will only introduce new names if it occurs in a context type that tells it which parameters are needed. And then the user should be expecting precisely those parameters.

tatumizer commented 5 months ago

Not everything is lost. The idea can be revived by introducing the syntax :{ $1 + 1 } for function literals. Symbol : in some languages means "quote the following symbol/expression". E.g. in Julia, :(2 + 2) is roughly equivalent to dart's ()=>2+2. Not exactly equivalent, but close (in Julia, it's called a quoted expression - it can be evaluated by eval function).

Let's suppose dart introduces the syntax like :{ $1 + 1 }. Let's call it a "lambda expression". We can treat lambda expressions as a subclass of functions having a specialized syntax. Without a context type, lambda expression is equivalent to no-arg function (it's the same in kotlin: if we write var f = { $1 + 1 }, the compiler will complain about an unknown variable $1).

The syntax should support more than a single expression in the body. A natural idea is to allow :{ stmt; stmt; ... expr } where expr is implicitly returned (there's no semicolon after the expr - for consistency with the single-expression lambda). For an early return, we have to use return v - same as in kotlin.

If we want to invoke the lambda like IIFE, we write $:{ 1 + 1 }. Because : is now redundant, we can simply write it as ${ 1 + 1 } , which is a sought-after block expression. Symbol $ is appropriate here: it's associated with the idea of "evaluation", like in string interpolation. Can it work?

clragon commented 5 months ago

That feels like an uncomfortable amount of new syntax rules and meanings just for this specific feature, in my opinion.

tatumizer commented 5 months ago

For 2 features, another being #3065. Also, everywhere you currently use IIFE, it could be replaced with a nicer syntax. Today, there's a discontinuity in function syntax. You can write (a)=>a+1, but if you want to insert one print statement, you have to replace it with (a) { print(a); return a + 1; }, so 1-expression literal is very different from 2-expression literal. With a new syntax, you write :{ $1 + 1 }, which is more "scalable" in this regard: :{ print($1); $1 + 1 }

tatumizer commented 5 months ago

Q: will it be a breaking change if dart adds support for non-semicolon-terminated expression expr at the end of a function body as an implied "return expr;"? Example:

foo() { 42; } // current return type: Null
foo() { 42; 0 } // return type int (implicit return)

Levi-Lesches commented 5 months ago

Just a note that some of the more common forms of this issue can be solved with #3786:

xs.map((x) => x.toString())
xs.map(X.toString);

myWidgets.map((widget) => widget.getColor).firstWhere((color) => color == Blue)
myWidgets.map(Widget.getColor).firstWhere((color) => color == Blue)

In general, any closure of the form (x) => x.y() or (x) => x.y could be reduced to X.y.

It would only work in cases where you're calling a method with no parameters, or a method/operator with more than one value, but I believe that should already cover quite a lot of cases without introducing ambiguities.

Reprevise commented 2 months ago

I've been writing some C# lately, and when coming back to Dart, I really don't like having to add parentheses around a single callback param. Writing e => e is a lot faster than (e) => e, never mind the other proposals contained here.
