dotnet / csharplang

The official repo for the design of the C# programming language

[Proposal]: Directly invoked anonymous functions #4748

Open MadsTorgersen opened 3 years ago

MadsTorgersen commented 3 years ago

Directly invoked anonymous functions

Summary

Allow (parenthesized) lambda expressions and anonymous methods to be directly invoked with an argument list:

var s = o switch
{
    Person p => (name => { WriteLine(name); return name.Trim(); })(p.GetName()),
    ...
};

Motivation

Lambda expressions (together with anonymous methods) are also called "anonymous functions". We are doing more and more to make them more similar to named local function declarations: Allowing the static modifier, attributes and explicit return types. One thing we do not yet allow is invoking them directly. For that they still need to be converted to a delegate type.

There are two main motivations to allow this:

  1. As a way to enable statements and scopes in an expression context
  2. To enhance the code environment; notably to make it async

Detailed design

From a grammar perspective direct invocation is already allowed, as long as the lambda expression is parenthesized:

invocation_expression
    : primary_expression '(' argument_list? ')'
    ;

There are currently two kinds of invocation expressions: Method invocations (including extension method invocations) and delegate invocations. This proposal adds "anonymous function invocations" as follows:

For an anonymous function invocation the primary_expression shall be classified as an anonymous function. If the anonymous function is a lambda_expression with an implicit_anonymous_function_signature then the argument_list shall have a corresponding number of arguments and all of these arguments shall have types. These types are then taken to be the types of the corresponding parameters.

Considering the anonymous function to be a function member, it shall be applicable to the argument list.

At runtime an anonymous function invocation is processed as a function member invocation.
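For comparison, here is a minimal sketch of how the Summary example can be written without this proposal, by converting the lambda to a delegate type before invoking it. The `Person` class, its `GetName` method, and the `Describe` helper are hypothetical stand-ins for illustration; only the cast-then-invoke pattern is the point.

```csharp
using System;

static class DirectInvocationToday
{
    // Hypothetical type standing in for the Person used in the proposal's example.
    public sealed class Person
    {
        private readonly string _name;
        public Person(string name) { _name = name; }
        public string GetName() => _name;
    }

    public static string Describe(object o) => o switch
    {
        // The lambda must first be converted to a delegate type (here via a cast)
        // before it can be invoked; the proposal would make the cast unnecessary.
        Person p => ((Func<string, string>)(name =>
        {
            Console.WriteLine(name);
            return name.Trim();
        }))(p.GetName()),
        _ => string.Empty,
    };
}
```

Under the proposal, the parameter type of `name` would instead be taken from the type of the argument `p.GetName()`, so no delegate type would need to be named.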

A note on parentheses

As mentioned, anonymous function invocations can only parse if the anonymous function expression is parenthesized. While the spec doesn't seem to be entirely clear about this, parenthesized expressions can already be invoked today, and the kind of invocation hinges on the classification of the expression inside the parentheses. Essentially, except for parsing, parentheses are ignored.

This holds even when the parenthesized expression doesn't have a value; e.g. if it is a method group:

void M(int i) => WriteLine(i);

M(1);     // Normal method invocation
(M)(2);   // Disallowed only because of parsing ambiguity with cast expression
((M))(3); // Allowed; parentheses are ignored

We should probably consider making the spec clearer about this, but invocation of parenthesized expressions is nothing new, and is not part of this proposal.

A note on implementation

A directly invoked anonymous function is very much like a local function. We should consider some of the same implementation strategies, e.g. passing as an extra ref argument a struct representing the closure.
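A hand-written sketch of the strategy mentioned above may help: captured variables live in a struct, and the function body becomes a static method that receives that struct by `ref`. All names here (`Closure`, `Add`, `Run`) are made up for illustration; this is not actual compiler output, only the general shape under the assumption that the closure never escapes.

```csharp
using System;

static class LoweringSketch
{
    // Hypothetical closure: holds the local variable captured by the function body.
    private struct Closure
    {
        public int Total;
    }

    // The lowered function body: takes the closure by ref as an extra argument,
    // so no heap-allocated closure object or delegate is needed.
    private static void Add(ref Closure closure, int amount)
    {
        closure.Total += amount;
    }

    public static int Run()
    {
        var closure = new Closure { Total = 10 };
        Add(ref closure, 5); // stands in for a direct invocation that mutates a captured local
        return closure.Total;
    }
}
```

Because the struct is passed by reference, mutations made inside the function body are visible to the enclosing method, which matches the semantics of capturing a local.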

Drawbacks

Direct invocation of lambda expressions, even though common in other languages, does not have great readability.

Alternatives

Natural delegate types

One alternative is for this to fall out of the "natural types for lambda expressions" feature. Essentially when a lambda expression has a natural delegate type, we would create the delegate and immediately invoke it.

The downsides are that it would not apply to lambdas without explicit parameter types, and that the semantics require a delegate to be created only to be immediately discarded (though this can likely be optimized away).
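To illustrate the first downside, here is a small sketch assuming C# 10 or later, where a lambda with explicit parameter types has a natural delegate type. The class and variable names are hypothetical.

```csharp
using System;

static class NaturalTypeSketch
{
    public static string Run()
    {
        // With an explicit parameter type the lambda has a natural type,
        // which the compiler infers as Func<string, string>.
        var trim = (string name) => name.Trim();

        // Without a parameter type there is no natural type, so the
        // following line would not compile:
        // var bad = name => name.Trim();

        return trim("  Ada ");
    }
}
```

The proposal in this issue differs in that the parameter types would come from the argument list of the invocation rather than from annotations on the lambda.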

Expression blocks

The first scenario - fitting statements into an expression context - could be addressed by a dedicated feature, such as expression blocks. This would require new syntax, and has been the subject of some controversy. It would not address the second scenario; the need to create an async context for code.
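As a point of reference for the second scenario, this is roughly how an async context can be created inline today, again by naming a delegate type. It is only a sketch; `AsyncContextSketch` and `RunAsync` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

static class AsyncContextSketch
{
    // Not itself an async method: the async context exists only inside the lambda.
    public static Task<string> RunAsync()
    {
        // The async lambda is cast to Func<Task<string>> and invoked immediately,
        // yielding a Task<string> that callers can await or otherwise observe.
        return ((Func<Task<string>>)(async () =>
        {
            await Task.Yield();
            return "done";
        }))();
    }
}
```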

Unresolved questions

No known unknowns.

Design meetings

Direct invocation of lambda expressions was discussed in C# LDM on May 10, 2021. https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-28.md#ungrouped

HaloFour commented 3 years ago

Would it be safe to assume that although these expressions use lambda syntax that the compiler can be expected to optimize away the lambda/delegate into something more akin to a local function if not as inline code?

CyrusNajmabadi commented 3 years ago

How is the signature for the lambda determined here? Given:

(name => { WriteLine(name); return name.Trim(); })(p.GetName()),

it looks like the lambda is in a location where its type cannot be inferred. Or would we expand the inference locations to allow an 'invocation expression' to be a place where information could flow into it? So in this case, we'd see that p.GetName() is a string, and then do lambda inference as if name has the type string?

YairHalberstadt commented 3 years ago

> It would not address the second scenario; the need to create an async context for code.

Both Rust and Scala allow expressions to be marked as async. I think this could be done fairly straightforwardly in C#. Ideally the expression would be target typed so that you could use ValueTask instead of Task.

> The first scenario - fitting statements into an expression context - could be addressed by a dedicated feature, such as expression blocks. This would require new syntax, and has been the subject of some controversy

I would strongly urge the team to prefer expression blocks. I think going with direct lambda invocation risks the same problem as anonymous delegates, which were introduced in C# 2, weren't brief enough, and were effectively obsoleted by lambdas in C# 3. Whilst directly invoked anonymous functions would scratch the expression blocks itch, the feature has so much boilerplate that it's unlikely to be acceptable to most users who want it, and could well be replaced by expression blocks a few versions down the line.

CyrusNajmabadi commented 3 years ago

I agree with Yair on this.

HaloFour commented 3 years ago

> I would strongly urge the team to prefer expression blocks.

I agree. There may be use cases for IIFE, but I think many scenarios (especially the switch expression scenario presented here) would be much better served with expression blocks or other targeted proposals. Having to use an inline lambda invocation in that case is just really ugly. Having to separately declare arguments (inferred or not) and then pass values to them twists the flow of the code inside out. I imagine if this proposal were adopted you'd be more likely to see the following:

var s = o switch
{
    Person p => (() => {
        WriteLine(p.GetName());
        return p.GetName().Trim();
    })(),
    ...
};

or:

var s = o switch
{
    Person p => (() => {
        var name = p.GetName();
        WriteLine(name);
        return name.Trim();
    })(),
    ...
};

Where all of those extra parens just seem really unnecessary.

jnm2 commented 3 years ago

It was opting out of being async that I was expecting to see listed under the motivations.

async Task FooAsync()
{
    var bar = await BarAsync();

    (() => {
        var refLike = new SomeRefLikeType(bar);
        DoSomething(refLike);
    })();

    await BazAsync();
}

But the ability to use ref-like variables directly inside the async method when they don't cross an await is something I'd rather have anyway.
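For language versions where ref-like locals are not permitted directly in an async method, the same effect is available today with a non-async local function, which is roughly what a directly invoked lambda would replace. This sketch uses `Span<int>` as the ref-like type; `SumAsync` and `Sum` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

static class RefLikeInAsyncSketch
{
    public static async Task<int> SumAsync()
    {
        int[] data = await Task.FromResult(new[] { 1, 2, 3 });

        // Sum is not async, so it may declare a ref-like local (Span<int>)
        // even though the enclosing method is async.
        int Sum()
        {
            Span<int> span = data;
            int total = 0;
            foreach (int value in span)
            {
                total += value;
            }
            return total;
        }

        int result = Sum();
        await Task.Yield();
        return result;
    }
}
```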

foxesknow commented 3 years ago

I've got to say, the syntax on this is terrible:

var s = o switch
{
    Person p => (name => { WriteLine(name); return name.Trim(); })(p.GetName()),
    ...
};

This feels overly hacky, trying to synthesize multiple statements in places where only an expression is allowed. Expression blocks would go a long way to fixing this. If it's tricky using braces to mark the blocks then how about something like this:

var s = o switch
{
    Person p => [var name = p.GetName();  WriteLine(name); name.Trim();]
    ...
};

Here there's no need for a return, as the last expression is the result of the block, as in F#. The square brackets give it a Smalltalk sort of feel, which is never a bad thing!

smoothdeveloper commented 3 years ago

> As a way to enable statements and scopes in an expression context

When I saw switch expressions come up in C# 8, I was disappointed that the branch had to be a single expression (and by how hard it is to compose stuff in a single expression in C#), while a scoped anonymous block would have felt natural (being also familiar with F#, and not liking Python being restricted in what a lambda expression allows).

Wouldn't this concern be addressable by allowing the branch part to have a scoped anonymous block in the first place? What would be the issue, or what were the reasons against it? Is it because of "early return ambiguity"? What about making the last expression the returned one, or, worst case, finding a suitable keyword?

var s = o switch
{
    Person p => { var name = p.GetName(); WriteLine(name); return name.Trim(); }
    // ...
};

I'm not against ad hoc invocability of a lambda expression, but I don't feel the reason I quoted from @MadsTorgersen is a good one.

I'd rather have the switch expression / switch statement dichotomy subside a bit or totally, because it is a frustrating dance when you are used to languages that handle both with the same construct and don't force you to turn the whole code upside down (imagine the risk of introducing bugs doing this on a large switch...).

That switch expression dichotomy doesn't feel natural at all, and maybe the C# language design team can find a way to solve it without this feature (which, again, is also fine).

Thaina commented 3 years ago

I too am more aligned with expression blocks than with this feature. But actually I prefer this concept to expression blocks; I just don't like the syntax. Maybe it is worth considering an alternative syntax.

How about this:

var s = p.GetName() into (name) => {
    WriteLine(name);
    return name.Trim();
};

The into keyword passes the argument or tuple to the delegate's parameters, and the compiler could consider invoking that code inline.

It is actually equivalent to (name => { WriteLine(name); return name.Trim(); })(p.GetName()); and the block may be transformed into direct code.

Using it in a switch:

var s = o switch
{
    Person p => p.GetName() into (name) => {
        WriteLine(name);
        return name.Trim();
    }
};

It can be chained:

var (kind,s,i) = p.GetName() into (name) => {
    WriteLine(name);
    return name.Trim();
} into (name) => {
    var salary = p.GetSalary();
    var nameParts = name.Split(" ");
    if(nameParts.Length == 2)
        return ("Common Name",name,salary);
    if(nameParts.Length == 3)
        return ("Has Middle Name",name,salary);
    if(nameParts.Length == 1)
        return ("Single Name",name,salary);

    throw new Exception("Unsupported name format");
};

It can pass a tuple and deconstruct it into arguments. (Actually this is not related; it should just be #258.)

var (s,i) = (p.GetName(),p.GetSalary()) into ((name,salary)) => {
    WriteLine(name + " salary : " + salary);
    return (name.Trim(),salary);
};

Also can it be async too?

var s = await p.GetName() into async(name) => {
    WriteLine(name);
    return await GetStringData(name);
};

timcassell commented 3 years ago

@Thaina I do like how that reads better, but I'd prefer a pipe operator |> rather than into, akin to the pipe operator proposal https://github.com/dotnet/csharplang/discussions/96. Also, how would that work for parameterless lambdas? I don't think var s = into () => { return 42; }, or even var s = |> () => { return 42; }, looks good.

Tbh, I think block expressions would be better. They could even work for async by using the async keyword:

var s = await async {
    string name = p.GetName();
    WriteLine(name);
    return await GetStringData(name);
};

Although, I'm not too fond of that, as it will have the same problem as async void functions if you're using custom task-like types, except worse, because with async void there is at least the possibility of using a custom async void method builder (of course, the OP's proposal would have that problem, too).

Thaina commented 3 years ago

For parameterless lambdas, maybe var s = null into () => { return 42; };

But if we consider people using ReactiveX, I think they would be more inclined to have var s = null into (_) => { return 42; };

BreyerW commented 3 years ago

Maybe var s = _ into () => { return 42; }; instead

VBAndCs commented 2 years ago

We can combine block syntax and direct invocation to define a lambda with arguments instead of params. Say:

int a=3, b= 5;
var result = (x: a, y: b) => x + y;

Still thinking about parameterless lambda invocation.

VBAndCs commented 2 years ago

For the given sample:

var s = o switch
{
    Person p =>
        (name: p.GetName()) => {
            WriteLine(name);
            return name.Trim();
        },
    ...
};

HaloFour commented 2 years ago

Why define a lambda at all? IIFE is a hack in JavaScript to work around that language's lack of proper scoping and accessibility. Those problems don't exist in C#, which eliminates the need for such a hack. Having to define arguments to flow what should otherwise just be local identifiers seems like pointless busy work with a high potential for introducing bugs, and a poor way to address expression blocks.

VBAndCs commented 2 years ago

For one reason, I hope that lambdas will allow yield return like VB does. VB XML literals are smart about consuming the IEnumerable, so inline iterator lambdas in VB work well. I like the idea of code blocks, but can they yield return? Also, it can be useful if we can apply async to lambdas, and apply attributes to them and their params, which I don't know how to apply to a code block. Please give me the code block proposal link. I prefer a code block most of the time of course, and if it can do it all, so be it. Otherwise we may need both.

timcassell commented 2 years ago

@VBAndCs #3086 should be what you're looking for. Expression blocks should completely supersede directly invoked lambdas. From LDM, it seems to be on the roadmap for C# 11, and it should be able to support flow control, including yield return and await, though that might be after a second iteration depending how much work the language team wants to put into the first pass.

VBAndCs commented 2 years ago

@timcassell I don't like it. It seems confusing and not exactly what I had in mind. I thought it would be exactly the lambda block but without the ()=> part, allowing the use of return and yield return. That proposed syntax will make code too complex, and I myself will avoid using it.

timcassell commented 2 years ago

If you read further down into the issue thread, lots of people expressed concerns over the OP's suggested syntax. The more popular syntax is akin to var x = ${ yield 42; };. I agree with you that the OP's syntax there is not good. Though, I'm not sure what syntax the LDM is leaning towards.

CyrusNajmabadi commented 2 years ago

int a=3, b= 5;
var result = (x: a, y: b) => x + y;

This would be easier written as:

var result = a + b;