dotnet / csharplang

The official repo for the design of the C# programming language
11.38k stars 1.02k forks source link

Champion "static local functions" (16.3, Core 3) #1565

Open jcouv opened 6 years ago

jcouv commented 6 years ago

The proposal is to declare a local function as non-capturing. The keyword to indicate that would be static.

Note: if we do this, then we should also consider allowing parameters and locals of the local function to shadow parameters and locals from the containing method.

Related discussion thread, including a discussion of "capture lists" (a more elaborate feature): https://github.com/dotnet/csharplang/issues/1277

Note: EE scenarios are a concern.

Do we want to allow some other modifiers (like unsafe)?

LDM history:

bondsbw commented 6 years ago

To be clear, would this fit @benaadams's definition of static?

A static method in a class hides its enclosing scope; you can't access and of its instance data or call any of its instance methods - only call static functions and access static data.

Likewise a static local function shouldn't have access the the parent functions local state or the classes instance state, much like a regular static function.

The static local function is providing scoping so as not to pollute the class with a static function which is only called in a single method.

jcouv commented 6 years ago

@bondsbw Yes, that's what I'd have in mind. The local function would not be able to access variables from parent method or instance fields. It could only access variables that it declares (parameters or locals) and static fields.

HaloFour commented 6 years ago

Is there a big reason to do this for local functions, aside avoiding accidental mutations? Aren't local functions significantly more efficient with captures?

jcouv commented 6 years ago

@HaloFour The primary motivation for this proposal is readability/maintainability. Knowing that some code is isolated makes it easier to understand, review and re-use. That's the same general argument as with any kind of isolation like object encapsulation or functional programming.

There is also a practical motivation. There are good reasons why parameters or locals of local functions should not shadow variables from the enclosing scope. But in practice, there are many pieces of pure logic that make sense to extract as local function, but where this restriction becomes inconvenient (prevents you from using obvious variable names). Having static local functions and additionally allowing static local functions to declare their own variables regardless of context solves that problem (while avoiding the pitfalls of shadowing).

popcatalin81 commented 6 years ago

What’s the Benefit over a private static function? Local functions benefit is exactly capturing

orthoxerox commented 6 years ago

@popcatalin81 more fine-grained access control, primarily

mgravell commented 6 years ago

@HaloFour it isn't just mutations; it is also to incredibly useful to avoid accidentally creating the capture machinery, especially in hot paths. This can have a surprising amount of overhead, especially in scenarios like async. Again, on hot paths, a common pattern in some async pipes is to try to avoid the async machinery completely if it will often be synchronous, falling back into an async completion if it turns out to be incomplete - which means all the local state (including the capture machinery) is going to end up on the heap (because of how await works when tasks are incomplete); doing this manually can have significant savings for performance critical code, and it turns out that local functions as long as you don't capture are a very tidy way of doing it

@popcatalin81 scoping and expressiveness. I actually strongly disagree with you that the benefit of local functions is "exactly capturing" - I've written plenty of local functions, and the ones that make use of capture are by far the minority. Well, let me rephrase: the ones where I intentionally used capture are the minority. But yes, functionally they're just like regular private static methods; in fact, my usual "did I capture without realising it?" check is exactly to move the local function out of the method (so: a private static), see if the code still compiles, then move it back (and remove the modifiers).

HaloFour commented 6 years ago

@mgravell

If the goal is to avoid creating async machinery for capturing then I don't see why it would be limited to avoiding capture on local functions where said machinery requires significantly less overhead.

And, like most proposals that involve limiting functionality/syntax for the purpose of improving code generation and performance, why is this not suitably managed by an analyzer? Most people will not care about this degree of optimization.

I've never reached for a local function unless I wanted to capture. I see zero point to them otherwise.

HaloFour commented 6 years ago

@jcouv

Knowing that some code is isolated makes it easier to understand, review and re-use. Having static local functions and additionally allowing static local functions to declare their own variables regardless of context solves that problem (while avoiding the pitfalls of shadowing).

IMO, those two statements are in direct contradiction. Allowing a local function to shadow variables would negatively impact the readability of said function. Now anytime you see a local function you have to double-back to the signature to see what that identifier refers to.

jcouv commented 6 years ago

@HaloFour Any design involves some trade-offs and tensions. I recognize it is a fair concern to discuss and experiment with. From experience with static methods, I've found them to reduce the cognitive load rather than increase it, despite their introducing a "mode". When reading a static local function with variables defined independently of the parent scope, I suspect the same modality will kick in so it won't be confusing.

HaloFour commented 6 years ago

@jcouv

Any design involves some trade-offs and tensions. I recognize it is a fair concern to discuss and experiment with.

I agree. Consider this just part of that discussion. 😁 Despite my arguments I'm fairly ambivalent about the proposal.

Where this feature tries to address performance of certain aspects of generated code, particularly around async method machinery, it feels like this could be easily (and appropriately) addressed via analyzers. By itself this feature seems like it would barely scratch that surface. Local function capture, by itself, is not the performance problem. It's only in conjunction with those other features like async methods or delegates where the compiler resorts to the less efficient machinery. So why target a language change that barely scratches the actual problem?

I'm much more opposed to the notion of the shadowing. I agree that having to come up with a lot of different names for identifiers can be restrictive. The team was very quick to dismiss that argument in the context of leaky out declaration variables and pattern variables.

From experience with static methods, I've found them to reduce the cognitive load rather than increase it, despite their introducing a "mode".

Current static methods don't have to deal with other locals being potentially in scope, so I don't feel that this comparison is appropriate. The rules regarding shadowing fields don't change between instance or static methods.

alrz commented 6 years ago

@HaloFour

Current static methods don't have to deal with other locals being potentially in scope

You can conceptually map captures to instance fields,

class C
{
   int name = 0;
   void Method1() => Use(name); // OK
   static void Method2() => Use(name); // Error
}

void M()
{
   int name = 0;
   void Local1() => Use(name); // OK
   static void Local2() => Use(name); // Error
}

Also, methods can shadow fields with parameters and locals regardless of being static.

jnm2 commented 6 years ago

@HaloFour To me, shadowing is a feature. I occasionally shadow deliberately in order to prevent uncontrolled access to a field; var controlledAccess = Interlocked.Exchange(ref this.controlledAccess, null); for example.

I've long been wishing I could declare local function parameters that shadow locals that would otherwise have been captured, in order to prevent their capture and accidental use out-of-band, while still capturing others or simply while declaring a fully-static helper method closer to where it is relevant.

jaredpar commented 6 years ago

@HaloFour

And, like most proposals that involve limiting functionality/syntax for the purpose of improving code generation and performance, why is this not suitably managed by an analyzer?

For me this has nothing to do with improving code generation, it's all about improving code readability.

Local functions are making it much more common to have top level functions which are several hundred lines long. The logic gets split between the top level method body and a number of local functions whose behavior is specific to the top level method. Before local functions such code would be split into a number of top level method bodies.and the helper functions would be needlessly polluting the containing types scope. Now the entirety of the logic can be contained with in the top level method body.

Such local functions are harder to review than they would be as a top level function. The silent capture of state can affect how they execute and must be accounted for. There is nothing inherently wrong with this and many times it's hugely beneficial to the developer. In the cases where no capture is done, or when the developer specifically wants to avoid it, the static modifier helps significantly with readability. It tells me as a reviewer that I can review this local function in isolation. No need to scan for subtle capture problems.

For small functions this really isn't that important. But in larger methods it can make a huge difference.

It does have the additional benefit of ensuring local functions are capture free and hence will be a tiny bit more efficient as we don't even have to pass a struct closure around and have local indirection. If that were the primary motivation of this feature I'd be against it. Not worth the cost IMHO.

I'm much more opposed to the notion of the shadowing.

I'm still a bit on the fence about this. In the absence of a static modifier I'd be 100% against shadowing. It creates too much opportunity subtle behavior changes. Imagine you delete a parameter that shadows a local in the top level method, oops now you just introduced a capture you probably didn't intend.

Restricting shadowing to local functions marked as static though removes this problem and most of the issues around shadowing. It also makes local functions significantly more usable. I didn't really appreciate this problem until I started using local functions a lot in C#.

HaloFour commented 6 years ago

@alrz

You can conceptually map captures to instance fields

I don't think that's the intuition that most people have regarding captured locals.

Also, methods can shadow fields with parameters and locals regardless of being static.

Yes, but never without a way to explicitly refer to that shadowed field, and you're always required to explicitly qualify the field anytime such a local is in scope.

@jaredpar

Local functions are making it much more common to have top level functions which are several hundred lines long.

Sounds like a good argument for removing local functions from the language. 😉

But seriously, cramming hundreds of lines into a single method definition doesn't sound like a good idea, with or without local functions. In my opinion that's exactly when those chunks of logic should be top-level functions, specifically so that they can maintained and tested in isolation from one another.

I can't fathom a local function that's more than maybe 4-5 LOC, and I would count that against the LOC of the function in which its contained.

Such local functions are harder to review than they would be as a top level function.

Which is probably why they shouldn't be local functions, regardless of whether they capture. Locality isn't an excuse to hide a massive ball of logic behind a blackbox.

Restricting shadowing to local functions marked as static though removes this problem and most of the issues around shadowing.

The technical problems, maybe. But I still see the problem of having to mentally parse through the contents of the local function and upon seeing any given identifier having to double-back to the signature to see whether the lexical scoping has been reset. IPU forbid having to deal with that when nesting local functions (which I hope ain't becoming a thing.)

It's not that a might refer to multiple things, it's that you can't look at a single LOC within the body of a top-level function and have any idea what a is.

jaredpar commented 6 years ago

@HaloFour

But seriously, cramming hundreds of lines into a single method definition doesn't sound like a good idea, with or without local functions

I used to believe the same. Then I spent a bunch of time coding in F# which has local functions and gradually I opened up to the advantages it provides. There is value in keeping all the logic necessary for an operation located together.

Locality isn't an excuse to hide a massive ball of logic behind a blackbox.

The reverse is polluting a type with a number of methods that are either a) useful to a single method or worse b) only legal to be called from a specific method. Over times types grow and local functions help avoid polluting types with a number of members that simply aren't generally useful. When I type . into the editor I want to see the members that are useful to me, not some one time helper that I should never be calling.

It's not that a might refer to multiple things, it's that you can't look at a single LOC within the body of a top-level function and have any idea what a is.

That is true today without local functions. In addition to a parameter or local the value a may refer to instance field, instance member, static field, or static member all of which can come from some combination of the type hierarchy, containing types or members brought in by a using static directive. Lookup is anything but simple.

dangi12012 commented 6 years ago

This has MANY speed advantages. We could declare a static lambda like this:

Object slow = "Any Object Here"

Action f = () => Console.WriteLine(slow); //obj is shadowed and slow in critical paths

Action f = static () => Console.WriteLine(slow); //Error slow not declared
Action f = static (x) => Console.WriteLine(x); //We dont capture, we pass as argument, which does not need to to have the exact same reference, and we dont capture more than we need.
HaloFour commented 6 years ago

@dangi12012

The proposal above doesn't touch on lambdas, although they are probably worth bringing up. You really wouldn't be able to avoid the allocation of the delegate itself, particularly if the delegate instance escaped the method in which it was declared.

jaredpar commented 6 years ago

@HaloFour

This has MANY speed advantages.

Only for lambdas which this proposal doesn't currently cover. It's a natural extension to consider but the focus for now is on local functions. For local functions there really isn't a significant speed up static vs. non static in the general case.

popcatalin81 commented 6 years ago

I think this thread presents a better case for improved capturing mechanisms of local functions rather than static modifier.

Example: instead of capturing in a compiler generated class the following could be viable

This does not give perfect control to those who whish it, but makes refactoring a lot easier compared to static non static local function.

As a second benefit, even local functions that are not marked static will benefit performance wise, by eliminating captures when possible.

tannergooding commented 6 years ago

Static delegates are semi-related (both to the static lambda discussion above and to the static local functions here): https://github.com/dotnet/csharplang/blob/master/proposals/static-delegates.md

JamesNK commented 6 years ago

What about when you want to disable capturing but still have access to instance fields?

mgravell commented 6 years ago

@JamesNK I can see the merit of a "no capture, but instance" scenario... You're right in that this aspect of the proposal is being influenced - perhaps incorrectly - by the tempting convenience of that static keyword. Maybe that is the wrong keyword. But I'm at a loss for a better one.

Thinking pragmatically, perhaps static with the proposed semantic is fine - if you need it, just pass in this explicitly as an argument? Since a local function can't be overridden (polymorphism), virt-call seems an unnecessary step anyway.

svick commented 6 years ago

@popcatalin81

If a local function is not assigned to a delegate, a non-allocating capturing mechanism is already used. Though it's not done the way you suggest, instead, a struct with the state is created and passed to the local function by ref.

dangi12012 commented 6 years ago

@svick If it is implemented the way you say then there really is no need for a static Delegate. The only advantage would be that you cant create a heavyweight captured class with static by accident.

iam3yal commented 6 years ago

@mgravell

I can see the merit of a "no capture, but instance" scenario... You're right in that this aspect of the proposal is being influenced - perhaps incorrectly - by the tempting convenience of that static keyword. Maybe that is the wrong keyword. But I'm at a loss for a better one.

Let's just go through few of the options:

  1. Disallow this and pass it explicitly when you need it as noted in your post.
  2. Allow this and lose the ability to decide whether you need it but beyond that it might also confuse people.
  3. Introduce the concept of a capture list which was discussed and iirc was rejected.
  4. Use something like attributes but I think that it goes against the design of the language.
  5. Introduce a new keyword into the language but as far as I can tell people are struggling to come up with something that makes sense.

Looking at the options the first one looks like the most sensible approach to take.

JamesNK commented 6 years ago

What about when you want to disable capturing but still have access to instance fields?

So here is a real world example of where no capturing + shadowing of variables would be useful, while still having access to instance fields is required.

https://github.com/JamesNK/Newtonsoft.Json/blob/ab67fd558f7f4cbfb43550e1379931fe54f7c183/Src/Newtonsoft.Json/Linq/JObject.Async.cs#L64-L74

iam3yal commented 6 years ago

@JamesNK Good and all but what's the issue with something like this?

static async Task AwaitProperties(JPropertyKeyedCollection properties, Task task, int i, JsonWriter writer, CancellationToken cancellationToken, JsonConverter[] converters)
{
    // Do something with properties
}

AwaitProperties(_properties, ...)
mgravell commented 6 years ago

@eyalsk totally agree - the simplicity and obviousness of "1" (which is the original suggestion in the cross-reference issue) is a major selling point compared to all other options. Not perfect in every conceivably way, but a good balance of utility and simplicity.

popcatalin81 commented 6 years ago

I have two questions related to this issue:

1) What happens if IL generation / Jit is improved in the future and static local functions reasoning might go away? will this feature be left and then documented in the language as being there for historical reasons?

2) And (I hate this question deeply, but I'm going to ask it for clarification reasons) Why is an analyzer insufficient for the scenarios listed as arguments for static local functions?

iam3yal commented 6 years ago

@popcatalin81 I really think you're missing the point of this feature.

  1. I don't know what kind of improvements you expect to have and not sure why it's relevant but this should answer your question.

  2. Can you shadow a parameter with an analyzer? can an analyzer express your intention? can an analyzer help with scoping? no.

popcatalin81 commented 6 years ago

@eyalsk I'm not missing anything, this is a "nice" feature for a class of problems (no doubt here). What I'm debating is whether a "nice" feature is "enough" to add it to the language.

While I personally I'm, highly biased towards adding new features, I think it's important to not forget: Hard part of language design

And besides the niceties of the feature I want to debate the following:

I would very much prefer the C# compiler to make both of the following functions just as efficient using compiler and runtime optimizations, (as opposed to using static modifier and limiting possible optimisations for all potential local functions).

void FooBar(int a) {
    void Foo(int val) {
         Frob(val);
    }
    void Bar() {
        Frob(a);
    }
    // ...
    Foo(a)
    Bar(); // Should be just as effcient with some compiler smarts
}
iam3yal commented 6 years ago

@popcatalin81

I'm not missing anything, this is a "nice" feature for a class of problems (no doubt here). What I'm debating is whether a "nice" feature is "enough" to add it to the language.

Well, I guess we'll see what the design team have to say about it.

While I personally I'm, highly biased towards adding new features, I think it's important to not forget: Hard part of language design

I agree but this concerns any language and any feature, however, you previously stated the following:

What happens if IL generation / Jit is improved in the future and static local functions reasoning might go away?

And I asked you what kinds of improvements are you speaking about? because as far as I can tell they are orthogonal to each other.

Will it make refactoring harder? I say yes, somewhat, changing semantics between static local and non-static will involve parameter adding and removing gymnastics.

I don't know what "harder" means and I'm not sure it's relevant to the discussion because today you might even go as far as promoting local functions to a member of the class or a add it to some "utility" class so I'm not sure how this makes refactoring any "harder".

Is it going to be confusing? Yes, x% of developers don't care about the performance semantics implications all the way to this level.

That's subjective but to me and to few others as far as I can tell, it's really more of improving the readability of the code than performance gains.

Does it solve a problem which is unsolved by the language? No, it's just a static function.

Yes it does and it was discussed here before so I'm not sure why you deliberately ignore the discussion but more importantly the facts, I guess it has to do with being "biased". ;)

Is it future proof? No, Any future improvements to the async state machine or runtime Jit improvements might render the need for (non capturing) local functions absolete.

Again, you're going back to performance which is pretty odd because in the proposal itself they write this:

The proposal is to declare a local function as non-capturing. The keyword to indicate that would be static.

Note: if we do this, then we should also consider allowing parameters and locals of the local function to shadow parameters and locals from the containing method.

Did they mention the word performance? if anything, performance improvements might just be a side-effect.

p.s. Not sure if my tone is a bit off putting in this post so sorry if it is, was not my intention.

jaredpar commented 6 years ago

@popcatalin81

I would very much prefer the C# compiler to make both of the following functions just as efficient using compiler and runtime optimizations

Efficiency is probably the least interesting aspect of this proposal to me. Local functions are already very efficient as they don't generally require allocations. There is a small indirection for captured state but that's no different than accessing a field from this inside an instance method.

The advantage to static local functions is everywhere else:

Will it make refactoring harder? I say yes, somewhat, changing semantics between static local and non-static will involve parameter adding and removing gymnastics.

How is refactoring a static local function any harder than a normal local function?

Does it solve a problem which is unsolved by the language?

See above. It opens up the door to other language improvements.

No, Any future improvements to the async state machine or runtime Jit improvements might render the need for (non capturing) local functions absolete

Disagree strongly. As stated above, non-capturing functions provide a code readability benefit that will survive no matter what the runtime underneath does.

popcatalin81 commented 6 years ago

Ok I'm convinced, I can definitely see the value of something like the following:

void Foo(Bar bar, Baz baz)
{
     // Marshaling logic here

     [DllImport("foobar")]
     static void NativeFoo(IntPtr barPtr, IntPtr bazPtr);
}

This is really good for code organisation, just like local functions are for iterator methods for example. Also a good use for intrisics, again for code organisational purposes.

My standign remark is if this is used to improve the async state machine output (which it can). I think this is the wrong solution to that problem. The C# compiler should be able to generate efficient async code with or withouth a local metod being marked as static. Also marking a method static to enforce a behavior, in this case not capturing, is fixing the issue from the wrong angle. The right angle would be to actually improve the async state machine code generation to the point where state is never captured unnecesarily.

gafter commented 6 years ago

A program with a static modifier on a local function would behave identically to the same program without the static modifier.

jaredpar commented 6 years ago

@gafter

A program with a static modifier on a local function would behave identically to the same program without the static modifier.

That is not guaranteed by the language. Consider the following code:

class C { 
  static Action Go() { 
    void method() { Console.WriteLine("hello"); } 
    // other code 
    return method;
  }

Whether or not subsequent calls to Go produce equivalent delegates is undefined without static but concretely defined with static. The reasoning is that the language specifically does not specify how local functions are emitted to metadata. The compiler can, and usually will, emit them as statics in release but that's not guaranteed by the language. It's perfectly legal for the compiler to choose to emit them as instance members and say attach them to a closure created by the "other code". In that case subsequent calls to Go would produce delegates that were not equatable.

For static local functions the intent is to guarantee they are emitted as static members hence this behavior is concretely defined.

EDIT cleaned up the example

MgSam commented 5 years ago

Related Design Notes: https://github.com/dotnet/csharplang/blob/master/meetings/2018/LDM-2018-09-10.md

mgravell commented 5 years ago

Seems to be available in VS 16 preview 2, using <LangVersion>8.0</LangVersion>. Great work folks, thanks!

Happypig375 commented 5 years ago

Secret feature?? 🤤 Not mentioned in the announcements at all!

TKharaishvili commented 5 years ago

If you guys end up adding this feature to the language, maybe local extension methods are worth considering too? It's not a must have thing but it does sound like a nice to have thing.

The syntax would be similar to the classic extension methods:

bool CanPairUp(Token t1, Token t2)
{
    if (t1.IsKeywordOrIdentifier() && t2.IsKeywordOrIdentifier())
    {
        return false;
    }
    return true;

    bool IsKeywordOrIdentifier(this Token t)
    {
        return t.Kind == TokenKind.Keyword || t.kind == TokenKind.Identifier;
    }
}
jaredpar commented 5 years ago

@TKharaishvili local extension method sis something we've considered but at this point don't have firm plans for. I think we'd need to see compelling scenarios to add the complexity to member lookup here.

Extension methods solve a couple of problems:

  1. Adding behavior to a type that's not a part of it's declaration
  2. Making the behavior discoverable by allowing the behavior to have member level syntax

In the case of local functions there is no discovery issue. The behavior is right there in the same method that you are editing. Can't even use partial tricks to put it into a different file. Hence it would seem to not have as much value as normal extension methods.

TKharaishvili commented 5 years ago

@jaredpar

Adding behavior to a type that's not a part of it's declaration

The benefits of this being method chaining and the fact that it reads better. But as I said, this is not a must have. So if you guys simply don't have time for this that's totally cool with me.

NetMage commented 5 years ago

It seems like implementing this undoes most of the arguments against static local variables.

HaloFour commented 5 years ago

@NetMage

It seems like implementing this undoes most of the arguments against static local variables.

Could you clarify? Other than sharing part of the name static local functions aren't at all related to static locals.

juliusfriedman commented 5 years ago

No... but it's rather confusing at first glance until you think about how a regular static method is invoked and then you realize it doesn't matter, it just saves someone else from outside the function being able to easily call it from outside the scope and thus wouldn't pollute intellisense, as well as give you access to statics...

333fred commented 4 years ago

@zspitz if local extensions are something you want, I'd suggest commenting on that issue: https://github.com/dotnet/csharplang/issues/850. This is for a different feature.

zspitz commented 4 years ago

@333fred

This is for a different feature.

It seems to me to be an extension of local static methods, much as extension methods are in themselves an extension of methods.

I'd suggest commenting on that issue.

I've done so. Thanks.

Thaina commented 1 month ago

Are there any discussions about local property? Is it also included in local function?

So we can define getter and setter with one shared signature in the function

333fred commented 1 month ago

Iirc there have been other discussions about more local things, like types and properties. They haven't gone anywhere.