dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.97k stars 4.03k forks source link

Add rule to require local functions be at the end of a method #53613

Open ryzngard opened 3 years ago

ryzngard commented 3 years ago

When introduced, common patterns for local functions were agreed to be at the end of a function for clarity. While this is not documented anywhere and not required by the language, there are two steps I think we can take to help here:

  1. Update https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/coding-style.md to add location of local functions as an opinionated coding style guideline
  2. Provide an analyzer/fixer to aid in development and catching these problems.
ryzngard commented 3 years ago

I'm going to do my best to get this done this weekend. At the very least an analyzer

sharwell commented 3 years ago

I believe the rule described above is more strict than necessary. We should restrict it to only disallowing the presence of a statement "lexicographically after" a local function. This is almost equivalent to "after", with the exception that something like this would be allowed:

void Method()
{
  while (x)
  {
    Console.WriteLine(y());

    string y() { } // this is last lexicographically, but not last in looping code flow
  }
}
sharwell commented 3 years ago

The other option me is just saying "last in a block". This approach is the most lenient; I would lean towards this approach until we find evidence that it is insufficient in practice.

This shows the difference:

void Method()
{
  while (x)
  {
    Console.WriteLine(y());

    string y() { }
  }

  Console.WriteLine("done"); // This is after y(), but not in the same block
}
ryzngard commented 3 years ago

@sharwell would you consider any separation of requirement for static and non-static local functions? It seems the benefit of having it "last in a block" would be for purposes of variable capture.

sharwell commented 3 years ago

would you consider any separation of requirement for static and non-static local functions

Separate rules fails the small diff test (small change of adding/removing static produces a large diff of moving an entire function), making it undesirable for code style recommendations.

ryzngard commented 3 years ago

Related https://github.com/dotnet/roslyn/issues/31140

mavasani commented 3 years ago

This is a code style rule that should go into the CodeStyle NuGet package in Roslyn as an IDExxxx analyzer: https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/language-rules

NN--- commented 3 years ago

It is possible to make a distinction between expression block functions and brace defined functions ? I find using local functions instead of lambda since they are more efficient and less cumbersome. For this use-case I would like to have them where they are defined.

Old C#

Func<int, int> f = a => a + 1;

return f(1);

Current C#:

int f(int a) => a + 1;

return f(1);

Moving f to the bottom makes code less readable.

ryzngard commented 3 years ago

It is possible to make the distinction, but I'm not sure we want to. In the simple case it could be slightly less readable, but there are definite cases where local functions could be expression blocks and still belong at the bottom. The example in documentation I think fits this.

public static int LocalFunctionFactorial(int n)
{
    return nthFactorial(n);

    int nthFactorial(int number) => number < 2 
        ? 1 
        : number * nthFactorial(number - 1);
}
ryzngard commented 3 years ago

At most, maybe a separate rule for both? That seems a little cumbersome, but does provide configuration

CyrusNajmabadi commented 3 years ago

I don't see a distinction between expression block and body block members. i'd prefer they all go at the end.

NN--- commented 3 years ago

The rule can be more strict, expression block fits in one line. My point is that replacing local lambda which is defined in the top could be done with a local function defined in the top for several cases.

CyrusNajmabadi commented 3 years ago

I would still prefer they go at the end. :)

ryzngard commented 3 years ago

The rule can be more strict, expression block fits in one line. My point is that replacing local lambda which is defined in the top could be done with a local function defined in the top for several cases.

This does break the small diff test that @sharwell pointed out, but I tend to agree that being able to rely on them being in the same place is worth it. It seems likely the only time you'd be converting a Func/Action variable to local function is to upgrade to modern practices or for some other reason. I don't know if it's common enough to warrant caring about a small diff here.

sharwell commented 3 years ago

This is a code style rule that should go into the CodeStyle NuGet package in Roslyn as an IDExxxx analyzer

This is specific to rules that have broad applicability in a known specific form. I'm not sure the semantics for this rule are clear enough at this stage to roll it out at the CodeStyle layer scale (or more specifically, I do not believe we have enough information to make a claim of best general applicability of a specific form). In the interim, Roslyn.Diagnostics.Analyzers is an appropriate place for code style rules developed specifically for use in Roslyn.sln.

tmat commented 3 years ago

It is not clear to me that declaring the local function at the end of the method adds more clarity. It doesn't in my opinion.

The function should be declared close to its usage, where it makes sense. If a function is needed in the middle of a rather large method it might make it less readable if I need to find the end of the method to understand what the function does. Essentially the same argument @NN--- made above when comparing the function to lambda. Local function should not be treated differently then lambda in terms of placement. The only difference is that it declares name.

Furthermore, when the function captures local variables moving it closer to the variables it should capture is much better since it reduces the set of variables it can capture potentially increasing the clarity by preventing unwanted captures.

@sharwell's suggestion of placing it at the end of the lexical scope is better, but I think it still adds constraint that might be detrimental to clarity.

I don't see a rule yet that can be easily automatically checked and address all possible cases, hence -1 for me on this proposal at this point.

mavasani commented 3 years ago

In the interim, Roslyn.Diagnostics.Analyzers is an appropriate place for code style rules developed specifically for use in Roslyn.sln.

Please do not do this. We have gone down this path for style rules and this has just led to double the work. If we are unsure about this rule’s final semantics, please add it as a disabled by default rule to CodeStyle package.

sharwell commented 3 years ago

Note that I'm only in favor of this rule as an addition to Roslyn.Diagnostics.Analyzers. At this time I would likely prefer to not add the rule at all than add it to the CodeStyle package. (Also, not implementing the rule at all would be an acceptable final resolution for me. I see it as potentially valuable for Roslyn but not essential.)

mavasani commented 3 years ago

Why can’t the rules be added to the experimental rules in CodeStyle package, especially if they are disabled by default: https://github.com/dotnet/roslyn/blob/ec4b2812b26291545bfac284b9898828959c0538/src/Analyzers/Core/Analyzers/IDEDiagnosticIds.cs#L174

That has already been decided as the path forward for experimental style analyzers in the design meetings.

I also want to add the fact that so many people have differing, but strong opinions on this indicates an analyzer with build time enforcement is indeed warranted. Otherwise, it leads to very bad PR experience for contributors where they are asked to follow different style in different PRs based on who is reviewing the PR. This is exactly the case that build enforcement of style analyzers provide a critical value add.

tmat commented 3 years ago

BTW, this example demonstrates my point on locality: https://github.com/dotnet/roslyn/blob/d4e8ed5bc68dd4d3bb374ee73fdaf5d80ab6e2e3/src/Features/Core/Portable/EditAndContinue/AbstractEditAndContinueAnalyzer.cs#L3177-L3182

Moving this local function to the end of the containing method or even block would be significantly worse.

ryzngard commented 3 years ago

BTW, this example demonstrates my point on locality:

I'm not sure I agree there, especially because the naming of the method is very clear. I don't think it's detrimental to move to the end of the method here, or even extract to a private static on the type. It's still valuable to me to depend on placement of local functions in the method (or block if we decide that). If context is greatly lost that might be an indicator the method itself could be broken up.

tmat commented 3 years ago

@ryzngard You don't think it's detrimental because you are human and perhaps you understand completely what the method is doing by just reading its name. But there could be a case or a detail that might be important where you want to see what the method is doing. Maybe the name only makes sense within the very local context of where the local function is defined and used. I don't see how an analyzer can determine that.

If context is greatly lost that might be an indicator the method itself could be broken up.

Maybe the method can be broken up or maybe it would be complicated to break it up. Again I don't see how an analyzer can determine that.

I don't think it's a good idea to base these decisions on a style rule. There are other considerations that are more important.

CyrusNajmabadi commented 3 years ago

I agree that moving it actually seems desirable here. It's akin to it just being a sibling helper method. Indeed, that's what i always think of local functions as. Just helper methods that would normally come after, but which we're scoping into the current method to not leak them into anything else. in that regard, i would always want them always before or always after the code (and i just prefer always after stylistically as they tend to be helpers, so they're the things i want to understand after assessing the actual code of the method itself). Being interspersed just makings this much less clear (including in this example) for me as i have to go figure out what this does and context switch in/out of the main code. I wouldn't do that as sibling methods, and i like it even less when it's now required to figure out where the main code flow continues back up.

CyrusNajmabadi commented 3 years ago

I don't think it's a good idea to base these decisions on a style rule.

That's fine. But then i will reiterate my position that i want some mechanism to separate out method code vs helper code and to know that the main method code is 'done' and i don't have to worry about things interspersed.

ryzngard commented 3 years ago

You don't think it's detrimental because you are human and perhaps you understand completely what the method is doing by just reading its name

Sure, but that's kind of the point of style rules right? They aren't supposed to change the meaning of the code, just how it's presented to humans. In most cases I've seen local function usage, I prefer them at the end as helper code. If something is a good candidate to be contextual within a certain block of code maybe that's also a reason to either conditionally suppress the warning or convert to an explicit delegate. Style rules will likely never be correct 100% of the time, they exist to have a stance that most of the time this is the preference. We as humans can always say that a rule is incorrect for our usage, we have the power over the machine at this time.

tmat commented 3 years ago

I agree that moving it actually seems desirable here. It's akin to it just being a sibling helper method. Indeed, that's what i always think of local functions as. Just helper methods that would normally come after, but which we're scoping into the current method to not leak them into anything else.

That's not what all scenarios local functions are used for though. They are not always equivalent to siblings helper methods. I don't usually care about the local functions leaking to some other methods within the type. That's a non-issue for me.

tmat commented 3 years ago

Being interspersed just makings this much less clear (including in this example) for me as i have to go figure out what this does and context switch in/out of the main code

I guess that depends on the person. It is clearer to me.

tmat commented 3 years ago

That's fine. But then i will reiterate my position that i want some mechanism to separate out method code vs helper code and to know that the main method code is 'done' and i don't have to worry about things interspersed.

If you're using local functions in the manner that you are describing I have no problem with placing such local functions at the end of the method. What I do have problem with is that this is not the only way local functions are used, so it shouldn't be forced by a style rule.

tmat commented 3 years ago

If something is a good candidate to be contextual within a certain block of code maybe that's also a reason to either conditionally suppress the warning or convert to an explicit delegate.

Conditionally suppressing the message makes the code very cluttered, which goes against the point of making code more readable. Explicit delegate is not equivalent - it changes the semantics/performance characteristics.

tmat commented 3 years ago

@CyrusNajmabadi

so they're the things i want to understand after assessing the actual code of the method itself

I don't think that's the case for all usages of local functions though. I view local functions as any other variable - I'm just naming a lambda instead of a value.

Consider the following code:

var foo = CalculateFoo();
DoStuff(foo);

Is the following better (if it was valid C# code)?

DoStuff(foo);
...
var foo = CalculateFoo();

If second is not better for an expression why is it better for local function?

Comparing local functions to lambda:

int foo() => ...;
DoStuff(foo);

is in my view the "same" as

var foo = new Func<int>(() => ...);
DoStuff(foo);

Again, I don't think this would be better in all cases (in those cases where I want the reader to understand what foo is doing before reading the rest of the code):

DoStuff(foo);

var foo = new Func<int>(() => ...);
ryzngard commented 3 years ago

Explicit delegate is not equivalent - it changes the semantics/performance characteristics.

Is that true with the use of static lambdas?

tmat commented 3 years ago

Is that true with the use of static lambdas?

Yes. They don't allocate a delegate instance if they are not converted into delegate.

tmat commented 3 years ago

Essentially, all this boils down to the significance the that author of the code is giving to the local function. As an author my intention can either be: 1) It's a helper function, the reader doesn't need to understand what it is doing at this point 2) It's important or useful to know what the function is doing at this point (or what local variables it has access to)

[1] Declare your local function at the end of the method, or make it a sibling method. Either works (for static local functions). It's more problematic for those that capture variables. [2] Declare it before it's used or at the place where it has access to variables it is intended to capture.

Let's leave it up to the author of the code which one to use.

mavasani commented 3 years ago

I feel the discussion in this issue is digressing from whether or not we need an analyzer for this to whether or not personal preference is towards having an analyzer and what should be the enforcement kind, if at all. As stated earlier, I want to emphasise that this needs a code style analyzer due to following reasons:

  1. Placement of local function is a code style aspect.
  2. We have at least a few set of developers who prefer a specific style, and would like enforcement if they were owning the repo.
  3. Roslyn is probably not a good candidate for dogfooding this rule as there is a set of developers working on it who prefer no enforcement and would prefer a free form approach for it. However , that does not justify not implementing this as a style analyzer for other repos. For example, I would be very happy to turn this analyzer on in Roslyn-analyzers repo, regardless of the final chosen style to enforce.
tmat commented 3 years ago

@mavasani Makes sense - this issue, in its description conflates these two aspects. [1] in the description is what you are saying, so perhaps file another issue to track just that?

ryzngard commented 3 years ago

The proposal of the issue was intended to enable this as part of our code guidelines, but I'm happy to start with adding the analyzer and not enabling in Roslyn. There are still style questions for the Roslyn repo though. Should we continue that discussion here or file a different issue?

sharwell commented 3 years ago

Placement of local function is a code style aspect.

Placement of local functions is a code style aspect for Roslyn.sln. It might eventually apply in a broader sense, but we don't have a style consensus from Compilers/LDM/Runtime to make this a code style for all projects, even in experimental form.

Also note that there is an extremely long delay (months) between a feature getting implemented in CodeStyle and its use in Roslyn.sln (we must wait for the feature to appear in a subsequent public release). Roslyn.Diagnostics.Analyzers is the mechanism we use for iteration/rapid dogfood.

galakt commented 6 months ago

@sharwell @ryzngard Could you please summarize this ticket? Will code style rule be implemented? At what place? Do we need review from someone? Do you have timeline? Do you need any help (from community or volunteers)?

ryzngard commented 6 months ago

@sharwell @ryzngard Could you please summarize this ticket? Will code style rule be implemented? At what place? Do we need review from someone? Do you have timeline? Do you need any help (from community or volunteers)?

This has no current momentum. @sharwell did this ever go through design review? I'm assuming that's what is blocking this from being up for grabs.

sharwell commented 6 months ago

I don't think it did