dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.04k stars 4.03k forks source link

C# compiler hangs in V17.7.2 but works in V17.6.4 #69697

Open vsfeedback opened 1 year ago

vsfeedback commented 1 year ago

This issue has been moved from a ticket on Developer Community.


[severity:I'm unable to use this version] [regression] [worked-in:17.6.4] When compiling the project "DoesntCompile" the C# compiler goes into an infinite loop using CPU time.

If some of the clauses in the expression in DoesntCompileMethod are removed to simplify it, it then compiles

I've attached the source code in compilerbug.zip

I've reproduced this on two different machines


Original Comments

Feedback Bot on 8/24/2023, 01:02 AM:

(private comment, text removed)


Original Solutions

(no solutions)

sharwell commented 1 year ago

@CyrusNajmabadi The C# parser appears to be struggling in LanguageParser.IsPossibleLambdaExpression(Precedence) with the following code:

namespace DoesntCompile
{
    public class DoesntCompileClass
    {
        public List<I> A { get; set; }

        public void DoesntCompileMethod()
        {
         var x =A[0].B?.C?.D == 0 &&
                A[0].B?.C?[0].D.E == F.G &&
                A[0].B?.C?[1].D.E == F.G &&
                A[0].B?.C?[2].D.E == F.G &&
                A[0].B?.C?[0].D.E == 1 &&
                A[0].B?.C?[1].D.E == 2 &&
                A[0].B?.C?[2].D.E == 3 &&
                A[0].B?.C?[0].D.E?.F.G == 4 &&
                A[0].B?.C?[1].D.E?.F.G == 5 &&
                A[0].B?.C?[2].D.E?.F.G == 6 &&
                A[0].B?.C?[0].D.E?.F == G.H &&
                A[0].B?.C?[0].D.E?.F?.G == 7 &&
                A[0].B?.C?[0].D.E?.F?[0].G.H == 8 &&
                A[0].B?.C?[0].D.E?.F?[1].G.H == 9 &&
                A[0].B?.C?[1].D.E?.F == G.H &&
                A[0].B?.C?[1].D.E?.F?.G == 10 &&
                A[0].B?.C?[1].D.E?.F?[0].G.H == 11 &&
                A[0].B?.C?[2].D.E?.F == G.H &&
                A[0].B?.C?[2].D.E?.F?.G == 12 &&
                A[0].B?.C?[2].D.E?.F?[0].G.H == 13;
        }
    }
}
CyrusNajmabadi commented 1 year ago

It looks like thsi got slower with the work to distinguish a?[b] as a conditional access expression vs a ? [b] : [c] as a conditional expression with collection expressions. Note: i do believe this terminates, it's just that each successive expression causes speculation time to double.

will see if i can do something to help terminate the speculation early.

CyrusNajmabadi commented 1 year ago

That said, this is a highly pathlogical pattern, which can be trivially avoided by breaking up the expression. So we could just accept this and say: don't write code like that.

Similar to nested lambdas, sometimes compile times are just long.

CyrusNajmabadi commented 1 year ago

Note: this does complete, it just takes 2.6 minutes. Looking to see if there are mitigations.

CyrusNajmabadi commented 1 year ago

So here's the challenge. When we have:

         var x =A[0].B?.C?.D == 0 &&
                A[0].B?.C?[0]

At this point we're not sure if [0] is a conditional index, or if it's some 'true' expression for a ? : expression that starts with a collection expression. So we try to see if its the latter, and if we then see a : which would mean we want the latter interpretation. However, in this case, the entire remainder of the the expression (all the way to the == 13; part can be consumed as a continuation. For example, if it was followed by a : then that would change the interpretation of the parse.

Unfortunately, this is n^2 because we we do each speculation on ?[, we continue to hit nested cases of that which require similar lookahead to see which cases the nested ones are in, and so on and so forth.

One option we have if we actually felt this was important to fix would be some sort of cache of evaluated position to "what was our conclusion on this the last time we hit this". However, that's a lot of complexity that i would prefer to avoid if possible.

CyrusNajmabadi commented 1 year ago

tagging @jaredpar for thoughts here on next steps. The above construct does demonstrate a n^2 issue with speculative parsing. However, i also think it's just pathological, and can be trivially worked around by simply parenthesizing pieces as appropriate. As this does not look like realistic user code, and because it needs to be so complex to get to the point where the perf is very bad, my recommendation is that we do not make any changes here.

sharwell commented 1 year ago

@CyrusNajmabadi Much of the overhead in local performance testing was creating and storing diagnostics when the parser created missing tokens. If we can avoid creating the missing tokens/diagnostics during speculative lookahead, it might resolve this issue.

CyrusNajmabadi commented 1 year ago

@sharwell thanks. i'll look into that.

CyrusNajmabadi commented 1 year ago

Even avoiding the issue that causes diagnostics, the performance is still bad unfortunately.

paulhickman-a365 commented 1 year ago

It is real code - it was in an Assert.IsTrue() to validate a data structure had the expected properties in a unit test. However, it is easy to split into many assertions.

huoyaoyuan commented 1 year ago

I remember back in C# 6, delegate?() was rejected for performance of the same pattern.