dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.7k stars 3.98k forks source link

.NET 9 `Preview.5`, `Preview.6`, `Preview.7` C# compiler crashes on LINQ expression with string interpolation #74163

Open SergeiPavlov opened 1 week ago

SergeiPavlov commented 1 week ago

Version Used: .NET 9 SDK 9.0.100-preview.5.24307.3 tried also Preview.6 and today's Preview.7

Steps to Reproduce: Following code:

using System.Linq;

class Program
{
    static IQueryable<string> GetStrings() => null;

    public static void Main() =>
        GetStrings().Select(o => $"{o} {o} {o} {o}");     // Crashes with 4 or more `{o}`
}

with .csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <LangVersion>Preview</LangVersion>
    <TargetFramework>net9.0</TargetFramework>
  </PropertyGroup>
</Project>

Diagnostic Id:

C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.RuntimeIdentifierInference.targets(314,5): message NETSDK1057: You are using a preview version of .NET. See: https://aka.ms/dotnet-support-policy [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error : Process terminated. System.InvalidOperationException: Unexpected value 'Sequence' of type 'Microsoft.CodeAnalysis.CSharp.BoundKind' [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitExpressionWithoutStackGuard(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitInternal(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.Visit(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.Expressions(ImmutableArray`1 expressions) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitCall(BoundCall node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitInternal(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.Visit(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitConversion(BoundConversion node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitInternal(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.Visit(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.TranslateLambdaBody(BoundBlock block) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.VisitLambdaInternal(BoundLambda node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ExpressionLambdaRewriter.RewriteLambda(BoundLambda node, TypeCompilationState compilationState, TypeMap typeMap, Int32 recursionDepth, BindingDiagnosticBag diagnostics) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ClosureConversion.RewriteLambdaConversion(BoundLambda node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.ClosureConversion.VisitConversion(BoundConversion conversion) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithoutStackGuard(BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.BoundTreeVisitor.VisitExpressionWithStackGuard(Int32& recursionDepth, BoundExpression node) [D:\WORK\CS\CS.csproj]
C:\Program Files\dotnet\sdk\9.0.100-preview.5.24307.3\Roslyn\Microsoft.CSharp.Core.targets(85,5): error :    at Microsoft.CodeAnalysis.CSharp.BoundTreeRewriter.DoVisitList[T](ImmutableArray`1 list) [D:\WORK\CS\CS.csproj]
...

Expected Behavior: Successful build

Actual Behavior: Compiler crash (does work without <LangVersion>Preview</LangVersion> setting)

bernd5 commented 1 week ago

The string interpolation is handled internally as collection-expression conversion to a readonly span (a debug assert is also triggered). This is okay only in non expression trees.

Currently: System.String! System.String.Format(System.String! format, params System.ReadOnlySpan<System.Object?> args) is selected to handle the string interpolation. The reason is because the code is essentially lowered to:

using System.Collections.Generic;
using System.Linq;

class Program
{
    static IQueryable<string> GetStrings() => null;

    public static void Main() =>
        GetStrings().Select(o => string.Format("{0} {1} {2} {3}", o, o, o, o));
}

For expression trees we could lower it to (no ref struct allowed):

using System.Collections.Generic;
using System.Linq;

class Program
{
    static IQueryable<string> GetStrings() => null;

    public static void Main() =>
        GetStrings().Select(o => string.Format("{0} {1} {2} {3}", new object[] { o, o, o, o }));
}

But it would be maybe better to handle it in the invocation binding.

AlekseyTs commented 1 week ago

It sounds like a usage of a non-array params collection is making its way to an expression tree without triggering an error. DiagnosticsPass is supposed to detect the usage and report an error like:

                // (8,46): error CS9226: An expression tree may not contain an expanded form of non-array params collection parameter.
                //         Expression<System.Action> e1 = () => Test();
                Diagnostic(ErrorCode.ERR_ParamsCollectionExpressionTree, "Test()").WithLocation(8, 46),

The following method is supposed to detect the situation:

        public override BoundNode VisitCollectionExpression(BoundCollectionExpression node)
        {
            if (_inExpressionLambda)
            {
                Error(
                    node.IsParamsArrayOrCollection ?
                        ErrorCode.ERR_ParamsCollectionExpressionTree :
                        ErrorCode.ERR_ExpressionTreeContainsCollectionExpression,
                    node);
            }

            return base.VisitCollectionExpression(node);
        }

At the moment there is no clarity how exactly a collection expression is getting through. Understanding that should be the next step in instigation of this issue.

AlekseyTs commented 6 days ago

@bernd5

The string interpolation is handled internally as collection-expression conversion to a readonly span (a debug assert is also triggered).

Could you please provide more details?

The string interpolation is handled internally as collection-expression conversion to a readonly span

Where exactly this handling is happening?

a debug assert is also triggered

What does assert check? Where it is in the code? What the call stack looks like?

bernd5 commented 6 days ago

The collection-conversion comes to play at (here we lower and assume overload resolution to pick the right one): https://github.com/dotnet/roslyn/blob/3a3c108d86c3cbb00049bfc4f5f9a04a42fa4b1b/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringInterpolation.cs#L354

This is great because we have multiple overloads, with and without params. Overload resolution chooses the best one. And this allows the same code for less than 4 parameters, too.

The problem arises now for the params overload because we have 2 of them and the span overload is better.


What does assert check? Where it is in the code? What the call stack looks like?

It is here https://github.com/dotnet/roslyn/blob/3a3c108d86c3cbb00049bfc4f5f9a04a42fa4b1b/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_CollectionExpression.cs#L34

The callstack is:

>   Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.RewriteCollectionExpressionConversion(Microsoft.CodeAnalysis.CSharp.Conversion conversion, Microsoft.CodeAnalysis.CSharp.BoundCollectionExpression node) Zeile 34 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitConversion(Microsoft.CodeAnalysis.CSharp.BoundConversion node) Zeile 66  C#
    [Externer Code] 
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithoutStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 97   C#
    [Externer Code] 
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 92  C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpressionImpl(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 273 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpression(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 230 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitArgumentsAndCaptureReceiverIfNeeded(ref Microsoft.CodeAnalysis.CSharp.BoundExpression rewrittenReceiver, Microsoft.CodeAnalysis.CSharp.LocalRewriter.ReceiverCaptureMode captureReceiverMode, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.CSharp.BoundExpression> arguments, Microsoft.CodeAnalysis.CSharp.Symbol methodOrIndexer, System.Collections.Immutable.ImmutableArray<int> argsToParamsOpt, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.RefKind> argumentRefKindsOpt, Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder<Microsoft.CodeAnalysis.CSharp.BoundExpression> storesOpt, ref Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder<Microsoft.CodeAnalysis.CSharp.Symbols.LocalSymbol> tempsOpt, Microsoft.CodeAnalysis.CSharp.BoundExpression firstRewrittenArgument) Zeile 741  C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitCall.__visitArgumentsAndFinishRewrite|151_1(Microsoft.CodeAnalysis.CSharp.BoundCall node, Microsoft.CodeAnalysis.CSharp.BoundExpression rewrittenReceiver) Zeile 384 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitCall(Microsoft.CodeAnalysis.CSharp.BoundCall node) Zeile 341 C#
    [Externer Code] 
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithoutStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 97   C#
    [Externer Code] 
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 92  C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpressionImpl(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 273 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpression(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 230 C#
    Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitInterpolatedString(Microsoft.CodeAnalysis.CSharp.BoundInterpolatedString node) Zeile 362 C#
AlekseyTs commented 6 days ago

This is great because we have multiple overloads, with and without params. Overload resolution chooses the best one. And this allows the same code for less than 4 parameters, too.

Well, as we can see this approach is fragile and/or lacks proper validation of the result that it produces.

AlekseyTs commented 6 days ago

Other parts of LocalRewriter that utilize similar approach could be vulnerable too.

AlekseyTs commented 6 days ago

I think it would be good to understand what overloads String.Format were available in .Net8 and which of them the previous version of the compiler could possibly pick at https://github.com/dotnet/roslyn/blob/3a3c108d86c3cbb00049bfc4f5f9a04a42fa4b1b/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringInterpolation.cs#L354