Open SergeiPavlov opened 1 week ago
The string interpolation is handled internally as collection-expression conversion to a readonly span (a debug assert is also triggered). This is okay only in non expression trees.
Currently: System.String! System.String.Format(System.String! format, params System.ReadOnlySpan<System.Object?> args)
is selected to handle the string interpolation. The reason is because the code is essentially lowered to:
using System.Collections.Generic;
using System.Linq;
class Program
{
static IQueryable<string> GetStrings() => null;
public static void Main() =>
GetStrings().Select(o => string.Format("{0} {1} {2} {3}", o, o, o, o));
}
For expression trees we could lower it to (no ref struct allowed):
using System.Collections.Generic;
using System.Linq;
class Program
{
static IQueryable<string> GetStrings() => null;
public static void Main() =>
GetStrings().Select(o => string.Format("{0} {1} {2} {3}", new object[] { o, o, o, o }));
}
But it would be maybe better to handle it in the invocation binding.
It sounds like a usage of a non-array params collection is making its way to an expression tree without triggering an error.
DiagnosticsPass
is supposed to detect the usage and report an error like:
// (8,46): error CS9226: An expression tree may not contain an expanded form of non-array params collection parameter.
// Expression<System.Action> e1 = () => Test();
Diagnostic(ErrorCode.ERR_ParamsCollectionExpressionTree, "Test()").WithLocation(8, 46),
The following method is supposed to detect the situation:
public override BoundNode VisitCollectionExpression(BoundCollectionExpression node)
{
if (_inExpressionLambda)
{
Error(
node.IsParamsArrayOrCollection ?
ErrorCode.ERR_ParamsCollectionExpressionTree :
ErrorCode.ERR_ExpressionTreeContainsCollectionExpression,
node);
}
return base.VisitCollectionExpression(node);
}
At the moment there is no clarity how exactly a collection expression is getting through. Understanding that should be the next step in instigation of this issue.
@bernd5
The string interpolation is handled internally as collection-expression conversion to a readonly span (a debug assert is also triggered).
Could you please provide more details?
The string interpolation is handled internally as collection-expression conversion to a readonly span
Where exactly this handling is happening?
a debug assert is also triggered
What does assert check? Where it is in the code? What the call stack looks like?
The collection-conversion comes to play at (here we lower and assume overload resolution to pick the right one): https://github.com/dotnet/roslyn/blob/3a3c108d86c3cbb00049bfc4f5f9a04a42fa4b1b/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringInterpolation.cs#L354
This is great because we have multiple overloads, with and without params. Overload resolution chooses the best one. And this allows the same code for less than 4 parameters, too.
The problem arises now for the params overload because we have 2 of them and the span overload is better.
What does assert check? Where it is in the code? What the call stack looks like?
The callstack is:
> Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.RewriteCollectionExpressionConversion(Microsoft.CodeAnalysis.CSharp.Conversion conversion, Microsoft.CodeAnalysis.CSharp.BoundCollectionExpression node) Zeile 34 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitConversion(Microsoft.CodeAnalysis.CSharp.BoundConversion node) Zeile 66 C#
[Externer Code]
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithoutStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 97 C#
[Externer Code]
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 92 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpressionImpl(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 273 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpression(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 230 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitArgumentsAndCaptureReceiverIfNeeded(ref Microsoft.CodeAnalysis.CSharp.BoundExpression rewrittenReceiver, Microsoft.CodeAnalysis.CSharp.LocalRewriter.ReceiverCaptureMode captureReceiverMode, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.CSharp.BoundExpression> arguments, Microsoft.CodeAnalysis.CSharp.Symbol methodOrIndexer, System.Collections.Immutable.ImmutableArray<int> argsToParamsOpt, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.RefKind> argumentRefKindsOpt, Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder<Microsoft.CodeAnalysis.CSharp.BoundExpression> storesOpt, ref Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder<Microsoft.CodeAnalysis.CSharp.Symbols.LocalSymbol> tempsOpt, Microsoft.CodeAnalysis.CSharp.BoundExpression firstRewrittenArgument) Zeile 741 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitCall.__visitArgumentsAndFinishRewrite|151_1(Microsoft.CodeAnalysis.CSharp.BoundCall node, Microsoft.CodeAnalysis.CSharp.BoundExpression rewrittenReceiver) Zeile 384 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitCall(Microsoft.CodeAnalysis.CSharp.BoundCall node) Zeile 341 C#
[Externer Code]
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithoutStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 97 C#
[Externer Code]
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.BoundTreeRewriterWithStackGuard.VisitExpressionWithStackGuard(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 92 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpressionImpl(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 273 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitExpression(Microsoft.CodeAnalysis.CSharp.BoundExpression node) Zeile 230 C#
Microsoft.CodeAnalysis.CSharp.dll!Microsoft.CodeAnalysis.CSharp.LocalRewriter.VisitInterpolatedString(Microsoft.CodeAnalysis.CSharp.BoundInterpolatedString node) Zeile 362 C#
This is great because we have multiple overloads, with and without params. Overload resolution chooses the best one. And this allows the same code for less than 4 parameters, too.
Well, as we can see this approach is fragile and/or lacks proper validation of the result that it produces.
Other parts of LocalRewriter
that utilize similar approach could be vulnerable too.
I think it would be good to understand what overloads String.Format
were available in .Net8 and which of them the previous version of the compiler could possibly pick at https://github.com/dotnet/roslyn/blob/3a3c108d86c3cbb00049bfc4f5f9a04a42fa4b1b/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringInterpolation.cs#L354
Version Used: .NET 9 SDK 9.0.100-preview.5.24307.3 tried also Preview.6 and today's Preview.7
Steps to Reproduce: Following code:
with
.csproj
:Diagnostic Id:
Expected Behavior: Successful build
Actual Behavior: Compiler crash (does work without
<LangVersion>Preview</LangVersion>
setting)