dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.85k stars 4.01k forks source link

Bug in Data Flow Analysis on FixedStatementSyntax #60402

Open KrisVandermotten opened 2 years ago

KrisVandermotten commented 2 years ago

Version Used:

Observed in Roslyn 3.8.0 and later versions. The problem may have existed before 3.8.0 as well.

Steps to Reproduce:

Consider the following code:

class C
{
    unsafe void M()
    {
        byte[] buffer = new byte[1024];

        fixed (byte* ptr = buffer)
        {
        }
    }
}

Run dataflow analysis on the fixedStatementSyntax.Declaration.Variables[0].Initializer.Value, i.e. the expression buffer.

That is, assuming semanticModel is a SemanticModel for that syntax tree, execute

var dfa = semanticModel.AnalyzeDataFlow(fixedStatementSyntax.Declaration.Variables[0].Initializer.Value);

Expected Behavior:

dfa.ReadInside.Length == 1 && dfa.WrittenInside.Length == 0

Actual Behavior:

dfa.ReadInside.Length == 0 && dfa.WrittenInside.Length == 1

Clearly, the variable buffer is being read: it is checked to be not null, and it is being derefenced to pin the array.

Why is dfa.ReadInside.Length == 0 ?

Why is dfa.WrittenInside.Length == 1 ? Why is it that data flow analysis reports that the variable buffer is being written?

wmjordan commented 2 years ago

Clearly, the variable buffer is being read: it is checked to be not null, and it is being derefenced to pin the array.

Why is dfa.ReadInside.Length == 0 ?

I think this is a bug either. buffer is obviously read.

Why is dfa.WrittenInside.Length == 1 ? Why is it that data flow analysis reports that the variable buffer is being written?

Writing could happen if we manipulate the pointer, for instance, *ptr = (byte)1. However, the snippet here did nothing to ptr.

jcouv commented 2 years ago

This is wrong indeed. Two things seem wrong in the OP scenario: 1. we're not registering a read to the local buffer from fixed initializer, 2. we're registering a write to it instead. It also affects definite assignment (we're missing a definite assignment error, see test below).

Here's the spec on fixed statement,

A fixed_pointer_initializer can be one of the following: [...]

  • An expression of an array_type with elements of an unmanaged type T, provided the type T* is implicitly convertible to the pointer type given in the fixed statement. In this case, the initializer computes the address of the first element in the array, and the entire array is guaranteed to remain at a fixed address for the duration of the fixed statement. If the array expression is null or if the array has zero elements, the initializer computes an address equal to zero.

Tagging @jaredpar @vsadov to advise since they may have some context (relates to PR https://github.com/dotnet/roslyn/pull/20548).

Relevant code is DefiniteAssignment.VisitFixedLocalCollectionInitializer and AbstractFlowPass.VisitAddressOfOperand.

        [Fact, WorkItem(60402, "https://github.com/dotnet/roslyn/issues/60402")]
        public void TODO2()
        {
            var source = @"
class C
{
    unsafe void M()
    {
        byte[] buffer;

        fixed (byte* ptr = /*<bind>*/ buffer /*</bind>*/ ) // TODO2 why no definite assignment error for reading from `buffer`?
        {
        }
    }
}
";
            var comp = CreateCompilation(source, options: TestOptions.UnsafeDebugDll);
            comp.VerifyDiagnostics();
        }

        [Fact, WorkItem(60402, "https://github.com/dotnet/roslyn/issues/60402")]
        public void TODO2_2()
        {
            var source = @"
class C
{
    unsafe void M()
    {
        byte[] buffer = new byte[1024];

        fixed (byte* ptr = /*<bind>*/ buffer /*</bind>*/ )
        {
        }
    }
}
";

            var dataFlowAnalysisResults = CompileAndAnalyzeDataFlowExpression(source);
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.VariablesDeclared));
            Assert.Equal("buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.AlwaysAssigned));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.DataFlowsIn)); // TODO2 buffer not considered read
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.DataFlowsOut));
            Assert.Equal("this, buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.DefinitelyAssignedOnEntry));
            Assert.Equal("this, buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.DefinitelyAssignedOnExit));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.ReadInside)); // TODO2 buffer not considered read
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.ReadOutside));
            Assert.Equal("buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.WrittenInside)); // TODO2 why is it considered written inside?
            Assert.Equal("this, buffer, ptr", GetSymbolNamesJoined(dataFlowAnalysisResults.WrittenOutside));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.Captured));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.CapturedInside));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.CapturedOutside));
        }

        [Fact, WorkItem(60402, "https://github.com/dotnet/roslyn/issues/60402")]
        public void TODO2_3()
        {
            var source = @"
class C
{
    unsafe void M()
    {
        byte buffer;
        byte* ptr = /*<bind>*/ &buffer /*</bind>*/;
    }
}
";
            var dataFlowAnalysisResults = CompileAndAnalyzeDataFlowExpression(source);
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.VariablesDeclared));
            Assert.Equal("buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.AlwaysAssigned));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.DataFlowsIn));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.DataFlowsOut));
            Assert.Equal("this", GetSymbolNamesJoined(dataFlowAnalysisResults.DefinitelyAssignedOnEntry));
            Assert.Equal("this, buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.DefinitelyAssignedOnExit));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.ReadInside)); // TODO2 not considered read inside
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.ReadOutside));
            Assert.Equal("buffer", GetSymbolNamesJoined(dataFlowAnalysisResults.WrittenInside)); // TODO2 but why is it considered written inside?
            Assert.Equal("this, ptr", GetSymbolNamesJoined(dataFlowAnalysisResults.WrittenOutside));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.Captured));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.CapturedInside));
            Assert.Null(GetSymbolNamesJoined(dataFlowAnalysisResults.CapturedOutside));
        }
wmjordan commented 2 years ago

Previous versions are also affected by this issue. I still have VS 2017 installed on my machine and I can reproduce this issue as well.

KrisVandermotten commented 2 years ago

@jcouv

Here's the spec on fixed statement,

Do note that this bug also surfaces with strings.

  • An expression of type string, provided the type char* is implicitly convertible to the pointer type given in the fixed statement. In this case, the initializer computes the address of the first character in the string, and the entire string is guaranteed to remain at a fixed address for the duration of the fixed statement. The behavior of the fixed statement is implementation-defined if the string expression is null.

It might be a good idea to add unit tests for those as well.