dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19k stars 4.03k forks source link

[Regression from VS2013 compiler] Incorrect RVAs emitted for identical method bodies #2207

Closed mavasani closed 9 years ago

mavasani commented 9 years ago

1) Compile below C# code into a dll, i.e. csc /t:library temp.cs

// temp.cs
public class C
{
    private void Method1(int x)
    {
        System.Console.WriteLine(x);
    }

    private void Method2(int x)
    {
        System.Console.WriteLine(x);
    }
}

2) ildasm temp.dll and note the RVAs emitted for Method1 and Method2

Actual: Both methods have same start RVA dev14_start_rva

Expected: Both methods have distinct start RVAs. This is a regression from VS2013 compiler, where both methods do have distinct start RVAs: dev12_start_rva

See internal bug 1131168 for a regression in FxCop's Phoenix based dataflow engine (and subsequently all FXCop dataflow rules) due to this incorrect metadata emit.

mavasani commented 9 years ago

Looking more carefully, this may actually be an optimization done here: http://source.roslyn.io/#Microsoft.CodeAnalysis/PEWriter/MetadataWriter.cs,4098

Seems like we emit a single IL blob for both methods. So this might be an optimization to reduce binary size?

mavasani commented 9 years ago

Re-opening, closed by mistake. Do we want to do this optimization only for release/optimized builds?

VSadov commented 9 years ago

Yes, this is an optimization to remove redundant methods form PE. So far we have not seen reasons to do a different codegen for Debug here. It seems that FxCop is working fine here and the only effect is that the same error is not reported twice.

VSadov commented 9 years ago

If a duplicate error is desired for testing purposes, There are trivial workarounds - just make methods different. For example it can be -

var  i = input;
response.Write(i);
mavasani commented 9 years ago

@VSadov that is correct for simple FXCop rules. However, for FXCop/SDL dataflow rules based on Phoenix analysis engine, we end up skipping analysis completely for all methods, except one. This leads to incorrect call graph and hence incorrect analysis. I am working on trying to fix the Phoenix pereader to handle this case of multiple methods sharing same start RVA, so that such symbols can share identical function units for analysis and we generate a correct callgraph.

Yes, the SDL dataflow rule tests that regressed with new compiler can be fixed /changed, they are specifically validating that the errors are fired for each identical method, but the new behavior also seems reasonable as we will fire on at least one method. They are more concerned about broken dataflow/callgraph.

Feel free to resolve this as by design, I was just curious whether we un-intentionally enabled this analysis for debug builds.

VSadov commented 9 years ago

Ok. I am going to resolve as ByDesign. Let me know if a solution on your side is not possible. We can disable the optimization for /o- builds. The distinction is visible only when binary is interrogated through very low level APIs. Sharing method bodies is generally completely transparent to JITs or decompilers since they normally associate methods with their metadata and not with RVAs for their bodies.

mavasani commented 9 years ago

Yes, agreed. Phoenix PE reader uses the CLR MetadataImport APIs to fetch RVAs for symbols through this API: https://msdn.microsoft.com/en-us/library/ms233163(v=vs.110).aspx and goes on constructing functions units based on unique start RVAs, hence the issue.