SteveGilham / altcover

Cross-platform coverage gathering and processing tool set for dotnet/.Net Framework and Mono
MIT License
498 stars 18 forks source link

AltCover hangs with dotnet.exe consuming 100% CPU when instrumenting assembly that uses ProtectedData API #173

Closed egraff closed 1 year ago

egraff commented 1 year ago

I have no clue what is going on, but I've tried to reduce the example I have to the smallest size possible.

Basically, if I try to run dotnet test -p:AltCover=true for a test project that references another project that contains code that uses certain parts of the System.Security.Cryptography.ProtectedData API, then AltCover hangs and never completes the instrumentation/preparation phase. Looking at the task manager shows a dotnet.exe process that consumes 100% of one CPU core.

The only output shown from the dotnet test command is this:

PS> dotnet test -p:AltCover=true
  Determining projects to restore...
  All projects are up-to-date for restore.
  SomeProject -> C:\projects\altcover-problem\SomeProject\bin\Debug\net48\SomeProject.dll
  SomeTestProject -> C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\SomeTestProject.dll
  Instrumenting files from C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\
  Writing files to C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\__Instrumented_SomeTestProject\
     => C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\AltCover.Monitor.dll
     => C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\SomeProject.dll
     => C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\SomeTestProject.dll

  Coverage Report: C:\projects\pexip-repos\mcu2\altcover-problem\SomeTestProject\coverage.xml

      C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\__Instrumented_SomeTestProject\SomeTestProject.dll
                  <=  SomeTestProject, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null
      C:\projects\altcover-problem\SomeTestProject\bin\Debug\net48\__Instrumented_SomeTestProject\SomeProject.dll
                  <=  SomeProject, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null

However, if I instead run the CLI tool:

......\packages\altcover\8.4.840\tools\net472\AltCover.exe -i .\bin\Debug\net48 -o .\bin\Debug\net48\instr

the command terminates without hanging!

System info:

OS: Windows 11 21H2 AltCover version: 8.4.840 .NET SDK version: 6.0.400

Detailed sources

My test setup has two projects in two different folders

Below are all the files from my test setup, reproduces exactly:

SomeProject\SomeProject.csproj:

<?xml version="1.0" encoding="utf-8"?>
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Library</OutputType>
    <TargetFramework>net48</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="System.Security.Cryptography.ProtectedData" Version="6.0.0" />
  </ItemGroup>
</Project>

SomeProject\SomeClass.cs:

namespace SomeProject
{
    using System.Security.Cryptography;

    public static class SomeClass
    {
        public static byte[] Protect(
            byte[] unprotectedData,
            DataProtectionScope scope = DataProtectionScope.CurrentUser
        )
        {
            return ProtectedData.Protect(unprotectedData, null, scope);
        }
    }
}

SomeTestProject\SomeTestProject.csproj:

<?xml version="1.0" encoding="utf-8"?>
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Library</OutputType>
    <TargetFramework>net48</TargetFramework>
    <TestProjectType>UnitTest</TestProjectType>
    <IsTestProject>true</IsTestProject>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="altcover" Version="8.4.840" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\SomeProject\SomeProject.csproj" />
  </ItemGroup>
</Project>

I'm also attaching the sources as a ZIP: altcover-problem.zip

and the compiled SomeTestProject as a ZIP: net48.zip

SteveGilham commented 1 year ago

Whatever it is, it's something runtime dependent, in terms of the system assemblies being referenced.

The issue reproduces cleanly under dotnet SDK 6.0.403 : the process spins at that point, whereas the .Net Framework tool completes promptly.

Running at dotnet SDK v 7.0.100, the dotnet test process instead dies fairly quickly with a stack overflow from the depths of Mono.Cecil which is performing some type resolution while trying to write the SomeProject assembly.

There is a known issue that the netstandard2.0 build of Mono.Cecil, used by the dotnet test integration looks to the dotnet runtime to resolve system assemblies, whereas the net40 build, as used by the net472 AltCover tool, goes to the GAC.

In this case, however, giving the .Net Framework dependencies explicitly via e.g. --dependency=C:\WINDOWS\Microsoft.NET\assembly\GAC_MSIL\System.Security\v4.0_4.0.0.0__b03f5f7f11d50a3a\System.Security.dll doesn't make any difference; presumably that's probably because the the type resolution has already found the relevant types in the dotnet runtime assemblies, so doesn't look to fallback locations.

Setting the test project to be net6.0 rather than net48 does run through dotnet test, with just the usual compatibility warning for linking net6.0 tests against net48 systems under test, even though the subject SomeProject is unchanged throughout.

My next step will be turning this into a simple test case for Mono.Cecil and, unless that points at a blindingly obvious workround that can be done outside that library, raising an issue there.

SteveGilham commented 1 year ago

Trying this morning to build a repro case for Cecil, I find that using a different computer, the original dotnet test -p:AltCover=true issue doesn't reproduce using either 6.0.403 or 7.0.100 SDKs; I'm not sure what is different about the two environments to have this effect.

SteveGilham commented 1 year ago

Further investigation on the computer that does repro the issue suggests to me that exactly which dotnet runtimes are installed may explain the different behaviours. At base, the issue is to do with resolving types in a net48 project in a netcoreapp2.0 but rolled forwards context - exactly which runtime that is will have a knock-on effect on which versions of system assemblies get brought into the type resolution process.

This is also why altcover.exe, or a net6.0 test assembly with dotnet test both work, since in both cases the compatible Cecil resolution strategy "just works" without getting into some unwitting circular dependency.

SteveGilham commented 1 year ago

TIL that System.Security is added as an implicit dependency to projects -- it may not show on the Solution Explorer view under project > dependencies > assemblies, but if you go to add a new dependency, you'll see it checked under Framework Assemblies.

If you make that dependency explicit, by adding

  <ItemGroup>
    <Reference Include="System.Security">
      <Private>True</Private>
    </Reference>
  </ItemGroup>

to your test project, then dotnet test -p:AltCover=true works, because the consistent version of System.Security is present where Cecil's assembly resolution strategy looks first, so it doesn't have to start guessing about which system runtime to use and then getting lost.

SteveGilham commented 1 year ago

The first difference with the machine that doesn't repro the issue is that one still has the Microsoft.NETCore.App 2.1.30 runtime installed; where as the machine that repros this issue has nothing older than 3.1. In the 2.1 environment, it turns out that Cecil doesn't have to go looking for the System.Security types when writing the assembly, so the whole circular look-up through available runtimes is avoided. I've not investigated the flow through the whole Cecil process to find out why, though.

Looking at the System.Security assemblies in the different runtimes, they all share the full name System.Security, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, so would satisfy the lookup; but the 3.1 runtimes all refer to System.Security.Cryptography.ProtectedData, Version=4.0.5.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a rather than System.Security.Cryptography.ProtectedData, Version=0.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a as at 2.1, 5.0 and later. I suspect that this non-wildcarded version may be ultimately responsible for the issue.

There is a tacit assumption in how AltCover manages the assembly resolution issue, in that it's expected that a .Net Framework build will drag all the dependencies into the build output, so that Cecil-on-dotnet's approach of seeking dependencies by

  1. look in the same folder as the current assembly
  2. if not found look in the current dotnet runtime
  3. if not found raise a resolve failure event (at which point AltCover goes looking in --dependency entries and the nuget cache)

will stop at step 1. Here, as with issue #156 we have a case where the tacit assumption is violated; but here because of name collision of System.Security with the current runtime, the process stalls at step 2.

SteveGilham commented 1 year ago

With the - now out of support - .net core 3.1 runtime removed from the machine that showed the issue, the issue no longer reproduces.

So solutions are

  1. remove the out-of-support 3.1 runtime
  2. get ahead of the 3.1 runtime in the search order by installing .net core 2.1
  3. use the framework command-line tool on framework binaries