stryker-mutator / stryker-net

Mutation testing for .NET core and .NET framework!
https://stryker-mutator.io
Apache License 2.0

Error when running on an EKS cluster #2488

Closed phmonte closed 8 months ago

phmonte commented 1 year ago

Describe the bug

I'm running Stryker inside an EKS container and I get the following error.

Logs

[22:55:28 INF] Logging enabled at level Trace
Version: 3.5.0

[22:55:29 VRB] CACHE https://api.nuget.org/v3/registration5-gz-semver2/dotnet-stryker/index.json
A new version of Stryker.NET (3.7.1) is available. Please consider upgrading using dotnet tool update -g dotnet-stryker

[22:55:29 DBG] Stryker started with options: {"MsBuildPath": null, "DevMode": false, "ProjectPath": "/azp/marketing", "IsSolutionContext": true, "WorkingDirectory": "/azp/marketing", "OutputPath": "/azp/marketing/StrykerOutput/2023-04-27.22-55-28", "ReportPath": "/azp/marketing/StrykerOutput/2023-04-27.22-55-28/reports", "ReportFileName": "mutation-report", "SolutionPath": "/azp/marketing/JSM.Marketing.sln", "TargetFramework": null, "LogOptions": {"LogToFile": true, "LogLevel": "Verbose", "$type": "LogOptions"}, "MutationLevel": "Basic", "Thresholds": {"High": 80, "Low": 60, "Break": 0, "$type": "Thresholds"}, "AdditionalTimeout": 5000, "LanguageVersion": "Default", "Concurrency": 1, "ProjectUnderTestName": "JSM.Marketing.Domain", "TestProjects": ["/azp/marketing/JSM.Marketing.Test"], "TestCaseFilter": "", "Reporters": ["Progress", "Html"], "WithBaseline": false, "BaselineProvider": "Disk", "AzureFileStorageUrl": "", "AzureFileStorageSas": "", "DashboardUrl": "https://dashboard.stryker-mutator.io", "DashboardApiKey": null, "Since": false, "SinceTarget": "master", "DiffIgnoreChanges": [], "FallbackVersion": "master", "ModuleName": "", "ReportTypeToOpen": null, "Mutate": [{"Glob": {"Tokens": [{"TrailingPathSeparator": {"Value": "/", "$type": "PathSeparatorToken"}, "LeadingPathSeparator": null, "$type": "WildcardDirectoryToken"}, {"$type": "WildcardToken"}], "$type": "Glob"}, "IsExclude": false, "TextSpans": [{"Start": 0, "End": 2147483647, "Length": 2147483647, "IsEmpty": false, "$type": "TextSpan"}], "$type": "FilePattern"}], "IgnoredMethods": [], "ExcludedMutations": [], "ExcludedLinqExpressions": [], "OptimizationMode": "CoverageBasedTest", "ProjectName": "", "ProjectVersion": "", "BreakOnInitialTestFailure": false, "$type": "StrykerOptions"}
[22:55:30 INF] Identifying projects to mutate in /azp/marketing/JSM.Marketing.sln. This can take a while.
[22:55:30 DBG] Analysing 5 projects
[22:55:30 DBG] Analysing /azp/marketing/JSM.Marketing.Domain/JSM.Marketing.Domain.csproj
[22:55:30 DBG] Analysing /azp/marketing/JSM.Marketing.Infrastructure/JSM.Marketing.Infrastructure.csproj
[22:55:30 DBG] Analysing /azp/marketing/JSM.Marketing.IoC/JSM.Marketing.IoC.csproj
[22:55:30 DBG] Analysing /azp/marketing/JSM.Marketing.Api/JSM.Marketing.Api.csproj
[22:55:31 DBG] Analysing /azp/marketing/JSM.Marketing.Test/JSM.Marketing.Test.csproj
[22:55:35 INF] Time Elapsed 00:00:06.2003890

Unhandled exception. System.AggregateException: One or more errors occurred. (Could not find build environment) (Could not find build environment) (Could not find build environment) (Could not find build environment) (Could not find build environment) ---> System.InvalidOperationException: Could not find build environment at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework, EnvironmentOptions options) at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build() at Stryker.Core.Initialisation.ProjectOrchestrator.<>cDisplayClass7_0.b0(IProjectAnalyzer project) at System.Threading.Tasks.Parallel.<>cDisplayClass33_0`2.b0(Int32 i) at System.Threading.Tasks.Parallel.<>cDisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.b1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica1.ExecuteAction(Boolean& yieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica.Execute() --- End of inner exception stack trace --- at Stryker.Core.Initialisation.ProjectOrchestrator.AnalyzeSolution(StrykerOptions options) at Stryker.Core.Initialisation.ProjectOrchestrator.MutateProjects(StrykerOptions options, IReporter reporters)+MoveNext() at System.Collections.Generic.List1..ctor(IEnumerable1 collection) at System.Linq.Enumerable.ToList[TSource](IEnumerable1 source) at Stryker.Core.StrykerRunner.RunMutationTest(IStrykerInputs inputs, ILoggerFactory loggerFactory, IProjectOrchestrator projectOrchestrator) at Stryker.CLI.StrykerCli.RunStryker(IStrykerInputs inputs) in /_/src/Stryker.CLI/Stryker.CLI/StrykerCLI.cs:line 93 at Stryker.CLI.StrykerCli.<>cDisplayClass10_0.b_0() in /_/src/Stryker.CLI/Stryker.CLI/StrykerCLI.cs:line 68 at McMaster.Extensions.CommandLineUtils.CommandLineApplication.<>cDisplayClass143_0.b0(CancellationToken ) at McMaster.Extensions.CommandLineUtils.CommandLineApplication.ExecuteAsync(String[] args, CancellationToken cancellationToken) at McMaster.Extensions.CommandLineUtils.CommandLineApplication.Execute(String[] args) at Stryker.CLI.StrykerCli.Run(String[] args) in /_/src/Stryker.CLI/Stryker.CLI/StrykerCLI.cs:line 74 at Stryker.CLI.Program.Main(String[] args) in /_/src/Stryker.CLI/Stryker.CLI/Program.cs:line 14

---> (Inner Exception #1) System.InvalidOperationException: Could not find build environment at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework, EnvironmentOptions options) at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build() at Stryker.Core.Initialisation.ProjectOrchestrator.<>c__DisplayClass7_0.b0(IProjectAnalyzer project) at System.Threading.Tasks.Parallel.<>cDisplayClass33_02.<ForEachWorker>b__0(Int32 i) at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.b1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica1.ExecuteAction(Boolean& yieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica.Execute()<---

---> (Inner Exception #2) System.InvalidOperationException: Could not find build environment at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework, EnvironmentOptions options) at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build() at Stryker.Core.Initialisation.ProjectOrchestrator.<>cDisplayClass7_0.b0(IProjectAnalyzer project) at System.Threading.Tasks.Parallel.<>cDisplayClass33_0`2.b0(Int32 i) at System.Threading.Tasks.Parallel.<>cDisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.b1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica`1.ExecuteAction(Boolean& yieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica.Execute()<---

---> (Inner Exception #3) System.InvalidOperationException: Could not find build environment at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework, EnvironmentOptions options) at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build() at Stryker.Core.Initialisation.ProjectOrchestrator.<>cDisplayClass7_0.b0(IProjectAnalyzer project) at System.Threading.Tasks.Parallel.<>cDisplayClass33_0`2.b0(Int32 i) at System.Threading.Tasks.Parallel.<>cDisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.b1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica`1.ExecuteAction(Boolean& yieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica.Execute()<---

---> (Inner Exception #4) System.InvalidOperationException: Could not find build environment at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework, EnvironmentOptions options) at Buildalyzer.Environment.EnvironmentFactory.GetBuildEnvironment(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build(String targetFramework) at Buildalyzer.ProjectAnalyzer.Build() at Stryker.Core.Initialisation.ProjectOrchestrator.<>cDisplayClass7_0.b0(IProjectAnalyzer project) at System.Threading.Tasks.Parallel.<>cDisplayClass33_0`2.b0(Int32 i) at System.Threading.Tasks.Parallel.<>cDisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- at System.Threading.Tasks.Parallel.<>c__DisplayClass19_01.b1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica`1.ExecuteAction(Boolean& yieldedBeforeCompletion) at System.Threading.Tasks.TaskReplicator.Replica.Execute()<---

rouke-broersma commented 1 year ago

The container image you're using has a dotnet installation that's configured in a non-standard way, which makes one of our dependencies (Buildalyzer) unable to detect the dotnet CLI and/or MSBuild.

Are you using a publicly available image?

phmonte commented 1 year ago

Is it possible to pass the address of msbuild/dotnet by parameter or environment variable?

Docker Image: https://hub.docker.com/repository/docker/phmonte/stryker-base/general

Dockerfile that generated the image:


# Use the official Ubuntu 20.04 LTS as the base image
FROM ubuntu:20.04

# Update the package lists
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y wget gnupg

# Install .NET 6 SDK
RUN wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb && \
    dpkg -i packages-microsoft-prod.deb && \
    apt-get update && \
    apt-get install -y apt-transport-https && \
    apt-get update && \
    apt-get install -y dotnet-sdk-6.0

# Install Docker
RUN apt-get install -y docker.io

# Install .NET Stryker
RUN dotnet tool install -g dotnet-stryker

# Install Azure CLI
RUN apt-get update && \
    apt-get install -y curl gnupg lsb-release && \
    curl -sL https://packages.microsoft.com/keys/microsoft.asc | \
        gpg --dearmor | \
        tee /etc/apt/trusted.gpg.d/microsoft.asc.gpg > /dev/null && \
    AZ_REPO=$(lsb_release -cs) && \
    echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | \
        tee /etc/apt/sources.list.d/azure-cli.list && \
    apt-get update && \
    apt-get install -y azure-cli

# Install the AWS CLI
RUN apt-get update && \
    apt-get install -y awscli

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Run the command when the container starts
CMD ["bash"]

rouke-broersma commented 1 year ago

That is not currently possible. However, I don't see any issues with your Dockerfile, and I don't know why Buildalyzer can't find dotnet. Just checking: is dotnet in your PATH?
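One way to answer the PATH question from inside the container is a small check like the sketch below. The helper name `on_path` is made up for illustration; it only wraps the shell built-in `command -v` and says nothing about how Buildalyzer itself locates dotnet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print where a command resolves from on PATH,
# or fail with a message on stderr if it cannot be found.
on_path() {
  local resolved
  if resolved=$(command -v "$1"); then
    echo "$1 -> $resolved"
  else
    echo "$1 is not on PATH" >&2
    return 1
  fi
}

# Inside the container one would run:
#   on_path dotnet
```

Running it in the same kind of non-interactive shell that the pipeline uses matters, since an interactive login shell may set up PATH differently.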

phmonte commented 1 year ago

Can you see any issues with the sdk and runtime path?

dotnet --list-sdks
6.0.408 [/usr/share/dotnet/sdk]
dotnet --list-runtimes
Microsoft.AspNetCore.App 6.0.16 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 6.0.16 [/usr/share/dotnet/shared/Microsoft.NETCore.App] 

dotnet --info
.NET SDK (reflecting any global.json):
 Version:   6.0.408
 Commit:    0c3669d367

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  20.04 
 OS Platform: Linux 
 RID:         ubuntu.20.04-x64
 Base Path:   /usr/share/dotnet/sdk/6.0.408/

global.json file:
  Not found

Host:
  Version:      6.0.16
  Architecture: x64
  Commit:       1e620a42e7

.NET SDKs installed:
  6.0.408 [/usr/share/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.16 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.16 [/usr/share/dotnet/shared/Microsoft.NETCore.App]      

Download .NET:
  https://aka.ms/dotnet-download

Learn about .NET Runtimes and SDKs:
  https://aka.ms/dotnet/runtimes-sdk-info

rouke-broersma commented 1 year ago

Nope, those look alright to me.

The dependency we use is https://github.com/daveaglick/Buildalyzer. It is very difficult to debug this kind of environment issue locally. Perhaps you could debug this yourself inside the container with a small console app that uses the Buildalyzer package? Since the issue is finding the dotnet CLI in the first place, you shouldn't need to do more than the very basic example in the Buildalyzer readme to reproduce the issue.

phmonte commented 1 year ago

I will try to reproduce the problem following your tips and I will post the updates here.

Thank you very much

phmonte commented 1 year ago

I believe I found the problem, following your tips @rouke-broersma

Currently, when Buildalyzer runs the dotnet --info command to retrieve the SDK path, it waits 4000 ms for the result. In my case, the command takes 5 to 6 seconds on the cluster.

https://github.com/daveaglick/Buildalyzer/blob/5b732a8ce572efbba077dd08871fb609c69f94ce/src/Buildalyzer/Environment/DotnetPathResolver.cs#L63
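To check whether a given environment hits the same limit, the command can be timed and compared against the 4000 ms figure. The following is a rough sketch, not part of Buildalyzer or Stryker: the function names are invented for illustration, and it assumes GNU `date` (for `%N` nanoseconds), which is what an Ubuntu-based image ships.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for timing a command against a limit in milliseconds.
# Assumes GNU date; %N (nanoseconds) is not available on every platform.

# Print how long the given command took, in whole milliseconds.
elapsed_ms() {
  local start end
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1 || true   # only the duration matters here
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Succeed only if the command finished within the limit (first argument).
within_limit() {
  local limit_ms=$1
  shift
  local took
  took=$(elapsed_ms "$@")
  echo "took ${took} ms (limit ${limit_ms} ms)"
  [ "$took" -le "$limit_ms" ]
}

# Inside the container one would run:
#   within_limit 4000 dotnet --info || echo "dotnet --info is slower than the limit"
```

A single measurement can be misleading on a busy cluster, so it is worth repeating the check a few times under realistic load.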

For now, I cached the dotnet --info output in a file, which should work around my problem temporarily.

Do you think it makes sense to open a PR on the Buildalyzer repository with a possible fix? I thought of something like: [image]

rouke-broersma commented 1 year ago

I think it's best to ask the Buildalyzer developer whether or not that is a satisfactory solution and, if not, what we should do instead.

But awesome that you found something!

phmonte commented 1 year ago

@rouke-broersma I created the issue in the Buildalyzer repository. In my case the cache didn't work, because the Process doesn't load the .bashrc file where I had set up the cache, so I built a new version of Stryker using an altered version of Buildalyzer, and now it works. Let's wait for the Buildalyzer team; if they approve the change, I can open the PRs on Buildalyzer and Stryker with the fix.

PS: It took me hours to understand why my version didn't work; your post about the vstest files and evaluating the pipeline saved me.

https://github.com/microsoft/vstest/issues/1948

rouke-broersma commented 8 months ago

@phmonte We'll release a new version including an updated Buildalyzer which contains your improvement shortly. This means that in the next version you should be able to set DOTNET_INFO_WAIT_TIME to solve your issues with the large/slow build environment.
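A minimal sketch of how the variable might be set before running Stryker. The value 10000 is only an example, and the unit is assumed to be milliseconds (matching the 4000 ms wait discussed earlier in this thread); check the Buildalyzer documentation for the version you use before relying on either.

```shell
#!/usr/bin/env bash
# Assumption: DOTNET_INFO_WAIT_TIME is interpreted in milliseconds.
# 10000 is an illustrative value, chosen to exceed the 5 to 6 seconds
# that dotnet --info took on the cluster described above.
export DOTNET_INFO_WAIT_TIME=10000

echo "DOTNET_INFO_WAIT_TIME=${DOTNET_INFO_WAIT_TIME}"

# Then start Stryker from this same shell so it inherits the variable:
#   dotnet stryker
```

Exporting the variable in the pipeline step (or in the image's environment) rather than in .bashrc avoids the problem mentioned earlier, where the spawned process never loaded .bashrc.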