stryker-mutator / stryker-net

Mutation testing for .NET core and .NET framework!
https://stryker-mutator.io
Apache License 2.0
1.78k stars 188 forks source link

Random crashes with FileNotFoundException #1659

Closed psfinaki closed 2 years ago

psfinaki commented 3 years ago

Describe the bug

Sometimes Stryker crashes for random projects with this error

error

This happens after throwing a bunch of exceptions with the FileNotFoundException on the top, see logs attached. The process exit code is -532462766.

It's important to say this is by far not deterministic. I get a feeling we have this once every 10 times or so. I have just run Stryker against the same project 20 times and it failed 3 times with this thing.

Logs

Here are the logs: logs.txt

They are pretty similar for these crashes. Once the log also had this attached at the end of the call stack (just before Time Elapsed TIME):

   at Stryker.CLI.StrykerCLI.RunStryker(StrykerOptions options) in D:\a\1\s\src\Stryker.CLI\Stryker.CLI\StrykerCLI.cs:line 136
   at Stryker.CLI.StrykerCLI.<>c__DisplayClass7_0.<Run>b__0() in D:\a\1\s\src\Stryker.CLI\Stryker.CLI\StrykerCLI.cs:line 125
   at McMaster.Extensions.CommandLineUtils.CommandLineApplication.<>c__DisplayClass146_0.<OnExecute>b__0(CancellationToken _)
   at McMaster.Extensions.CommandLineUtils.CommandLineApplication.ExecuteAsync(String[] args, CancellationToken cancellationToken)
   at McMaster.Extensions.CommandLineUtils.CommandLineApplication.Execute(String[] args)
   at Stryker.CLI.StrykerCLI.Run(String[] args) in D:\a\1\s\src\Stryker.CLI\Stryker.CLI\StrykerCLI.cs:line 128
   at Stryker.CLI.Program.Main(String[] args) in D:\a\1\s\src\Stryker.CLI\Stryker.CLI\Program.cs:line 16

Expected behavior

This shouldn't happen.

Desktop:

richardwerkman commented 3 years ago

Hmm this doesn't look good. We get the references directly from Buildalyzer. So it seems that buildalyzer returns a wrong path sometimes? We might have to look into how buildalyzer gets those references projects.

rouke-broersma commented 3 years ago

It seems unlikely that buildalyzer is returning the wrong path here. More seems like perhaps it was cleaned up by something.

@psfinaki can you confirm whether the given path is where your project dll should exist?

psfinaki commented 3 years ago

Absolutely. And I mean... I don't change anything. I literally put together a script to run Stryker the same way 20 times and it went like all good all good all good all good all good BAMMMM all good all good BAMMMM all good all good ...

You get the point :D

rouke-broersma commented 3 years ago

Is this a multi targeted project? Or do you use multi (test) project? I feel like buildalyzer or something else is doing a clean somewhere in between causing a race condition.

psfinaki commented 3 years ago

This particular one I was running Stryker against for 20 times is multi targeted, yes :) In general - I can try to collect some stats on that tomorrow.

rouke-broersma commented 3 years ago

Currently on multi targeted projects we use the first result buildalyzer has. We have seen some reports before that this is not deterministic (but trends to one over the other) and that stryker sometimes has issues with the 'other framework' result. We really need to make the target framework selectable and build in some heuristics to select the best possible target framework by default.

psfinaki commented 3 years ago

Yeah I will take this during the upcoming weeks. But if you want I can probably hack things around here, like handle this particular thing in the code. What do you think?

rouke-broersma commented 3 years ago

What's your idea for handling this?

psfinaki commented 3 years ago

Well as per my observations, reruns help in 100% cases :D But I don't know if it's easy to rerun the whole thing from within the Stryker execution.

dupdob commented 3 years ago

I'm interested in fixing this. It seems difficult to reproduce, so the logs are valuable. Quick question: is this path ok to you, @psfinaki ? C:\code\REPOSITORYNAME\test\PROJECTCATEGORYNAME\ ? It looks like something which failed env variables expansion?

rouke-broersma commented 3 years ago

@dupdob I think that's just to obfuscate the specific project names they are working on as it's replaced like that everywhere in the logs.

psfinaki commented 3 years ago

@dupdob, yes, that's just find-and-replace for privacy and confidentiality so that they don't throw me into prison :D

dupdob commented 3 years ago

Thanks. So, you are mutating PROJECTNAME, right? And the test seems to fail because Stryker cannot find DEPENDENTPROJECTNAME assembly. Is there any other concurrent processes running on this machine when this happens? like a VS(Code?) with DEPENDENTPROJECTNAME open ? or another Stryker session? If not, could you try using fuslogvw (see here ? it will give a more detailed reason of why the assembly could not be loaded; it may simply because it is missing (the next question is why, obviously), but it may be some more subtle cause, like an exception in a while loading a type....

psfinaki commented 3 years ago

Hmm well this happens also on the build server where technically there shouldn't be many processes involved. As for the util, I will try and update you :)

dupdob commented 3 years ago

well, if a build happens on DEPENDENTPROJECTNAME while mutating PROJECTNAME.... thanks for your support

psfinaki commented 3 years ago

@dupdob Okay so I finally got back to that and reproduced the whole thing using that fuslogvw utility.

Here are the logs... To be honest, I am not sure if this is 100% related although looks somewhat relevant. I don't know how to interpret that really though, as this is the first time I'm using this tool. But - maybe you're a guru and it somehow helps :D MSBuild.exe.zip

dupdob commented 3 years ago

Thank you. I think I need this kind of log files but for vstest.console. Those log files point to binding problem for MsBuid. Do you have any other log files?

psfinaki commented 3 years ago

Yes, here they are :) vstest.console.exe.zip

dupdob commented 3 years ago

I had a look, but I see no actual issues in those logs. So this is either a trace for a non crashing session, or this is kept in another process (vstesthost maybe) or this is not captured at all (sad panda then)

rouke-broersma commented 3 years ago

@psfinaki Does this still occur? Not sure if this is fixed now that you can select the target framework ( thank for you for contributing this :) ).

psfinaki commented 3 years ago

Hmm hang on, we've just moved to .NET 6 and C# 10 so have a lot of bigger issues related to that - but I will update this ticket once things stabilize a bit :)

psfinaki commented 2 years ago

So, looks like this has entirely disappeared since we moved to Stryker 1.0. Target framework is most probably unrelated, we've run Stryker against the whole code base a few times between moving to 1.0 and specifying the target framework.

So anyway - good stuff :) I'll close the issue then.