xunit / visualstudio.xunit

VSTest runner for xUnit.net (for Visual Studio Test Explorer and dotnet test)
https://xunit.net/
Other
146 stars 81 forks source link

Update xunit.runner.visualstudio to 2.5.5 triggers ObjectDisposedExceptions in CI #396

Closed fredrikcarlbom closed 10 months ago

fredrikcarlbom commented 10 months ago

Hello!

We are running an extensive test suite each commit on our CI machines (gitlab agents running in docker). After updating xunit.runner.visualstudio from 2.5.4 to 2.5.5 we started to see more than 50% of our CI test runs failing with ObjectDisposedException for some of our test projects. Running the same on developer machines in WSL has a failure rate of <1% and we have not been able to reproduce it at all in Windows. The entire error that is reported is:

<testsuites>
  <testsuite name="ProjectName.Tests.dll" tests="104" skipped="0" failures="0" errors="0" time="2.3332755" timestamp="2023-12-14T10:56:21" hostname="runner-khqhuyy2-project-1531-concurrent-0" id="0" package="ProjectName.Tests.dll">
    <properties />
    ...
    <system-out>Test Framework Informational Messages:
[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.5.5+28b46ff510 (64-bit .NET 7.0.13)
[xUnit.net 00:00:00.13]   Discovering: ProjectName.Tests
[xUnit.net 00:00:00.28]   Discovered:  ProjectName.Tests
[xUnit.net 00:00:00.28]   Starting:    ProjectName.Tests
[xUnit.net 00:00:01.32]   Finished:    ProjectName.Tests
[xUnit.net 00:00:01.33]       System.ObjectDisposedException : Safe handle has been closed.
[xUnit.net 00:00:01.33]       Object name: 'SafeHandle'.
[xUnit.net 00:00:01.33]       Stack Trace:
[xUnit.net 00:00:01.33]            at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean&amp; success)
[xUnit.net 00:00:01.33]            at Interop.Kernel32.SetEvent(SafeWaitHandle handle)
[xUnit.net 00:00:01.33]            at System.Threading.EventWaitHandle.Set()
[xUnit.net 00:00:01.33]         /_/src/xunit.runner.utility/Extensions/MessageSinkMessageExtensions.cs(39,0): at MessageSinkMessageExtensions.Dispatch[TMessage](IMessageSinkMessage message, HashSet`1 messageTypes, MessageHandler`1 callback)
</system-out>
    <system-err>Error - [xUnit.net 00:00:01.33]     [FATAL ERROR] System.ObjectDisposedException
</system-err>
  </testsuite>
</testsuites>

I expect that we are doing something wrong but would appreciate if anyone has suggestions how to troubleshoot this. I'd also appreciate if it would be possible to add additional information to the error message shown above.

One of the problematic test projects


<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net7.0</TargetFramework>
    <RootNamespace>ProjectName.Tests</RootNamespace>
    <AssemblyName>ProjectName.Tests</AssemblyName>
    <ErrorOnDuplicatePublishOutputFiles>false</ErrorOnDuplicatePublishOutputFiles>
    <CodeAnalysisRuleSet>$(MSBuildThisFileDirectory)..\..\..\test-project.ruleset</CodeAnalysisRuleSet>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.8.0" />
    <PackageReference Include="xunit" Version="2.6.3" />
    <PackageReference Include="FluentAssertions" Version="6.12.0" />
    <PackageReference Include="xunit.runner.visualstudio" Version="2.5.5">
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
      <PrivateAssets>all</PrivateAssets>
    </PackageReference>
    <PackageReference Include="coverlet.collector" Version="6.0.0">
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
      <PrivateAssets>all</PrivateAssets>
    </PackageReference>
  </ItemGroup>

  <ItemGroup>
    <Content Include="SampleData\**">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </Content>
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\..\..\Shared\TestHelpers\TestHelpers.csproj" />
    <ProjectReference Include="..\Service\ProjectName.csproj" />
  </ItemGroup>
</Project>```
PetrPospisil commented 10 months ago

Hi, we have similar problem. We also encounter Unhandled exception. System.ObjectDisposedException: Safe handle has been closed. update from v2.5.3 to v2.5.5.

It is happening extremely random. We use windows server 2022 virtual machines to run our CI tests.

Full stack is

...
The active test run was aborted. Reason: Test host process crashed : Unhandled exception. System.ObjectDisposedException: Safe handle has been closed.
Object name: 'SafeHandle'.
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at Interop.Kernel32.SetEvent(SafeWaitHandle handle)
   at System.Threading.EventWaitHandle.Set()
   at Xunit.Runner.VisualStudio.VsExecutionSink.HandleTestAssemblyFinished(MessageHandlerArgs`1 args) in /_/src/xunit.runner.visualstudio/Sinks/VsExecutionSink.cs:line 108
   at Xunit.ExecutionEventSink.OnMessageWithTypes(IMessageSinkMessage message, HashSet`1 messageTypes) in /_/src/xunit.runner.utility/Sinks/EventSinks/ExecutionEventSink.cs:line 167
   at Xunit.AggregateMessageSink.OnMessageWithTypes(IMessageSinkMessage message, HashSet`1 messageTypes) in /_/src/xunit.runner.utility/Sinks/AggregateMessageSink.cs:line 65
   at Xunit.Runner.VisualStudio.VsExecutionSink.OnMessageWithTypes(IMessageSinkMessage message, HashSet`1 messageTypes) in /_/src/xunit.runner.visualstudio/Sinks/VsExecutionSink.cs:line 319
   at Xunit.ExecutionSink.OnMessageWithTypes(IMessageSinkMessage message, HashSet`1 messageTypes) in /_/src/xunit.runner.utility/Sinks/ExecutionSink.cs:line 444
   at Xunit.Sdk.SynchronousMessageBus.QueueMessage(IMessageSinkMessage message) in /_/src/xunit.execution/Sdk/SynchronousMessageBus.cs:line 30
   at Xunit.Sdk.TestAssemblyRunner`1.RunAsync() in /_/src/xunit.execution/Sdk/Frameworks/Runners/TestAssemblyRunner.cs:line 220
   at Xunit.DependencyInjection.DependencyInjectionTestFrameworkExecutor.RunTestCases(IEnumerable`1 testCases, IMessageSink executionMessageSink, ITestFrameworkExecutionOptions executionOptions)
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

Results File: D:\PePoAgent-1\_temp\Safetica.Customer.Endpoint.Functions.IntegrationTests.trx

Test Run Aborted with error System.Exception: One or more errors occurred.
 ---> System.Exception: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Exception: An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Read(Span`1 buffer)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Read(Span`1 buffer)
   at System.Net.Sockets.NetworkStream.ReadByte()
   at System.IO.BinaryReader.Read7BitEncodedInt()
   at System.IO.BinaryReader.ReadString()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.LengthPrefixCommunicationChannel.NotifyDataAvailable()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.TcpClientExtensions.MessageLoopAsync(TcpClient client, ICommunicationChannel channel, Action`1 errorHandler, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---.
...
bradwilson commented 10 months ago

@PetrPospisil What version of .NET (or .NET Framework) are you using?

bradwilson commented 10 months ago

@fredrikcarlbom The only change between 2.5.4 and 2.5.5 was supposedly a fix for a race condition, so it should hopefully be easy to track down. 🤞🏼

fredrikcarlbom commented 10 months ago

@fredrikcarlbom The only change between 2.5.4 and 2.5.5 was supposedly a fix for a race condition, so it should hopefully be easy to track down. 🤞🏼

We have found some bugs in our code that was the cause of most of these exceptions, but I'm don't understand the testrunner code well enough to understand how the exception was so frequent on our slower CI machines and so infrequent on our developer machines.

PetrPospisil commented 10 months ago

xUnit.net VSTest Adapter v2.5.5+28b46ff510 (64-bit .NET 6.0.16) Is this what you need?

Btw. We are currently evaluating 2.5.4 version to be sure. I'll post results once we are sure it is ok or not.

felpel commented 10 months ago

this behavior also happens while targeting .NET 7 (looks similar to @PetrPospisil besides the .NET version):

xUnit.net VSTest Adapter v2.5.5+28b46ff510 (64-bit .NET 7.0.14)

bradwilson commented 10 months ago

Is this what you need?

Yep, thanks!

bradwilson commented 10 months ago

I'm don't understand the testrunner code well enough to understand how the exception was so frequent on our slower CI machines and so infrequent on our developer machines.

I believe it's a (different) race condition.

The 10k foot version of the reason is this: execution and runner live "separate lives", and the messages sent from the execution engine to the runner are usually sent asynchronously. So the likely candidate here (given the "already disposed" message) is that the race condition is happening because the execution engine is shutting down before the runner has a chance to process the message. On a faster machine these are more likely to happen simultaneously.

bradwilson commented 10 months ago

@fredrikcarlbom Are you using any third party extensions to xUnit.net? I noticed @PetrPospisil is using Xunit.DependencyInjection. I'm wondering if this can be reproduced without third party extensions, or whether the extensions are contributing to the issue?

bradwilson commented 10 months ago

I believe I've found the issue and it's unrelated to third party code. The stack trace was very helpful. 🎉

@fredrikcarlbom @PetrPospisil Please try xunit.runner.visualstudio version 2.5.6-pre.4. Thanks! https://xunit.net/docs/using-ci-builds

Commit: https://github.com/xunit/xunit/commit/0eb76d2b780819865a3291d24d4eced6ed744002

fredrikcarlbom commented 10 months ago

@fredrikcarlbom @PetrPospisil Please try xunit.runner.visualstudio version 2.5.6-pre.4. Thanks! https://xunit.net/docs/using-ci-builds

@bradwilson The new version removes all ObjectDisposedExceptions for us, thank you!

PetrPospisil commented 10 months ago

We are currently testing 2.5.6-pre.4. I will have the results probably tomorrow once the enough number of pipeline runs will use it.

PetrPospisil commented 10 months ago

No errors. 2.5.6-pre.4 seems to be working.