AArnott / Library.Template

A template for a NuGet package with tests, stylecop, fxcop, versioning, and Azure Pipelines build ready to go.
MIT License

Hang dumps not uploaded on Azure pipelines for MacOS #167

Closed: SteveBush closed this 2 years ago

SteveBush commented 2 years ago

Hang dumps are not being uploaded on macOS Azure Pipelines agents, while GitHub Actions works fine. When I open a TRX file, the following dialog shows the relative path of a macOS hang dump (ignore the Downloads folder prefix).

[Screenshot: TRX dialog showing the relative path of the macOS hang dump]

AArnott commented 2 years ago

Hang dumps aren't even created on macOS agents, by the look of it: the logs show errors about the platform not supporting it. I wonder what GitHub Actions could possibly do differently, since this appears to be a limitation in the dotnet test tooling.

Dmp files are created on Linux and Windows, but I don't see them included in any published artifact. I know I had at least that working in the past. I'll have to investigate what happened and get that fixed.
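To illustrate the missing step, here is a minimal sketch of gathering dump files into a folder that a pipeline could then publish as an artifact. It is not the template's actual script: the function name `collect_dumps`, the `*.dmp` pattern, and both directory arguments are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def collect_dumps(test_results_root: Path, staging_dir: Path) -> list[Path]:
    """Copy every *.dmp file found under test_results_root into staging_dir.

    Hypothetical helper: the real pipeline may stage files differently.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for dump in sorted(test_results_root.rglob("*.dmp")):
        # Prefix with the parent folder name so dumps from different
        # test runs cannot overwrite each other in the flat staging folder.
        target = staging_dir / f"{dump.parent.name}_{dump.name}"
        shutil.copy2(dump, target)
        copied.append(target)
    return copied
```

The staging folder should live outside the test results tree, otherwise a second run would pick up the copies as well.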

SteveBush commented 2 years ago

I can verify that dmp files are created on the GitHub Actions macOS agent when the testhost is forcibly closed due to a timeout. (I had a deadlock in my code.) Interestingly, if you look at the build log, Blame says the platform is not supported, yet a hang dump is still created:

Starting test execution, please wait...
Logging Vstest Diagnostics in file: /Users/runner/work/_temp/_artifacts/test_logs/diag.log
A total of 1 test files matched the specified pattern.
[14:31:44 INF] This is a test
[14:31:44 INF] This is a test
Data collector 'Blame' message: Could not start process dump: System.PlatformNotSupportedException: Unsupported operating system: Darwin 21.5.0 Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:22 PDT 2022; root:xnu-8020.121.3~4/RELEASE_X86_64, and framework: .NETCoreApp,Version=v3.1.
   at Microsoft.TestPlatform.Extensions.BlameDataCollector.CrashDumperFactory.Create(String targetFramework)
   at Microsoft.TestPlatform.Extensions.BlameDataCollector.ProcessDumpUtility.CrashDump(Int32 processId, String tempDirectory, DumpTypeOption dumpType, String targetFramework, Boolean collectAlways)
   at Microsoft.TestPlatform.Extensions.BlameDataCollector.ProcessDumpUtility.StartTriggerBasedProcessDump(Int32 processId, String testResultsDirectory, Boolean isFullDump, String targetFramework, Boolean collectAlways)
   at Microsoft.TestPlatform.Extensions.BlameDataCollector.BlameCollector.TestHostLaunchedHandler(Object sender, TestHostLaunchedEventArgs args).
Data collector 'Blame' message: All tests finished running, Sequence file will not be generated.
Results File: /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.UnitTests/TestResults/_Mac-1655733505463_2022-06-20_14_31_43.trx
Attachments:
  /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.UnitTests/TestResults/dd21cb3e-0070-4786-b5af-5655b3937c2f/runner_Mac-1655733505463_2022-06-20.14_31_34.cobertura.xml
Passed!  - Failed:     0, Passed:   957, Skipped:     0, Total:   957, Duration: 7 s - /Users/runner/work/Platform/Platform/bin/NetworkVisor.Platform.Test.MacOS.UnitTests/Release-MacOS/netcoreapp3.1/NetworkVisor.Platform.Test.MacOS.UnitTests.dll (netcoreapp3.1)
Blame: Dumping 3678 - dotnet
[createdump] Gathering state for process 3678 
[createdump] Writing full dump to file /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/b0563cf8-4535-4e60-bf2e-6afa5b541345/dotnet_3678_20220620T143324_hangdump.dmp
[createdump] Written 3935113688 bytes (960721 pages) to core file
[createdump] Dump successfully written
The active test run was aborted. Reason: Test host process crashed
Data collector 'Blame' message: The specified inactivity time of 2 minutes has elapsed. Collecting hang dumps from testhost and its child processes.
Results File: /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.IntegrationTests/TestResults/_Mac-1655733505463_2022-06-20_14_31_12.trx
The active Test Run was aborted because the host process exited unexpectedly. Please inspect the call stack above, if available, to get more information about where the exception originated from.
The test running when the crash occurred: 
NetworkVisor.Platform.Test.Shared.IntegrationTests.Networking.CoreNetworkInterfaceIntegrationTests.NetworkInteface_OutputAllNetworkInterfaces
NetworkVisor.Platform.Test.Shared.IntegrationTests.CoreSystem.CoreFileSystemIntegrationTests.FileSystem_ReadFileContentsAsync_AppSettings
NetworkVisor.Platform.Test.Shared.MacOS.IntegrationTests.SystemProfiler.MacOSSystemProfilerIntegrationTests.MacOSSystemProfilerCache_HardwareDataType
NetworkVisor.Platform.Test.Shared.IntegrationTests.Networking.Arp.CoreNetworkArpIntegrationTests.ClientNetworkArpIntegrationTests_GetNetworkCachedArpDevices
NetworkVisor.Platform.Test.Shared.IntegrationTests.Networking.CoreNetworkingSystemBaseIntegrationTests.NetworkingSystemBase_GetAllCoreNetworkInterfaces
This test may, or may not be the source of the crash.
Attachments:
  /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.IntegrationTests/TestResults/eddbada2-c2df-4255-97ca-77b779fc064c/dotnet_3678_20220620T143324_hangdump.dmp
  /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.IntegrationTests/TestResults/eddbada2-c2df-4255-97ca-77b779fc064c/Sequence_6120728d06494e1e90e732e54dcb30e4.xml
  /Users/runner/work/Platform/Platform/test/macos/NetworkVisor.Platform.Test.MacOS.IntegrationTests/TestResults/eddbada2-c2df-4255-97ca-77b779fc064c/runner_Mac-1655733505463_2022-06-20.14_31_05.cobertura.xml
Passed!  - Failed:     0, Passed:    81, Skipped:     0, Total:    81, Duration: 12 s - /Users/runner/work/Platform/Platform/bin/NetworkVisor.Platform.Test.MacOS.IntegrationTests/Release-MacOS/net6.0/NetworkVisor.Platform.Test.MacOS.IntegrationTests.dll (net6.0)
Test Run Aborted with error System.Exception: One or more errors occurred.
 ---> System.Exception: Unable to read beyond the end of the stream.
   at System.IO.BinaryReader.Read7BitEncodedInt()
   at System.IO.BinaryReader.ReadString()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.LengthPrefixCommunicationChannel.NotifyDataAvailable()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.TcpClientExtensions.MessageLoopAsync(TcpClient client, ICommunicationChannel channel, Action`1 errorHandler, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---.
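The log above appears to show two separate things: the crash-dump setup fails with `PlatformNotSupportedException`, while the later hang dump is written by `createdump` and attached to the results. A rough sketch of pulling both facts out of such a log follows. The function name `summarize_blame_log` and the matched strings are assumptions based only on this one log, not on any documented log format.

```python
import re

# Assumed markers, taken from the log in this thread rather than from any
# documented contract of the Blame data collector.
CRASH_DUMP_UNSUPPORTED = "Could not start process dump"
HANG_DUMP_RE = re.compile(r"(\S+_hangdump\.dmp)")


def summarize_blame_log(log_text: str) -> dict:
    """Report whether crash-dump setup failed and which hang dumps are named."""
    crash_dump_unsupported = False
    hang_dumps = []
    for line in log_text.splitlines():
        if CRASH_DUMP_UNSUPPORTED in line:
            crash_dump_unsupported = True
        for path in HANG_DUMP_RE.findall(line):
            # The same dump can be named more than once (temp path,
            # attachment path), so keep each distinct path only once.
            if path not in hang_dumps:
                hang_dumps.append(path)
    return {
        "crash_dump_unsupported": crash_dump_unsupported,
        "hang_dumps": hang_dumps,
    }
```

Run against the log above, this would flag the crash-dump error and list both the temporary `createdump` path and the attachment path of the hang dump.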

AArnott commented 2 years ago

Thanks for that. Interestingly, once I fixed the dmp collection problem that applied to all agents and extended the mock test hang from 60s to 120s, I got dumps for all three OSs in Azure Pipelines. I think GitHub worked all along because it doesn't set an AzP env var that was taking my script down a particular path that no longer applies. I'll push the fix shortly.
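For readers unfamiliar with how a script can behave differently per CI system, here is a small sketch of branching on environment variables. The thread does not say which variable the script checked. `TF_BUILD` (set by Azure Pipelines) and `GITHUB_ACTIONS` (set by GitHub Actions) are used here only as well-known examples, and `detect_ci` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def detect_ci(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess the CI system from environment variables.

    Illustration only: the variable the template's script actually
    checked is not named in this thread.
    """
    env = os.environ if env is None else env
    if env.get("TF_BUILD", "").lower() == "true":
        return "azure-pipelines"
    if env.get("GITHUB_ACTIONS", "").lower() == "true":
        return "github-actions"
    return "local"
```

A stale branch keyed on a check like this would run only on one CI system, which matches the symptom described: the same script uploading dumps on GitHub Actions but not on Azure Pipelines.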