Closed jkotas closed 3 months ago
This is crashing in new EH:
0b 00000032`5797bd90 00007ffa`cd1a213c ntdll!RtlDispatchException+0x200 [minkernel\ntos\rtl\arm64\exdsptch.c @ 720]
0c 00000032`5797c3d0 00007ffa`c8ef8274 ntdll!RtlRaiseException+0xbc [minkernel\ntos\rtl\arm64\raise.c @ 123]
0d 00000032`5797c8b0 00007ffa`756d4be8 KERNELBASE!RaiseException+0x54 [minkernel\kernelbase\xcpt.c @ 936]
0e 00000032`5797c960 00007ffa`742f2160 coreclr!SfiNext+0x308 [D:\a\_work\1\s\src\coreclr\vm\exceptionhandling.cpp @ 8432]
0f 00000032`5797cad0 00007ffa`742f1678 System_Private_CoreLib!System.Runtime.EH.DispatchEx(System.Runtime.StackFrameIterator ByRef, ExInfo ByRef)+0x6a0 [/_/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 861]
10 00000032`5797cc30 00007ffa`753b1e54 System_Private_CoreLib!System.Runtime.EH.RhThrowEx(System.Object, ExInfo ByRef)+0x68 [/_/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 642]
11 00000032`5797cc50 00007ffa`7552f9c8 coreclr!CallDescrWorkerInternal+0x84 [D:\a\_work\1\s\artifacts\obj\coreclr\windows.arm64.Checked\vm\wks\CallDescrWorkerARM64.asm @ 4989]
12 00000032`5797cc70 00007ffa`755301e0 coreclr!CallDescrWorkerWithHandler+0x128 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 67]
13 00000032`5797ccc0 00007ffa`756c9280 coreclr!DispatchCallSimple+0xe0 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 221]
14 00000032`5797cda0 00007ffa`756c9320 coreclr!DispatchManagedException+0x1e0 [D:\a\_work\1\s\src\coreclr\vm\exceptionhandling.cpp @ 5645]
15 00000032`5797e0a0 00007ffa`755ba154 coreclr!DispatchManagedException+0x40 [D:\a\_work\1\s\src\coreclr\vm\exceptionhandling.cpp @ 5662]
16 00000032`5797e460 00007ffa`755abb14 coreclr!ThrowNew+0xe4 [D:\a\_work\1\s\src\coreclr\vm\jithelpers.cpp @ 4229]
17 00000032`5797e4c0 00007ffa`15f0e534 coreclr!IL_Throw+0xb4 [D:\a\_work\1\s\src\coreclr\vm\jithelpers.cpp @ 4259]
@janvorli has this been fixed?
@lewing I believe I have looked into the original and it was actually not an issue of the new EH, but rather a regular unhandled exception or something along these lines. Unfortunately, the only two remaining failures from the list above are misclassified, so I cannot double check it now.
Either way, CI does not hit this anymore, so the issue can be closed.
@jkotas the failure in the referenced run is just an unhandled exception. From the test log:
Output:
Unhandled exception. System.Exception: Profiler tests are expected to contain the text 'PROFILER TEST PASSES' in the console output of the profilee app to indicate a passing test. Usually it is printed from the Shutdown() method of the profiler implementation. This text was not found in the output above.
at Profiler.Tests.ProfilerTestRunner.FailFastWithMessage(String error)
at Profiler.Tests.ProfilerTestRunner.Run(String profileePath, String testName, Guid profilerClsid, String profileeArguments, ProfileeOptions profileeOptions, Dictionary`2 envVars, String reverseServerName, Boolean loadAsNotification, Int32 notificationCopies)
at Profiler.Tests.MultiplyLoaded.Main(String[] args)
The last message from the actual test is
Waiting for profilers to all detach
So that sounds like profiler or test related issue.
Ah ok, I have not realized that this is the new callstack for unhandled exception. I agree that it looks like a profiler test issue. We would need a crash dump of the hung process to diagnose it.
Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.
Based on the dump, the child process exit code is 0:
0:000> !do 000001f8`7080afc8
Name: System.Diagnostics.Process
MethodTable: 00007ffbf63dca00
...
00007ffbf5c4f900 40000dd d8 System.Int32 1 instance 0 _exitCode
...
It suggests that the child process exited without crashing. The problem seems to be in the parsing of the child process output.
@jkotas do we expect this issue to be fixed by https://github.com/dotnet/runtime/pull/101607?
Missing volatile
that I have added in https://github.com/dotnet/runtime/pull/101607 can explain failures with these symptoms. I suspect that my change is not going to fix the failures. The test has been failing too frequently for missing volatile
to be the root cause of the failures.
I would keep this issue open for 2 weeks. If we do not see any new hits in 2 weeks, we can declare it fixed.
Failed in: runtime-coreclr r2r 20240429.1
Failed tests:
R2R-CG2 windows arm64 Checked no_tiered_compilation @ Windows.11.Arm64.Open
- profiler\\multiple\\multiple\\multiple.cmd
Error message:
Unhandled exception. System.Exception: Profiler tests are expected to contain the text 'PROFILER TEST PASSES' in the console output of the profilee app to indicate a passing test. Usually it is printed from the Shutdown() method of the profiler implementation. This text was not found in the output above. Profilee returned exit code 100.
at Profiler.Tests.ProfilerTestRunner.FailFastWithMessage(String error)
at Profiler.Tests.ProfilerTestRunner.Run(String profileePath, String testName, Guid profilerClsid, String profileeArguments, ProfileeOptions profileeOptions, Dictionary`2 envVars, String reverseServerName, Boolean loadAsNotification, Int32 notificationCopies)
at Profiler.Tests.MultiplyLoaded.Main(String[] args)
Return code: 1
Raw output file: C:\h\w\C29B0A72\w\AB5E08EF\uploads\Reports\profiler.multiple\multiple\multiple.output.txt
Raw output:
BEGIN EXECUTION
eventpipe_common.dll
Microsoft.Diagnostics.NETCore.Client.dll
multiple.dll
Profiler.dll
profiler_common.dll
TestLibrary.dll
6 file(s) copied.
6:55:01.56
Response file: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll.rsp
C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\IL-CG2\eventpipe_common.dll
-o:C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll
--targetarch:arm64
--targetos:windows
--verify-type-and-field-layout
--method-layout:random
-r:C:\h\w\C29B0A72\p\System.*.dll
-r:C:\h\w\C29B0A72\p\Microsoft.*.dll
-r:C:\h\w\C29B0A72\p\xunit.*.dll
-r:C:\h\w\C29B0A72\p\mscorlib.dll
-r:C:\h\w\C29B0A72\p\netstandard.dll
" "dotnet" "C:\h\w\C29B0A72\p\crossgen2\crossgen2.dll" @"C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll.rsp" -r:C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\IL-CG2\*.dll"
Emitting R2R PE file: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll
" "dotnet" "C:\h\w\C29B0A72\p\r2rdump\r2rdump.dll" --header --sc --in C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll --out C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\eventpipe_common.dll.r2rdump --val"
6:55:06.92
6:55:06.92
Response file: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll.rsp
C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\IL-CG2\Microsoft.Diagnostics.NETCore.Client.dll
-o:C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll
--targetarch:arm64
--targetos:windows
--verify-type-and-field-layout
--method-layout:random
-r:C:\h\w\C29B0A72\p\System.*.dll
-r:C:\h\w\C29B0A72\p\Microsoft.*.dll
-r:C:\h\w\C29B0A72\p\xunit.*.dll
-r:C:\h\w\C29B0A72\p\mscorlib.dll
-r:C:\h\w\C29B0A72\p\netstandard.dll
" "dotnet" "C:\h\w\C29B0A72\p\crossgen2\crossgen2.dll" @"C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll.rsp" -r:C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\IL-CG2\*.dll"
Emitting R2R PE file: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll
" "dotnet" "C:\h\w\C29B0A72\p\r2rdump\r2rdump.dll" --header --sc --in C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll --out C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Microsoft.Diagnostics.NETCore.Client.dll.r2rdump --val"
6:55:10.09
6:55:10.11
Response file: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\multiple.dll.rsp
C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\IL-CG2\multiple.dll
-o:C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\multiple.dll
--targetarch:arm64
--targetos:windows
--verify-type-and-field-layout
--method-layout:random
-r:C:\h\w\C29B0A72\p\System.*.dll
-r:C:\h\w\C29B0A72\p\Microsoft.*.dll
-r:C:\h\w\C29B0A72\p\xunit.*.dll
-r:C:\h\w\C29B0A72\p\mscorlib.dll
-r:C:\h\w\C29B0A72\p\netstandard.dll
" "dotnet" "C:\h\w\C2
Stack trace:
at profiler_multiple._multiple_multiple_._multiple_multiple_cmd()
at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)
The latest failure doesn't make sense to me
Profiler path: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Profiler.dll
Profiler path: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Profiler.dll
Profiler.dll!DllGetClassObject
MultiplyLoaded::Initialize
Profiler.dll!Profiler::Initialize
Profiler.dll!DllGetClassObject
MultiplyLoaded::Initialize
Profiler.dll!Profiler::Initialize
Profiler path: C:\h\w\C29B0A72\w\AB5E08EF\e\profiler\multiple\multiple\Profiler.dll
Profiler.dll!DllGetClassObject
MultiplyLoaded::InitializeForAttach
Profiler.dll!Profiler::Initialize
Throwing exception
Exception caught
Waiting for profilers to all detach
Test failed: Profiler tests are expected to contain the text 'PROFILER TEST PASSES' in the console output of the profilee app to indicate a passing test. Usually it is printed from the Shutdown() method of the profiler implementation. This text was not found in the output above. Profilee returned exit code 100.
Expected: 100
Actual: -532462766
END EXECUTION - FAILED
FAILED
Test failed. Trying to see if dump file was created in C:\cores since 4/30/2024 6:55:01 AM
The profilee returned code 100 which means that the _profilerDone.WaitOne
call succeeded, but we should only set the event after seeing all three profilers successfully detach. There are no logs indicating any profiler attached however. My only thought is that somehow we are just missing output from the profiled process, but we call Process.WaitForExit()
which says in the docs that all IO is guaranteed to occur before it returns.
A good question to ask is why this test and why arm64 only. It all points to bug(s) related to arm64 weaker memory model in the test code or in the code specifically exercised by this test.
The missing volatile
that I have added in https://github.com/dotnet/runtime/pull/101607/files#diff-667a604f90dafe4391d86c87780a736aa2eb10010f194966110baeb141d05370R198 was a bug like that. My guess is that we have more of them.
I fixed another issue and added a speculative fix in #102066, for now I am going to wait to see if it fails again before investigating any further
There have been no issues reported by automation since this issue has been fixed, closing as "fixed"
Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=655183 Build error leg or test failing: profiler\multiple\multiple\multiple.cmd Pull request: https://github.com/dotnet/runtime/pull/101291
Error Message
Fill the error message using step by step known issues guidance.
Known issue validation
Build: :mag_right: Result validation: :warning: Validation could not be done without an Azure DevOps build URL on the issue. Please add it to the "Build: :mag_right:" line. Validation performed at: 4/25/2024 9:12:24 PM UTC
Report
Summary