pshipton opened 2 years ago
@dmitripivkine we're having some issues on Windows because the machines haven't been restarted in a while. There may be processes left running, or low swap space. However, a crash in the GC with vmState=0xffffffff
seems weird.
At the moment of the crash, MM_GlobalCollector
and its child MM_GlobalCollectorDelegate
are NULL (already torn down).
1 Id: 2790.3094 Suspend: 0 Teb: 00007ff7`fdc0b000 Unfrozen
Child-SP RetAddr Call Site
000000fc`1d758668 00007ff8`996b1118 ntdll!ZwWaitForSingleObject+0xa
000000fc`1d758670 00007ff8`8bc38ec0 KERNELBASE!WaitForSingleObjectEx+0x98
000000fc`1d758710 00007ff8`8a845603 j9prt29!omrdump_create+0x300 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win32\omrosdump.c @ 185]
000000fc`1d7587b0 00007ff8`8a8450f5 j9dmp29!doSystemDump+0xa3 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 747]
000000fc`1d758810 00007ff8`8bc3b4d6 j9dmp29!protectedDumpFunction+0x15 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 2843]
000000fc`1d758840 00007ff8`8bc3cc14 j9prt29!runInTryExcept+0x16 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 220]
000000fc`1d758880 00007ff8`8a843094 j9prt29!omrsig_protect+0x214 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 285]
000000fc`1d758a60 00007ff8`8a859e5b j9dmp29!runDumpAgent+0x2f4 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 2751]
000000fc`1d758f40 00007ff8`81c4e02f j9dmp29!triggerDumpAgents+0x34b [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\trigger.c @ 1013]
000000fc`1d7592a0 00007ff8`8bc3b4d6 j9vm29!generateDiagnosticFiles+0x1df [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 1163]
000000fc`1d759760 00007ff8`8bc3cc14 j9prt29!runInTryExcept+0x16 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 220]
000000fc`1d7597a0 00007ff8`81c4e232 j9prt29!omrsig_protect+0x214 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 285]
000000fc`1d759980 00007ff8`81c4d1ea j9vm29!vmSignalHandler+0x1d2 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 839]
000000fc`1d75a870 00007ff8`8bc3aba4 j9vm29!structuredSignalHandlerVM+0x7a [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 603]
000000fc`1d75a8a0 00007ff8`9c465bc2 j9prt29!mainVectoredExceptionHandler+0x154 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 1207]
000000fc`1d75aa60 00007ff8`9c464413 ntdll!RtlRestoreContext+0x182
000000fc`1d75aaf0 00007ff8`9c4a20ba ntdll!RtlRaiseException+0xe33
000000fc`1d75b1c0 00007ff8`7ecc07bd ntdll!KiUserExceptionDispatcher+0x3a
000000fc`1d75b8e0 00007ff8`7ed034ce j9gc29!MM_GlobalCollectorDelegate::tearDown+0xd [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\gc_glue_java\globalcollectordelegate.cpp @ 124]
000000fc`1d75b910 00007ff8`7ed17f70 j9gc29!MM_ParallelGlobalGC::tearDown+0x1e [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\gc\base\standard\parallelglobalgc.cpp @ 360]
j9gc29!MM_GlobalCollectorDelegate::tearDown:
00007ff8`7ecc07b0 4053 push rbx
00007ff8`7ecc07b2 4883ec20 sub rsp,20h
00007ff8`7ecc07b6 488bd9 mov rbx,rcx
00007ff8`7ecc07b9 488b4908 mov rcx,qword ptr [rcx+8]
00007ff8`7ecc07bd 80b9db58000000 cmp byte ptr [rcx+58DBh],0 <--- crash here at attempt to read _extensions->_isStandardGC
00007ff8`7ecc07c4 7424 je j9gc29!MM_GlobalCollectorDelegate::tearDown+0x3a (00007ff8`7ecc07ea)
00007ff8`7ecc07c6 488b8950670000 mov rcx,qword ptr [rcx+6750h]
00007ff8`7ecc07cd 4885c9 test rcx,rcx
The MM_GlobalCollectorDelegate::_extensions
field is NULL, which leads to the crash.
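A minimal sketch of the failure pattern (hypothetical names and structure, not the real OpenJ9 code): the delegate caches a pointer to the extensions object, and `tearDown()` dereferences it to read `_isStandardGC`. If teardown runs before initialization completed, or runs twice, that pointer is NULL and the read faults, exactly as in the disassembly above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for MM_GCExtensions; only the flag read at the
// crash site is modeled.
struct Extensions {
    bool _isStandardGC = true;
};

// Hypothetical stand-in for MM_GlobalCollectorDelegate.
struct GlobalCollectorDelegate {
    Extensions *_extensions = nullptr;  // NULL at the moment of the crash

    // Returns whether teardown actually ran. The guard below is what the
    // crash shows is missing: without it, reading _extensions->_isStandardGC
    // through a NULL pointer faults (cmp byte ptr [rcx+58DBh],0 above).
    bool tearDown() {
        if (nullptr == _extensions) {
            return false;  // never initialized, or already torn down
        }
        if (_extensions->_isStandardGC) {
            // standard-GC-specific cleanup would go here
        }
        return true;
    }
};
```

This is only an illustration of the pointer lifetime bug, not a proposed fix; the real question is why teardown is reached with `_extensions` already NULL.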
Windbg does not show the rest of the stack; MM_ParallelGlobalGC::tearDown()
is called from MM_ConcurrentGC::tearDown(),
which is called from MM_ConcurrentGCIncrementalUpdate::tearDown(),
etc.
I am not sure how we end up tearing down the collector twice(?).
It is not clear how we get into this state.
The system core is not really helpful, since the GC is already shut down.
An alternative possibility is that the GC has not been launched yet. And it looks like that is the case (although it is hard to distinguish from the initialized-then-shutdown scenario). The GC Extensions object is still around, and there is no evidence the GC has executed even once.
Also, there is no evidence that Garbage Collector initialization failed: gcextensions->heapInitializationFailureReason
is still set to zero (no error).
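The ambiguity described above (never launched vs. already shut down both look like `_extensions == NULL` in the core) suggests a small diagnostic aid. Below is a hypothetical sketch, not OpenJ9 code: an explicit lifecycle state field would let a core dump distinguish the two cases directly, and an early return (or an assertion) in `tearDown()` would catch a double tear-down at its source.

```cpp
#include <cassert>

// Hypothetical lifecycle states; the distinction "Uninitialized" vs.
// "TornDown" is exactly what the core dump cannot currently tell us.
enum class GCState { Uninitialized, Initialized, TornDown };

// Hypothetical collector tracking its own lifecycle explicitly.
struct Collector {
    GCState _state = GCState::Uninitialized;

    void initialize() {
        _state = GCState::Initialized;
    }

    // Only transitions Initialized -> TornDown; a tearDown() on an
    // uninitialized or already-torn-down collector is a no-op instead of
    // a NULL-pointer dereference, and the stale state is visible in a core.
    void tearDown() {
        if (GCState::Initialized != _state) {
            return;
        }
        _state = GCState::TornDown;
    }
};
```

With such a field, the vmState=0xffffffff crash above would instead show either `Uninitialized` (GC never launched) or `TornDown` (double teardown) in the dump.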
https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.openjdk_x86-64_windows_Nightly/230 - win2012-x86-5 jdk_lang_0
-Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:+UseCompressedOops
java/lang/ProcessHandle/OnExitTest.java
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.openjdk_x86-64_windows_Nightly/230/openjdk_test_output.tar.gz