pshipton opened 2 years ago
@dmitripivkine we're having some issues on Windows because the machines haven't been restarted in a while. There may be processes left running, or low swap space. However, a crash in the GC with vmState=0xffffffff
seems weird.
At the moment of the crash, MM_GlobalCollector
and its child MM_GlobalCollectorDelegate
are NULL (already torn down).
1 Id: 2790.3094 Suspend: 0 Teb: 00007ff7`fdc0b000 Unfrozen
Child-SP RetAddr Call Site
000000fc`1d758668 00007ff8`996b1118 ntdll!ZwWaitForSingleObject+0xa
000000fc`1d758670 00007ff8`8bc38ec0 KERNELBASE!WaitForSingleObjectEx+0x98
000000fc`1d758710 00007ff8`8a845603 j9prt29!omrdump_create+0x300 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win32\omrosdump.c @ 185]
000000fc`1d7587b0 00007ff8`8a8450f5 j9dmp29!doSystemDump+0xa3 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 747]
000000fc`1d758810 00007ff8`8bc3b4d6 j9dmp29!protectedDumpFunction+0x15 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 2843]
000000fc`1d758840 00007ff8`8bc3cc14 j9prt29!runInTryExcept+0x16 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 220]
000000fc`1d758880 00007ff8`8a843094 j9prt29!omrsig_protect+0x214 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 285]
000000fc`1d758a60 00007ff8`8a859e5b j9dmp29!runDumpAgent+0x2f4 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\dmpagent.c @ 2751]
000000fc`1d758f40 00007ff8`81c4e02f j9dmp29!triggerDumpAgents+0x34b [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\rasdump\trigger.c @ 1013]
000000fc`1d7592a0 00007ff8`8bc3b4d6 j9vm29!generateDiagnosticFiles+0x1df [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 1163]
000000fc`1d759760 00007ff8`8bc3cc14 j9prt29!runInTryExcept+0x16 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 220]
000000fc`1d7597a0 00007ff8`81c4e232 j9prt29!omrsig_protect+0x214 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 285]
000000fc`1d759980 00007ff8`81c4d1ea j9vm29!vmSignalHandler+0x1d2 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 839]
000000fc`1d75a870 00007ff8`8bc3aba4 j9vm29!structuredSignalHandlerVM+0x7a [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\vm\gphandle.c @ 603]
000000fc`1d75a8a0 00007ff8`9c465bc2 j9prt29!mainVectoredExceptionHandler+0x154 [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\port\win64amd\omrsignal.c @ 1207]
000000fc`1d75aa60 00007ff8`9c464413 ntdll!RtlRestoreContext+0x182
000000fc`1d75aaf0 00007ff8`9c4a20ba ntdll!RtlRaiseException+0xe33
000000fc`1d75b1c0 00007ff8`7ecc07bd ntdll!KiUserExceptionDispatcher+0x3a
000000fc`1d75b8e0 00007ff8`7ed034ce j9gc29!MM_GlobalCollectorDelegate::tearDown+0xd [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\openj9\runtime\gc_glue_java\globalcollectordelegate.cpp @ 124]
000000fc`1d75b910 00007ff8`7ed17f70 j9gc29!MM_ParallelGlobalGC::tearDown+0x1e [f:\users\jenkins\workspace\build_jdk11_x86-64_windows_nightly\omr\gc\base\standard\parallelglobalgc.cpp @ 360]
j9gc29!MM_GlobalCollectorDelegate::tearDown:
00007ff8`7ecc07b0 4053 push rbx
00007ff8`7ecc07b2 4883ec20 sub rsp,20h
00007ff8`7ecc07b6 488bd9 mov rbx,rcx
00007ff8`7ecc07b9 488b4908 mov rcx,qword ptr [rcx+8]
00007ff8`7ecc07bd 80b9db58000000 cmp byte ptr [rcx+58DBh],0 <--- crash here at attempt to read _extensions->_isStandardGC
00007ff8`7ecc07c4 7424 je j9gc29!MM_GlobalCollectorDelegate::tearDown+0x3a (00007ff8`7ecc07ea)
00007ff8`7ecc07c6 488b8950670000 mov rcx,qword ptr [rcx+6750h]
00007ff8`7ecc07cd 4885c9 test rcx,rcx
The MM_GlobalCollectorDelegate::_extensions
field is NULL, which leads to the crash.
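A minimal sketch of the failure pattern (hypothetical names and structure, not the real OpenJ9 code): the delegate caches a pointer to the extensions object, and `tearDown()` dereferences it to read `_isStandardGC`. If teardown runs before initialization completed, or runs twice, that pointer is NULL and the read faults, exactly as in the disassembly above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for MM_GCExtensions; only the flag read at the
// crash site is modeled.
struct Extensions {
    bool _isStandardGC = true;
};

// Hypothetical stand-in for MM_GlobalCollectorDelegate.
struct GlobalCollectorDelegate {
    Extensions *_extensions = nullptr;  // NULL at the moment of the crash

    // Returns whether teardown actually ran. The guard below is what the
    // crash shows is missing: without it, reading _extensions->_isStandardGC
    // through a NULL pointer faults (cmp byte ptr [rcx+58DBh],0 above).
    bool tearDown() {
        if (nullptr == _extensions) {
            return false;  // never initialized, or already torn down
        }
        if (_extensions->_isStandardGC) {
            // standard-GC-specific cleanup would go here
        }
        return true;
    }
};
```

This is only an illustration of the pointer lifetime bug, not a proposed fix; the real question is why teardown is reached with `_extensions` already NULL.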
Windbg does not show the rest of the stack; MM_ParallelGlobalGC::tearDown()
is called from MM_ConcurrentGC::tearDown(),
which is called from MM_ConcurrentGCIncrementalUpdate::tearDown(),
etc.
I am not sure how we end up tearing down the collector twice(?).
It is not clear how we get into this state.
The system core is not really helpful, since the GC is already shut down.
An alternative possibility is that the GC has not been launched yet. And it looks like that is the case (although it is hard to distinguish from the initialized-then-shutdown scenario). The GC Extensions object is still around, and there is no evidence the GC has executed even once.
Also, there is no evidence that Garbage Collector initialization failed: gcextensions->heapInitializationFailureReason
is still set to zero (no error).
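The ambiguity described above (never launched vs. already shut down both look like `_extensions == NULL` in the core) suggests a small diagnostic aid. Below is a hypothetical sketch, not OpenJ9 code: an explicit lifecycle state field would let a core dump distinguish the two cases directly, and an early return (or an assertion) in `tearDown()` would catch a double tear-down at its source.

```cpp
#include <cassert>

// Hypothetical lifecycle states; the distinction "Uninitialized" vs.
// "TornDown" is exactly what the core dump cannot currently tell us.
enum class GCState { Uninitialized, Initialized, TornDown };

// Hypothetical collector tracking its own lifecycle explicitly.
struct Collector {
    GCState _state = GCState::Uninitialized;

    void initialize() {
        _state = GCState::Initialized;
    }

    // Only transitions Initialized -> TornDown; a tearDown() on an
    // uninitialized or already-torn-down collector is a no-op instead of
    // a NULL-pointer dereference, and the stale state is visible in a core.
    void tearDown() {
        if (GCState::Initialized != _state) {
            return;
        }
        _state = GCState::TornDown;
    }
};
```

With such a field, the vmState=0xffffffff crash above would instead show either `Uninitialized` (GC never launched) or `TornDown` (double teardown) in the dump.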
https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.openjdk_x86-64_windows_Nightly/230 - win2012-x86-5 jdk_lang_0
-Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:+UseCompressedOops
java/lang/ProcessHandle/OnExitTest.java
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.openjdk_x86-64_windows_Nightly/230/openjdk_test_output.tar.gz