Open dmitry-ten opened 3 years ago
Please let me know if you need analysis for these crashes (I will need a system core if that is the case)
@dmitripivkine thank you, your help would be appreciated. This is the stack trace of the segfault:
#12 <signal handler called>
#13 MM_ForwardedHeader::readClassSlot (
destinationObjectPtr=0x740065006e0028, this=0x7f5ca0bcc3c0)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/structs/ForwardedHeader.hpp:224
#14 MM_ForwardedHeader::copyOrWaitOutline (
this=this@entry=0x7f5ca0bcc3c0,
destinationObjectPtr=destinationObjectPtr@entry=0x740065006e0028)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/structs/ForwardedHeader.cpp:250
#15 0x00007f5cf7cbd145 in MM_ForwardedHeader::copyOrWait (
destinationObjectPtr=0x740065006e0028, this=0x7f5ca0bcc3c0)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/structs/ForwardedHeader.hpp:412
#16 MM_Scavenger::copyAndForward (objectPtrIndirect=0x522f80,
env=0x7f5c7c001cc8, this=0x7f5cf8083200)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/base/standard/Scavenger.cpp:1405
#17 MM_Scavenger::copyAndForwardThreadSlot (this=0x7f5cf8083200,
env=env@entry=0x7f5c7c001cc8,
objectPtrIndirect=objectPtrIndirect@entry=0x522f80)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/base/standard/Scavenger.cpp:3175
#18 0x00007f5cf7cc662f in MM_ScavengerRootScanner::doStackSlot (
this=0x7f5ca0bccad0, slotPtr=0x522f80, walkState=<optimized out>,
stackLocation=0x522f80)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_glue_java/ScavengerRootScanner.hpp:105
#19 0x00007f5cfcddc563 in walkJITFrameSlots (
walkState=walkState@entry=0x7f5ca0bcc6d0,
jitDescriptionBits=jitDescriptionBits@entry=0x7f5ca0bcc58e "\v",
stackAllocMapBits=stackAllocMapBits@entry=0x7f5ca0bcc58f "",
jitDescriptionCursor=jitDescriptionCursor@entry=0x7f5ca0bcc590,
stackAllocMapCursor=stackAllocMapCursor@entry=0x7f5ca0bcc598,
jitBitsRemaining=jitBitsRemaining@entry=0x7f5ca0bcc5a0,
mapBytesRemaining=0x7f5ca0bcc5a8, scanCursor=0x522f80,
slotsRemaining=15, stackMap=0x7f5ca137e59c,
gcStackAtlas=0x7f5ca137e4ac, slotDescription=<optimized out>)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/codert_vm/jswalk.c:646
#20 0x00007f5cfcddc98c in jitWalkFrame (
walkState=walkState@entry=0x7f5ca0bcc6d0,
walkLocals=walkLocals@entry=1, stackMap=0x7f5ca137e59c)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/codert_vm/jswalk.c:577
#21 0x00007f5cfcdddc4c in jitWalkStackFrames (walkState=0x7f5ca0bcc6d0)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/codert_vm/jswalk.c:243
#22 0x00007f5cfef1453e in walkStackFrames (currentThread=0x1a4200,
walkState=0x7f5ca0bcc6d0)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/vm/swalk.c:336
#23 0x00007f5cf7b833f6 in GC_VMThreadStackSlotIterator::scanSlots (
vmThread=<optimized out>, walkThread=walkThread@entry=0x513b00,
userData=userData@entry=0x7f5ca0bcc9e0,
oSlotIterator=oSlotIterator@entry=0x7f5cf7b7b7c0 <stackSlotIterator(J9JavaVM*, J9Object**, void*, J9StackWalkState*, void const*)>,
includeStackFrameClassReferences=<optimized out>,
h=<optimized out>) at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_structs/VMThreadStackSlotIterator.cpp:114
#24 0x00007f5cf7b7b2cd in MM_RootScanner::scanOneThread (this=0x7f5ca0bccad0, env=0x7f5c7c001cc8, walkThread=0x513b00, localData=0x7f5ca0bcc9e0)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_base/RootScanner.cpp:519
#25 0x00007f5cf7b79fdf in MM_RootScanner::scanThreads (this=0x7f5ca0bccad0, env=0x7f5c7c001cc8)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_base/RootScanner.cpp:488
#26 0x00007f5cf7b7cb52 in MM_RootScanner::scanRoots (this=0x7f5ca0bccad0, env=0x7f5c7c001cc8)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_base/RootScanner.cpp:919
#27 0x00007f5cf7cbf78b in MM_ScavengerRootScanner::scanRoots (env=0x7f5c7c001cc8, this=0x7f5ca0bccad0)
at /root/src-11/openj9-openjdk-jdk11/openj9/runtime/gc_glue_java/ScavengerRootScanner.hpp:200
#28 MM_Scavenger::workThreadGarbageCollect (this=0x7f5cf8083200, env=0x7f5c7c001cc8)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/base/standard/Scavenger.cpp:2572
#29 0x00007f5cf7c74457 in MM_ParallelDispatcher::workerEntryPoint (this=0x7f5cf8049030, env=0x7f5c7c001cc8)
at /root/src-11/openj9-openjdk-jdk11/omr/gc/base/ParallelDispatcher.cpp:186
So far the only thing I can tell is that destinationObjectPtr contains an invalid address, which causes the crash.
I'll send you the core dump and the JDK I used in DMs.
The reason for the crash is a bad (stale?) O-slot: t14[0x0000000000522F80] = 0x00000007FFF80000
<513b00> JIT frame: bp = 0x0000000000522FF8, pc = 0x00007F5CD9F388CB, unwindSP = 0x0000000000522F10, cp = 0x00000000004E10D0, arg0EA = 0x0000000000523010, jitInfo = 0x00007F5CA137DB38
<513b00> Method: net/adoptopenjdk/test/classloading/ClassMapHog.addClass(Ljava/lang/String;Ljava/lang/Class;)Ljava/util/Map; !j9method 0x00000000004E1C40
<513b00> Bytecode index = 65, inlineDepth = 0, PC offset = 0x0000000000000AA3
<513b00> stackMap=0x00007F5CA137E59C, slots=I16(0x0003) parmBaseOffset=I16(0x0008), parmSlots=U16(0x0003), localBaseOffset=I16(0xFF88)
<513b00> Described JIT args starting at 0x0000000000523000 for U16(0x0003) slots
<513b00> O-Slot: : a2[0x0000000000523000] = 0x0000000705AF0A38
<513b00> O-Slot: : a1[0x0000000000523008] = 0x00000007E8352798
<513b00> O-Slot: : a0[0x0000000000523010] = 0x00000007E83527A8
<513b00> Described JIT temps starting at 0x0000000000522F80 for IDATA(0x000000000000000F) slots
<513b00> O-Slot: : t14[0x0000000000522F80] = 0x00000007FFF80000 <--------
<513b00> O-Slot: : t13[0x0000000000522F88] = 0x00000007FFF89D00
<513b00> I-Slot: : t12[0x0000000000522F90] = 0x00000007FFEC3080
<513b00> O-Slot: : t11[0x0000000000522F98] = 0x00000007FAF16178
<513b00> I-Slot: : t10[0x0000000000522FA0] = 0x0000000000000001
<513b00> I-Slot: : t9[0x0000000000522FA8] = 0x00000007FFF896D0
<513b00> I-Slot: : t8[0x0000000000522FB0] = 0x0000000000000001
<513b00> I-Slot: : t7[0x0000000000522FB8] = 0x00000007FFF896D0
<513b00> I-Slot: : t6[0x0000000000522FC0] = 0x000000000001A99C
<513b00> I-Slot: : t5[0x0000000000522FC8] = 0x00000007FEC041F8
<513b00> I-Slot: : t4[0x0000000000522FD0] = 0x00000007FFEC30A8
<513b00> I-Slot: : t3[0x0000000000522FD8] = 0x00000007FFF89AD0
<513b00> I-Slot: : t2[0x0000000000522FE0] = 0x00000007FEC06B78
<513b00> I-Slot: : t1[0x0000000000522FE8] = 0x00000007058BDC40
<513b00> I-Slot: : t0[0x0000000000522FF0] = 0x000000070598FAE8
<513b00> JIT-RegisterMap = UDATA(0x0000000000000002)
<513b00> JIT-RegisterMap-O-Slot[0x0000000000522EC8] = 0x00000007FFF89DC0 (jit_rbx)
<513b00> JIT-RegisterMap-I-Slot[0x0000000000522ED0] = UDATA(0x00000007FFF89C00) (jit_r9)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E49F0] = UDATA(0x0000000000000000) (jit_r10)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E49F8] = UDATA(0x00000007FFF89C10) (jit_r11)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E4A00] = UDATA(0x00000007E8352760) (jit_r12)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E4A08] = UDATA(0x00000007FFF89E00) (jit_r13)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E4A10] = UDATA(0x00000007E8352788) (jit_r14)
<513b00> JIT-RegisterMap-I-Slot[0x00007F5CA00E4A18] = UDATA(0x00000007FFF89DE8) (jit_r15)
<513b00> JIT-Frame-RegisterMap[0x0000000000522F50] = UDATA(0x0000000000000001) (jit_rbx)
<513b00> JIT-Frame-RegisterMap[0x0000000000522F58] = UDATA(0x00000007FFF896D0) (jit_r9)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E49F0] = UDATA(0x0000000000000000) (jit_r10)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E49F8] = UDATA(0x00000007FFF89C10) (jit_r11)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E4A00] = UDATA(0x00000007E8352760) (jit_r12)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E4A08] = UDATA(0x00000007FFF89E00) (jit_r13)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E4A10] = UDATA(0x00000007E8352788) (jit_r14)
<513b00> JIT-Frame-RegisterMap[0x00007F5CA00E4A18] = UDATA(0x00000007FFF89DE8) (jit_r15)
This O-slot points inside the object !j9object 0x7FFF7FFB8
0x7FFF7FFB0 : 00000000 00000000 0004a200 0000008c [ ................ ] <--- object start
0x7FFF7FFC0 : fff7ffc8 00000007 00750070 006c0062 [ ........p.u.b.l. ]
0x7FFF7FFD0 : 00630069 00730020 00610074 00690074 [ i.c. .s.t.a.t.i. ]
0x7FFF7FFE0 : 00200063 00690066 0061006e 0020006c [ c. .f.i.n.a.l. . ]
0x7FFF7FFF0 : 006e0069 00200074 0061006a 00610076 [ i.n.t. .j.a.v.a. ]
0x7FFF80000 : 006e002e 00740065 0048002e 00740074 [ ..n.e.t...H.t.t. ] <--- mid object pointer
0x7FFF80010 : 00550070 004c0052 006f0043 006e006e [ p.U.R.L.C.o.n.n. ]
0x7FFF80020 : 00630065 00690074 006e006f 0000002e [ e.c.t.i.o.n..... ]
0x7FFF80030 : 00000000 00000000 00000000 00000000 [ ................ ]
0x7FFF80040 : 00000000 00000000 00000000 00000000 [ ................ ]
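For readers following the dump, here is a minimal Python sketch of how the words around the slot target can be decoded. It assumes the dump prints little-endian 32-bit words (which matches the ASCII column above) and that the payload is UTF-16 character data. The helper names `words_to_text` and `pointer_to_text` are made up for this illustration; they are not part of jdmpview or OpenJ9.

```python
import struct

def words_to_text(words):
    """Decode little-endian 32-bit dump words as UTF-16LE text."""
    raw = b"".join(struct.pack("<I", w) for w in words)
    return raw.decode("utf-16-le")

def pointer_to_text(value):
    """Reinterpret a 64-bit value as four UTF-16LE code units."""
    return struct.pack("<Q", value).decode("utf-16-le")

# Words copied from the dump line at 0x7FFF80000 (the slot target).
dump_at_slot_target = [0x006E002E, 0x00740065, 0x0048002E, 0x00740074]
print(words_to_text(dump_at_slot_target))      # .net.Htt

# destinationObjectPtr from the backtrace, read as character data.
crash_ptr = 0x00740065006E0028
print(pointer_to_text(crash_ptr))              # (net

# First 8 bytes at the slot target versus the faulting pointer.
first_qword = dump_at_slot_target[0] | (dump_at_slot_target[1] << 32)
print(hex(first_qword ^ crash_ptr))            # 0x6

# Offset of the slot target from the object start reported above.
print(hex(0x7FFF80000 - 0x7FFF7FFB8))          # 0x48
```

The faulting pointer differs from the 8 bytes at 0x7FFF80000 only in the low bits 0x6. That would be consistent with the GC treating string data as a forwarded header and masking off tag bits, but that reading is an assumption and not something confirmed in this thread.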
BTW it is easy to find the problematic slot using gccheck in jdmpview:
> !gccheck all,noobjectheap:all:midscavenge,quiet
Starting GC Check
Checking CLASS HEAP...done (1799 ms).
Checking REMEMBERED SET...done (57 ms).
Checking UNFINALIZED...done (7 ms).
Checking FINALIZABLE...done (3 ms).
Checking OWNABLE_SYNCHRONIZER...done (2 ms).
Checking STRING TABLE...done (1384 ms).
Checking CLASS LOADERS...done (5 ms).
Checking JNI GLOBAL REFS...done (18 ms).
Checking JNI WEAK GLOBAL REFS...done (1 ms).
Checking JVMTI OBJECT TAG TABLES...done (4 ms).
Checking VM CLASS SLOTS...done (0 ms).
Checking MONITOR TABLE...done (9 ms).
Checking VM THREAD SLOTS...done (386 ms).
Checking THREAD STACKS... <gc check (1): from debugger: THREAD STACKS: slot 513b00(522f80) -> 7fff80000: not in an object segment>
done (384 ms).
Done (4113ms)
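If several cores need to be triaged, the error lines can be pulled out of the gccheck output with a small script. The sketch below is only an illustration: the regular expression is inferred from the single error line shown above, so other gccheck message shapes may not match, and `parse_gccheck_errors` is a hypothetical helper name, not a jdmpview feature.

```python
import re

# Pattern inferred from the one error line in this thread (an assumption).
GCCHECK_SLOT_ERROR = re.compile(
    r"<gc check \((?P<count>\d+)\): from debugger: (?P<area>[A-Z ]+): "
    r"slot (?P<owner>[0-9a-fA-F]+)\((?P<slot>[0-9a-fA-F]+)\) -> "
    r"(?P<value>[0-9a-fA-F]+): (?P<reason>[^>]+)>"
)

def parse_gccheck_errors(text):
    """Return one dict per slot error found in gccheck output."""
    errors = []
    for m in GCCHECK_SLOT_ERROR.finditer(text):
        errors.append({
            "area": m["area"],
            "owner": int(m["owner"], 16),   # 513b00 in the sample line
            "slot": int(m["slot"], 16),     # address of the stack slot
            "value": int(m["value"], 16),   # what the slot points to
            "reason": m["reason"].strip(),
        })
    return errors

sample = (
    "Checking THREAD STACKS... <gc check (1): from debugger: THREAD STACKS: "
    "slot 513b00(522f80) -> 7fff80000: not in an object segment>\n"
    "done (384 ms).\n"
)
for err in parse_gccheck_errors(sample):
    print(err["area"], hex(err["slot"]), "->", hex(err["value"]), err["reason"])
```

The slot address extracted here (0x522f80) is the same `objectPtrIndirect` value that appears in the `copyAndForward` frame of the backtrace.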
FYI this failure looks very similar to https://github.com/eclipse-openj9/openj9/issues/10984#issuecomment-870044385
Ah, it is even the same problematic method net/adoptopenjdk/test/classloading/ClassMapHog.addClass for both cases
Link to the grinder: https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/16570/consoleText
The test fails on x86-64 JDK11; other platforms have not been run through the grinder to determine whether the crash is present there. Failure rate is 3/358.
Output from the crashed test:
In another failed iteration the output is different:
In both cases the failure is inside GC code.
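With a failure rate this low, it may help to estimate how many grinder iterations a candidate fix would need to pass before the result means much. The sketch below assumes iterations are independent and that 3/358 is the true per-iteration failure probability; with only three observed failures that estimate is quite uncertain, so treat the result as a rough lower bound. `runs_needed` is a hypothetical helper written for this comment.

```python
import math

def runs_needed(failures, runs, confidence=0.95):
    """Iterations needed to see at least one failure with the given
    probability, assuming independent runs at rate failures/runs."""
    p = failures / runs
    # Solve 1 - (1 - p)**n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

print(runs_needed(3, 358))
print(runs_needed(3, 358, confidence=0.99))
```

A clean grinder run shorter than the first number would say little about whether a fix actually removed the crash.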