Open llxia opened 3 years ago
FYI @pshipton @zl-wang
@dmitripivkine fyi
@llxia I am looking at the failed job and don't see a link to the artifacts. Were the results for this failure stored?
Adding the JIT label as well for visibility. This assertion is in Marking, so it is most likely a heap consistency problem.
p10aix009.rtp.raleigh.ibm.com is not configured properly, so we cannot run the test pipeline on it. We used it temporarily for testing 0.27.1 because it is the only p10 machine we have so far. The artifacts are on p10aix009. I will try to upload them to the Artifactory server.
@llxia Thank you, I have files downloaded to /team/Dmitri/13453/
The reason for the assertion is a stale pointer in the object:
> !j9object 0x1c2077c28
!J9Object 0x00000001C2077C28 {
struct J9Class* clazz = !j9class 0x30052100 // java/util/HashMap
Object flags = 0x00000010;
I lockword = 0x00000008 (offset = 0) (java/lang/Object) <hidden>
Ljava/util/Set; keySet = !fj9object 0x0 (offset = 4) (java/util/AbstractMap)
Ljava/util/Collection; values = !fj9object 0x0 (offset = 8) (java/util/AbstractMap)
[Ljava/util/HashMap$Node; table = !fj9object 0xffff0e90 (offset = 12) (java/util/HashMap) <--- uncompressed 0x7FFF87480
Ljava/util/Set; entrySet = !fj9object 0x0 (offset = 16) (java/util/HashMap)
I size = 0x00000010 (offset = 20) (java/util/HashMap)
I modCount = 0x00000010 (offset = 24) (java/util/HashMap)
I threshold = 0x00000018 (offset = 28) (java/util/HashMap)
F loadFactor = 0x3F400000 (offset = 32) (java/util/HashMap)
}
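The "uncompressed" annotation on the `table` field can be checked by hand. The following is a minimal sketch, assuming compressed references are decoded by shifting the 32-bit slot value left by 3; that shift is inferred from the values in this thread, not read from the core file, and `decompress` is a hypothetical helper name.

```python
# Assumption: a compressed reference decodes as (slot value << shift).
# The shift of 3 is inferred from the values in this dump, not read from the VM.
COMPRESSED_SHIFT = 3

def decompress(slot_value, shift=COMPRESSED_SHIFT):
    """Turn a 32-bit compressed reference into a full heap address."""
    return (slot_value & 0xFFFFFFFF) << shift

# HashMap.table slot of the object at 0x1C2077C28
table_addr = decompress(0xFFFF0E90)
# ClassMapHog.hmap slot, taken from the ClassMapHog dump later in this thread
hmap_addr = decompress(0x3860822E)

print(hex(table_addr))
print(hex(hmap_addr))
```

Under this assumption the `table` slot decodes to the address quoted in the annotation, and the `hmap` slot decodes to the `java/util/HashMap` address that appears in the root path, so the same decoding applies to both objects.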
This object is alive and marked:
> !markmap ismarked 0x1c2077c28
Object 0x00000001C2077C28 is marked
The address 0x7FFF87480 is located in the reserved part of the Nursery and contains garbage:
0x7FFF87480 : 00700061 00720061 006d0023 00310000 [ .p.a.r.a.m.#.1.. ]
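To see why this memory does not look like a `HashMap$Node[]` header, the four words can be decoded as text. This is a small sketch, assuming the words are big-endian as printed (AIX on POWER is big-endian) and that the content is UTF-16; neither is confirmed by the core file.

```python
import struct

# The four 32-bit words shown at 0x7FFF87480, in dump order.
words = [0x00700061, 0x00720061, 0x006D0023, 0x00310000]

# Assumption: big-endian byte order, matching how the debugger printed the words.
raw = struct.pack(">4I", *words)

# Assumption: the bytes are UTF-16 code units (leftover character data).
text = raw.decode("utf-16-be")
print(repr(text))
```

The decoded characters match the ASCII column of the dump, which suggests the slot points at leftover character data rather than at a live array.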
@0xdaryl FYI
Finding the root path for this object:
> !rootpathfindall 0x1c2077c28
========================================
net/adoptopenjdk/test/classloading/ClassMapHog@0x00000001C3040DA0
java/util/HashMap@0x00000001C3041170
java/util/HashMap$Node[]@0x00000001C116CB30 = { 0x0, 0x0, 0x1c3599e40, 0x0, 0x0, 0x1c1219258, 0x0, 0x1c3528620, 0x1c366bc90, 0x0, ... }
java/util/HashMap$Node@0x00000001C2077C00
java/util/HashMap@0x00000001C2077C28
So the top object is:
> !j9object 0x00000001C3040DA0
!J9Object 0x00000001C3040DA0 {
struct J9Class* clazz = !j9class 0x30501B00 // net/adoptopenjdk/test/classloading/ClassMapHog
Object flags = 0x00000020;
I lockword = 0x00000008 (offset = 0) (java/lang/Object) <hidden>
J objCount = 0x0000000000000000 (offset = 4) (net/adoptopenjdk/test/classloading/ClassMapHog)
Z serialize = 0x00000000 (offset = 20) (net/adoptopenjdk/test/classloading/ClassMapHog)
Ljava/util/HashMap; hmap = !fj9object 0x3860822e (offset = 12) (net/adoptopenjdk/test/classloading/ClassMapHog)
I loopCount = 0x00000002 (offset = 24) (net/adoptopenjdk/test/classloading/ClassMapHog)
Ljava/lang/String; parmlist = !fj9object 0x38066e09 (offset = 16) (net/adoptopenjdk/test/classloading/ClassMapHog)
}
It is referenced from the Java stack of thread 0x30555200:
> !threads stackslots | grep -i 1C3040DA0
<30555200> JIT-Resolve-RegisterMap[0x0000010028E1F828] = UDATA(0x00000001C3040DA0) (jit_r21)
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F828] = UDATA(0x00000001C3040DA0) (jit_r21)
<30555200> JIT-Frame-RegisterMap[0x0000010028E1F828] = UDATA(0x00000001C3040DA0) (jit_r21)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F828] = 0x00000001C3040DA0 (jit_r21)
The object is referenced from this frame:
<30555200> JIT frame: bp = 0x0000000030566BA8, pc = 0x000001001444B270, unwindSP = 0x0000000030566A40, cp = 0x00000000305011D0, arg0EA = 0x0000000030566BB0, jitInfo = 0x0000010024EADC78
<30555200> Method: net/adoptopenjdk/test/classloading/ClassMapHog.run()V !j9method 0x0000000030501D00
<30555200> Bytecode index = 269, inlineDepth = 0, PC offset = 0x00000000000006F4
<30555200> stackMap=0x0000010024EAE4C8, slots=I16(0x0001) parmBaseOffset=I16(0x0008), parmSlots=U16(0x0001), localBaseOffset=I16(0xFF60)
<30555200> Described JIT args starting at 0x0000000030566BB0 for U16(0x0001) slots
<30555200> I-Slot: : a0[0x0000000030566BB0] = 0x00000001C02E54F8
<30555200> Described JIT temps starting at 0x0000000030566B08 for IDATA(0x0000000000000014) slots
<30555200> Address 0x0000010024EAFD17
<30555200> Num internal ptr map bytes U8(0x05)
<30555200> Address 0x0000010024EAFD18
<30555200> Index of first internal ptr I16(0x0013)
<30555200> Address 0x0000010024EAFD1A
<30555200> Offset of first internal ptr I16(0xFFF0)
<30555200> Address 0x0000010024EAFD1C
<30555200> Num distinct pinning arrays U8(0x02)
<30555200> Before object slot walk &address : 0x0000000030566B98 address : 0x00000001C212CF58 bp 0x0000000030566BA8 offset of first internal ptr I16(0xFFF0)
<30555200> After object slot walk for pinning array with &address : 0x0000000030566B98 old address 0x00000001C212CF58 new address 0x00000001C212CF58 displacement IDATA(0x0000000000000000)
<30555200> For pinning array U8(0x00) num internal pointer stack slots U8(0x00)
<30555200> Before object slot walk &address : 0x0000000030566BA0 address : 0x00000001C212CF70 bp 0x0000000030566BA8 offset of first internal ptr I16(0xFFF0)
<30555200> After object slot walk for pinning array with &address : 0x0000000030566BA0 old address 0x00000001C212CF70 new address 0x00000001C212CF70 displacement IDATA(0x0000000000000000)
<30555200> For pinning array U8(0x01) num internal pointer stack slots U8(0x00)
<30555200> I-Slot: : t19[0x0000000030566B08] = 0x00000001C0323100
<30555200> I-Slot: : t18[0x0000000030566B10] = 0x00000001C03230C8
<30555200> I-Slot: : t17[0x0000000030566B18] = 0x000001001454D638
<30555200> I-Slot: : t16[0x0000000030566B20] = 0x00000000C0320DE0
<30555200> I-Slot: : t15[0x0000000030566B28] = 0x0000000900000000
<30555200> I-Slot: : t14[0x0000000030566B30] = 0x00000000300C6F00
<30555200> I-Slot: : t13[0x0000000030566B38] = 0x00000001C03230C8
<30555200> I-Slot: : t12[0x0000000030566B40] = 0x00000001C0323100
<30555200> O-Slot: : t11[0x0000000030566B48] = 0x00000001C2FF3E20
<30555200> O-Slot: : t10[0x0000000030566B50] = 0x00000001C212CF88
<30555200> I-Slot: : t9[0x0000000030566B58] = 0x000000003025DE00
<30555200> I-Slot: : t8[0x0000000030566B60] = 0x00000001C00EAC18
<30555200> I-Slot: : t7[0x0000000030566B68] = 0x0000000000000009
<30555200> I-Slot: : t6[0x0000000030566B70] = 0x00000007FFFAB798
<30555200> I-Slot: : t5[0x0000000030566B78] = 0x000000003025DE00
<30555200> I-Slot: : t4[0x0000000030566B80] = 0x00000007FFFAB880
<30555200> I-Slot: : t3[0x0000000030566B88] = 0x000000003025DE00
<30555200> I-Slot: : t2[0x0000000030566B90] = 0x00000001C0000AE0
<30555200> I-Slot: : t1[0x0000000030566B98] = 0x00000001C212CF58
<30555200> I-Slot: : t0[0x0000000030566BA0] = 0x00000001C212CF70
<30555200> JIT-RegisterMap = UDATA(0x00000000000066DD)
<30555200> JIT-RegisterMap-O-Slot[0x0000000030566A18] = 0x00000001C0353478 (jit_r31)
<30555200> JIT-RegisterMap-I-Slot[0x0000000030566A10] = UDATA(0x0000000000000000) (jit_r30)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F868] = 0x00000001C03370C0 (jit_r29)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F860] = 0x00000001C00C11F8 (jit_r28)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F858] = 0x00000001C2FF3E20 (jit_r27)
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F850] = UDATA(0x000000000000000B) (jit_r26)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F848] = 0x00000001C212CF88 (jit_r25)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F840] = 0x00000001C3040D90 (jit_r24)
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F838] = UDATA(0x0000000030504700) (jit_r23)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F830] = 0x00000001C00EAC18 (jit_r22)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F828] = 0x00000001C3040DA0 (jit_r21) <-----------------
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F820] = UDATA(0x000000003025DE00) (jit_r20)
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F818] = UDATA(0x000000003025DE00) (jit_r19)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F810] = 0x00000001C212CF98 (jit_r18)
<30555200> JIT-RegisterMap-O-Slot[0x0000010028E1F808] = 0x00000001C0960CA8 (jit_r17)
<30555200> JIT-RegisterMap-I-Slot[0x0000010028E1F800] = UDATA(0x0000000000000001) (jit_r16)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A70] = UDATA(0x0000000000000000) (jit_r16)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A78] = UDATA(0xFFFFFFFFFFFFFFFF) (jit_r17)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A80] = UDATA(0x0000000000000000) (jit_r18)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A88] = UDATA(0x09001000A09C4DC0) (jit_r19)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A90] = UDATA(0x0000000000000000) (jit_r20)
<30555200> JIT-Frame-RegisterMap[0x0000000030566A98] = UDATA(0xCAD2BC9B817F20E4) (jit_r21)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AA0] = UDATA(0x0000000000000000) (jit_r22)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AA8] = UDATA(0x00000000A0000000) (jit_r23)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AB0] = UDATA(0x0000010028E1F8C0) (jit_r24)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AB8] = UDATA(0x0000000000000000) (jit_r25)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AC0] = UDATA(0x0000000030507000) (jit_r26)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AC8] = UDATA(0x00000001C02DEED8) (jit_r27)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AD0] = UDATA(0x00000007FFF9D160) (jit_r28)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AD8] = UDATA(0x00000007FFF9D160) (jit_r29)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AE0] = UDATA(0x0000000030501B00) (jit_r30)
<30555200> JIT-Frame-RegisterMap[0x0000000030566AE8] = UDATA(0x00000007FFF9D160) (jit_r31)
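The `JIT-RegisterMap = UDATA(0x00000000000066DD)` value lines up with the O-Slot/I-Slot labels printed for jit_r16 to jit_r31. The sketch below assumes that bit i of the map describes register jit_r(31 - i); this mapping is only an observation from this one dump, not the documented encoding, and `object_registers` is a hypothetical helper.

```python
REGISTER_MAP = 0x66DD  # JIT-RegisterMap value from the frame dump

def object_registers(register_map, lowest=16, highest=31):
    """List the registers flagged as holding object references.

    Assumption (inferred from this dump only): bit i of the map
    corresponds to jit_r(highest - i), and a set bit means O-Slot.
    """
    count = highest - lowest + 1
    return [
        "jit_r%d" % (highest - bit)
        for bit in range(count)
        if (register_map >> bit) & 1
    ]

print(object_registers(REGISTER_MAP))
```

With that reading, jit_r21 is among the registers described as object slots, which agrees with the `JIT-RegisterMap-O-Slot` line marked with the arrow.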
With the release available on the IBM portal (jdk11-11.0.12+7) I was hitting another bug (__postP10GenericCopy) that has since been fixed and merged. As there is no official version that contains that fix, I tried to reproduce the problem locally using the latest personal OpenJ9 build.
OpenJDK Runtime Environment (build 11.0.13-internal+0-adhoc.jenkins.BuildJDK11ppc64aixPersonal)
Eclipse OpenJ9 VM (build master-ef609f1cf16, JRE 11 AIX ppc64-64-Bit Compressed References 20210915_748 (JIT enabled, AOT enabled)
OpenJ9 - ef609f1cf16
OMR - c818b04c631
JCL - ddc29ca7606 based on jdk-11.0.13+5)
The test passed for the following scenarios:
count=0
count=0,optlevel=noopt
count=0,optlevel=warm
count=0,optlevel=hot
count=0,optlevel=scorching
OPENJ9_JAVA_OPTIONS
The test was run on the same P10 system.
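For anyone repeating these runs, the scenario list can be generated rather than typed by hand. This is only a sketch: it assumes the scenarios were passed as `-Xjit:` suboptions through the `OPENJ9_JAVA_OPTIONS` environment variable (the comment above lists the option strings and the variable name but not the exact command), and `jit_option_sets` and `env_for` are hypothetical helper names.

```python
import os

# Optimization levels listed in the comment above.
OPT_LEVELS = ["noopt", "warm", "hot", "scorching"]

def jit_option_sets():
    """Build one -Xjit option string per scenario (assumed syntax)."""
    sets = ["-Xjit:count=0"]
    sets += ["-Xjit:count=0,optlevel=%s" % level for level in OPT_LEVELS]
    return sets

def env_for(jit_options, base_env=None):
    """Return a copy of the environment with OPENJ9_JAVA_OPTIONS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENJ9_JAVA_OPTIONS"] = jit_options
    return env

for options in jit_option_sets():
    print(options)
```

Each returned environment could then be handed to whatever launches the test, for example through the `env` argument of `subprocess.run`; the launch command itself is not shown in this thread, so it is left out here.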
Closing assuming fixed, it can be re-opened if the problem is seen again.
Another one, job_output.php?id=17203556 - ub20lertp1-8 Linux PPC LE 64bit Compressed Pointers] 80 Load_Level_2.abbs.5mins.Mode610 -Xcompressedrefs -Xjit -Xgcpolicy:gencon
Result store failed so there are no diagnostics.
j> 18:50:18 20211014 18:50:18 Thread Control Engine INFO: Starting JobSet Primary
j> 18:50:31 22:50:31.693 0x17ae00 j9mm.107 * ** ASSERTION FAILED ** at ../../../gc_glue_java/MarkingDelegate.hpp:121: ((false && ((UDATA)0x99669966 == clazz->eyecatcher)))
100x grinder build_info.php?build_id=14736 - passed
@IBMJimmyk this looks like a GCMap issue or, less likely, a missing write barrier. I encountered the same stale-pointer issue during the AIX7.3 GA1.5 performance evaluation. Kevin and I haven't nailed it down yet. See if you can make more progress with this new core dump (kca has a few commands that help with this type of investigation).
This was detected on AIX power10 (p10aix009.rtp.raleigh.ibm.com). Internal job cmd_test/213