Open JasonFengJ9 opened 2 years ago
@JasonFengJ9 Do you know are Artifacts stored for this failure? I see link but it returns 404... May be it its wrong link
Jenkins Artifacts link is broken (runtimes/backlog/issues/594
), use the link within the console output instead.
From Nursery subspace !MM_MemorySubSpaceGeneric 0x7276980aad60
Memory Pool !mm_memorypooladdressorderedlist 0x00007276980A5AF0
_heapFreeList = !mm_heaplinkedfreeheader 0x00000000FF8DA238
0xFF8DA200 : 00630065 006f0074 004c0072 0061006f [ e.c.t.o.r.L.o.a. ]
0xFF8DA210 : 00530064 006f0074 00650072 00650054 [ d.S.t.o.r.e.T.e. ]
0xFF8DA220 : 00740073 00400073 00000000 00000000 [ s.t.s.@......... ]
0xFF8DA230 : 00000000 00000000 00000001 00000000 [ ................ ] <---- Linked Free Header size 0
0xFF8DA240 : 00000000 00000000 00392b00 00000008 [ .........+9..... ]
0xFF8DA250 : ff8da230 00000000 0012a000 00000000 [ 0............... ]
0xFF8DA260 : 00000000 00000000 00000000 00000000 [ ................ ]
0xFF8DA270 : 000b6c00 00000002 ff8da280 00000000 [ .l.............. ]
@0xdaryl FYI Looks like this is heap corruption. It seems has no relation (not applicable) to recent changes in Memory Pool code. So I have no idea how this zeros were written to the heap
So again, an assertion has been triggered because there is zero at 0xFF8DA240
. This location happened to be Linked Free Header Size slot. This is not valid value
There are two possibilities:
For first case I can not see how 0 can be written as a size in GC code. Minimum size to be written for Multi Slot Hole (0x1 at 0xFF8DA238
) is 0x10. The size
slot is protected and operated by MM_HeapLinkedFreeHeader
functions only. It is plausible this is HW failure and real value has not been written.
For second case heap corruption there is not enough information to investigate, this is hopeless realistically.
The problem might manifest differently if it happen next time. If we see the same assertion again it might help to understand problem better
10 test iterations grinder /job/Grinder/18584/
10 test iterations grinder
/job/Grinder/18584/
summary for grinder 10 test iterations on 5 machines (50 total):
As this problem has only shown up on PPC LE (and seems readily reproducible), @zl-wang could you assign this for investigation please?
for Iteration 4 for the grinder /job/Grinder_iteration_4/398/consoleFull
I have results downloaded to /team/Dmitri/13574/iteration4/
. Summary of system core included:
./aqa-tests/TKG/output_16335049886092/jdk_vector_1/work/jdk/incubator/vector/Long64VectorLoadStoreTests/core.20211006.011318.27564.0001.dmp <-- assertion
./aqa-tests/TKG/output_16334994679967/jdk_vector_1/work/jdk/incubator/vector/Double64VectorLoadStoreTests/core.20211005.231026.18690.0001.dmp <-- assertion
./aqa-tests/TKG/output_16334663781698/jdk_vector_1/work/jdk/incubator/vector/Double128VectorLoadStoreTests/core.20211005.135420.12035.0001.dmp <-- vmState=0x00040000 Compiled_method=java/util/ArrayList.addAll(Ljava/util/Collection;)Z
./aqa-tests/TKG/output_16334774767826/jdk_vector_1/work/jdk/incubator/vector/Short64VectorLoadStoreTests/core.20211005.174549.28520.0001.dmp <-- vmState=0x00000000
./aqa-tests/TKG/output_16334884518017/jdk_vector_1/work/jdk/incubator/vector/Int64VectorLoadStoreTests/core.20211005.202807.7386.0001.dmp <-- vmState=0x00000000
./aqa-tests/TKG/output_16334828868042/jdk_vector_1/work/jdk/incubator/vector/Long64VectorLoadStoreTests/core.20211005.190642.1687.0001.dmp <-- assertion
I am going to look to assertions in GC cores
@pshipton @tajila @0xdaryl @zl-wang FYI There are multiple failures in jdk_vector_1
test on Linux PPC LE. I am not sure all these failures have the same root or occur on PPC LE only. There are a few cores to look at if you have resources. There are more results are available for download - please let me know if you need them.
@mnalam-p please take a look from JIT perspective
This problem was reported on the head stream. Based on the grinder results, I'm setting as a blocker, but moving to the 0.30 milestone. ~@dmitripivkine can you restart your grinder with a 0.29 JVM to check if the problem occurs there. The latest open build is https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Build_JDK17_ppc64le_linux_Nightly/44/OpenJ9-JDK17-ppc64le_linux-20211005-215708.tar.gz~
can you restart your grinder with a 0.29 JVM to check if the problem occurs there
Actually, this is the vector incubator, which we're not supporting in 0.29 so it doesn't matter.
@gita-omr fyi this problem.
Another occurrence at JDK17 head stream at job/Test_openjdk17_j9_extended.openjdk_ppc64le_linux/21/consoleFull
(ubu18le-rt1-4
)
11:30:44 openjdk version "17" 2021-09-14
11:30:44 IBM Semeru Runtime Open Edition 17+35 (build 17+35)
11:30:44 Eclipse OpenJ9 VM 17+35 (build master-0fe704468, JRE 17 Linux ppc64le-64-Bit Compressed References 20211009_11 (JIT enabled, AOT enabled)
11:30:44 OpenJ9 - 0fe704468
11:30:44 OMR - 8021d5725
11:30:44 JCL - 92442f0e28 based on jdk-17+35)
16:56:49 STDERR:
16:56:49 WARNING: Using incubator modules: jdk.incubator.vector, jdk.incubator.foreign
16:56:49 20:56:33.610 0x239600 j9mm.107 * ** ASSERTION FAILED ** at ../../../../../../omr/gc/base/MemoryPoolAddressOrderedList.cpp:633: ((false && (freeEntrySize >= _minimumFreeEntrySize)))
16:56:49 JVMDUMP039I Processing dump event "traceassert", detail "" at 2021/10/09 13:56:33 - please wait.
Observed at job/Test_openjdk17_j9_extended.openjdk_aarch64_linux/30/
(cent7-aarch64-4
)
openjdk version "17.0.1" 2021-10-19
IBM Semeru Runtime Certified Edition 17.0.1+12 (build 17.0.1+12)
Eclipse OpenJ9 VM 17.0.1+12 (build master-5afc033b2, JRE 17 Linux aarch64-64-Bit Compressed References 20211204_49 (JIT enabled, AOT enabled)
OpenJ9 - 5afc033b2
OMR - 5e9418dcf
JCL - a60fb4eda99 based on jdk-17.0.1+12)
[2021-12-04T19:17:39.026Z] variation: Mode650
[2021-12-04T19:17:39.026Z] JVM_OPTIONS: -XX:-UseCompressedOops
[2021-12-04T19:24:00.246Z] --------------------------------------------------
[2021-12-04T19:24:00.246Z] TEST: jdk/incubator/vector/Double64VectorLoadStoreTests.java
[2021-12-04T19:24:00.250Z] STDERR:
[2021-12-04T19:24:00.250Z] WARNING: Using incubator modules: jdk.incubator.vector, jdk.incubator.foreign
[2021-12-04T19:24:00.250Z] 19:23:48.036 0xffff9c3f6100 j9mm.107 * ** ASSERTION FAILED ** at ../../../../../../omr/gc/base/MemoryPoolAddressOrderedList.cpp:633: ((false && (freeEntrySize >= _minimumFreeEntrySize)))
[2021-12-04T19:24:00.250Z] JVMDUMP039I Processing dump event "traceassert", detail "" at 2021/12/04 14:23:48 - please wait.
[2021-12-04T19:46:35.923Z] jdk_vector_1_FAILED
@IBMJimmyk please take a look
Okay, I will take a look.
Failure link
From an internal build
job/Test_openjdk17_j9_extended.openjdk_ppc64le_linux/20/
(ub20lert-1-7
):Rerun in Grinder - Change TARGET to run only the failed test targets.
Optional info
Failure output (captured from console output)
fyi @dmitripivkine