eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

DaaLoadTest_daa2_special_22 hang #11526

Closed pshipton closed 3 years ago

pshipton commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Release_testList_4/2 DaaLoadTest_daa2_special_22 aix71-p8-3

variation: Mode614 JVM_OPTIONS: -Xcompressedrefs -Xgcpolicy:gencon -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xmn512k -Xcheck:gc:vmthreads:all:quiet

Diagnostics: https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Release_testList_4/2/system_test_output.tar.gz

javacore shows:

2LKREGMON          GCExtensions::gcExclusiveAccessMutex lock (0x0000010010051100): <unowned>
3LKNOTIFYQ            Waiting to be notified:
3LKWAITNOTIFY            "load-3" (J9VMThread:0x0000000030451100)
3LKWAITNOTIFY            "load-13" (J9VMThread:0x000000003046BE00)
3XMTHREADINFO      "load-3" J9VMThread:0x0000000030451100, omrthread_t:0x0000010022708A20, java/lang/Thread:0x00000000800BB420, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x29, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x20E00B5, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00001081)
3XMCPUTIME               CPU usage total: 0.624548000 secs, user: 0.605136000 secs, system: 0.019412000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/StringBuilder.ensureCapacityImpl(StringBuilder.java:802(Compiled Code))
4XESTACKTRACE                at java/lang/StringBuilder.append(StringBuilder.java:588(Compiled Code))
4XESTACKTRACE                at net/openj9/test/Utils.charArrayToString(Utils.java:1199(Compiled Code))
4XESTACKTRACE                at net/openj9/test/Utils.makeTestNameConverterValue(Utils.java:805(Compiled Code))
4XESTACKTRACE                at net/openj9/test/decimals/TestDecimalData.testConvertBigDecimalNormals(TestDecimalData.java:6812(Compiled Code))
3XMTHREADINFO      "load-13" J9VMThread:0x000000003046BE00, omrthread_t:0x0000010024210BA8, java/lang/Thread:0x00000000801EA840, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x33, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x49701C5, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00001081)
3XMCPUTIME               CPU usage total: 0.573284000 secs, user: 0.555769000 secs, system: 0.017515000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/String.<init>(String.java:841(Compiled Code))
4XESTACKTRACE                at java/lang/String.substring(String.java:2739(Compiled Code))
4XESTACKTRACE                at java/math/BigInteger.<init>(BigInteger.java:469(Compiled Code))
4XESTACKTRACE                at java/math/BigInteger.<init>(BigInteger.java:606(Compiled Code))
4XESTACKTRACE                at com/ibm/dataaccess/DecimalData.slowSignedPackedToBigDecimal(DecimalData.java:2554(Compiled Code))
4XESTACKTRACE                at com/ibm/dataaccess/DecimalData.convertPackedDecimalToBigDecimal(DecimalData.java:1325(Compiled Code))
4XESTACKTRACE                at com/ibm/dataaccess/DecimalData.convertUnicodeDecimalToBigDecimal(DecimalData.java:2247(Compiled Code))
4XESTACKTRACE                at net/openj9/test/decimals/TestDecimalData.testConvertBigDecimalNormals(TestDecimalData.java:7968(Compiled Code))
4XESTACKTRACE                at net/openj9/test/decimals/TestDecimalData.testNonExceptions(TestDecimalData.java:101(Compiled Code))
pshipton commented 3 years ago

@dmitripivkine fyi. Note this failed in the 0.24 release stream.

dmitripivkine commented 3 years ago

This hang occur because GC is waiting for JIT/GC class unload mutex lock to be acquired on the way to System GC in thread "load-6":

3XMTHREADINFO      "load-6" J9VMThread:0x0000000030457F00, omrthread_t:0x0000010023BCE0A0, java/lang/Thread:0x00000000800BB530, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x2C, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x519010F, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00001020)
3XMCPUTIME               CPU usage total: 0.619328000 secs, user: 0.598040000 secs, system: 0.021288000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/Runtime.gc(Native Method)
4XESTACKTRACE                at java/lang/System.gc(System.java:354)
4XESTACKTRACE                at net/openj9/test/arithmetics/TestArithmeticOperations.testGreaterThanPackedDecimal(TestArithmeticOperations.java:1838(Compiled Code))

3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               _event_wait+0x344 (0x09000000005843C8 [libpthreads.a+0x173c8])
4XENATIVESTACK               _cond_wait_local+0x4dc (0x0900000000592920 [libpthreads.a+0x25920])
4XENATIVESTACK               _cond_wait+0xc8 (0x0900000000592F0C [libpthreads.a+0x25f0c])
4XENATIVESTACK               pthread_cond_wait+0x19c (0x0900000000593B60 [libpthreads.a+0x26b60])
4XENATIVESTACK               IPRA.$monitor_wait_original+0x9b0 (0x0900000010746D34 [libj9thr29.so+0x9d34])
4XENATIVESTACK               omrthread_monitor_wait+0x40 (0x0900000010746104 [libj9thr29.so+0x9104])
4XENATIVESTACK               omrthread_rwmutex_enter_write+0x50 (0x0900000010752B34 [libj9thr29.so+0x15b34])
4XENATIVESTACK               enterClassUnloadMutex__21MM_ClassLoaderManagerFP18MM_EnvironmentBase+0xd0 (0x0900000011D4F874 [libj9gc29.so+0x12874])
4XENATIVESTACK               mainThreadGarbageCollectStarted__26MM_GlobalCollectorDelegateFP18MM_EnvironmentBase+0xe8 (0x0900000011DD34EC [libj9gc29.so+0x964ec])

2LKREGMON          JIT/GC class unload mutex lock (0x0000010010052A88): <unowned>
3LKNOTIFYQ            Waiting to be notified:
3LKWAITNOTIFY            "load-6" (J9VMThread:0x0000000030457F00)

This JIT/GC class unload mutex lock supposes to be released by JIT Compilation threads by GC request but seems it was not.

@0xdaryl @gita-omr FYI

pshipton commented 3 years ago

Started a 60x (10x by 6 machines) grinder to see if it reproduces https://ci.eclipse.org/openj9/job/Grinder/1320/

pshipton commented 3 years ago

The grinder passed.

dmitripivkine commented 3 years ago

FYI: One of JIT Compilation threads is not waiting but doing something for more than 100 minutes:

3XMTHREADINFO      "JIT Compilation Thread-006" J9VMThread:0x000000003001E600, omrthread_t:0x0000010020BEF350, java/lang/Thread:0x0000000080052A90, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x8, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x51E0291, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 6437.343774000 secs, user: 6437.232967000 secs, system: 0.110807000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x090000001117B110 [libj9jit29.so+0x88a110])
4XENATIVESTACK               (0x0900000010CAD228 [libj9jit29.so+0x3bc228])
4XENATIVESTACK               (0x0900000010CAAF24 [libj9jit29.so+0x3b9f24])
4XENATIVESTACK               (0x0900000010B0D728 [libj9jit29.so+0x21c728])
4XENATIVESTACK               omrsig_protect+0x488 (0x09000000107BABCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x0900000010AF15B8 [libj9jit29.so+0x2005b8])
4XENATIVESTACK               (0x090000001121E610 [libj9jit29.so+0x92d610])
4XENATIVESTACK               (0x0900000011220B4C [libj9jit29.so+0x92fb4c])
4XENATIVESTACK               (0x090000001121FE18 [libj9jit29.so+0x92ee18])
4XENATIVESTACK               omrsig_protect+0x488 (0x09000000107BABCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x090000001121F5F4 [libj9jit29.so+0x92e5f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x0900000010741808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
dmitripivkine commented 3 years ago

I have test results downloaded (full set including system core, java core etc) in case somebody wants to look

pshipton commented 3 years ago

Another one https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/22 DaaLoadTest_daa1_special_22

Diagnostics (just javacore's) https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/22/system_test_output.tar.gz

3XMTHREADINFO      "load-13" J9VMThread:0x0000000030463800, omrthread_t:0x000001002415F688, java/lang/Thread:0x00000000802238C0, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x33, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x53C02BD, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 0.144760000 secs, user: 0.139528000 secs, system: 0.005232000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/math/MutableBigInteger.divideKnuth(MutableBigInteger.java:1215)
4XESTACKTRACE                at java/math/MutableBigInteger.divide(MutableBigInteger.java:1153)
4XESTACKTRACE                at java/math/MutableBigInteger.divide(MutableBigInteger.java:1147)
4XESTACKTRACE                at java/math/BigInteger.smallToString(BigInteger.java:3832)
4XESTACKTRACE                at java/math/BigInteger.toString(BigInteger.java:3877)
4XESTACKTRACE                at java/math/BigInteger.toString(BigInteger.java:3905)
4XESTACKTRACE                at java/math/BigInteger.toString(BigInteger.java:3804)
4XESTACKTRACE                at java/math/BigInteger.toString(BigInteger.java:3959)
4XESTACKTRACE                at net/openj9/test/Utils.toPrecision(Utils.java:489)
4XESTACKTRACE                at net/openj9/test/arithmetics/TestPerformance.getReferenceValues(TestPerformance.java:433)
4XESTACKTRACE                at net/openj9/test/arithmetics/TestPerformance.testAdd(TestPerformance.java:105)
4XESTACKTRACE                at net/openj9/test/arithmetics/TestPerformance.testIntrinsicPerformance(TestPerformance.java:76)
pshipton commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/24 MathLoadTest_bigdecimal_special_17 variation: Mode555 JVM_OPTIONS: -XX:+UseCompressedOops -Xgcpolicy:balanced -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xcheck:gc:vmthreads:all:quiet

Diagnostics (just javacore's): https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/24/system_test_output.tar.gz

3XMTHREADINFO      "load-13" J9VMThread:0x0000000030463700, omrthread_t:0x000001002495E548, java/lang/Thread:0x0000000076C01B38, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x33, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x197032B, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000001)
3XMCPUTIME               CPU usage total: 2.957708000 secs, user: 2.944704000 secs, system: 0.013004000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at sun/reflect/NativeMethodAccessorImpl.invoke0(Native Method)
4XESTACKTRACE                at sun/reflect/NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62(Compiled Code))
4XESTACKTRACE                at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43(Compiled Code))
4XESTACKTRACE                at java/lang/reflect/Method.invoke(Method.java:498(Compiled Code))
4XESTACKTRACE                at junit/framework/TestCase.runTest(TestCase.java:176(Compiled Code))
dmitripivkine commented 3 years ago

Another one https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/22 DaaLoadTest_daa1_special_22

There is the same problem - JIT Compilation thread is running for ~10740 seconds and keeps GC class unload mutex lock and prevents STW GC to start:

3XMTHREADINFO      "JIT Compilation Thread-001" J9VMThread:0x0000000030013000, omrthread_t:0x00000100208C1940, java/lang/Thread:0x00000000800384D0, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x3, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x23D02CF, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 10739.630518000 secs, user: 10739.594203000 secs, system: 0.036315000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x090000000CFD0A10 [libj9jit29.so+0x889a10])
4XENATIVESTACK               (0x090000000CB02228 [libj9jit29.so+0x3bb228])
4XENATIVESTACK               (0x090000000CAFFF24 [libj9jit29.so+0x3b8f24])
4XENATIVESTACK               (0x090000000C963868 [libj9jit29.so+0x21c868])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000000ABF2BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x090000000C9476B8 [libj9jit29.so+0x2006b8])
4XENATIVESTACK               (0x090000000D074110 [libj9jit29.so+0x92d110])
4XENATIVESTACK               (0x090000000D07664C [libj9jit29.so+0x92f64c])
4XENATIVESTACK               (0x090000000D075918 [libj9jit29.so+0x92e918])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000000ABF2BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x090000000D0750F4 [libj9jit29.so+0x92e0f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x090000000A71B808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
dmitripivkine commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/24 MathLoadTest_bigdecimal_special_17 variation: Mode555 JVM_OPTIONS: -XX:+UseCompressedOops -Xgcpolicy:balanced -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xcheck:gc:vmthreads:all:quiet

Diagnostics (just javacore's): https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/24/system_test_output.tar.gz

This is the same problem as well, just different stack on GC side due Balanced:

3XMTHREADINFO      "Dedicated GC Main" J9VMThread:0x00000000300ED800, omrthread_t:0x00000100233FA3C8, java/lang/Thread:0x00000000764185C8, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0xC, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x1F7011F, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x000000a1)
3XMCPUTIME               CPU usage total: 0.321794000 secs, user: 0.312383000 secs, system: 0.009411000 secs, current category="GC"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               _event_wait+0x344 (0x09000000005843C8 [libpthreads.a+0x173c8])
4XENATIVESTACK               _cond_wait_local+0x4dc (0x0900000000592920 [libpthreads.a+0x25920])
4XENATIVESTACK               _cond_wait+0xc8 (0x0900000000592F0C [libpthreads.a+0x25f0c])
4XENATIVESTACK               pthread_cond_wait+0x19c (0x0900000000593B60 [libpthreads.a+0x26b60])
4XENATIVESTACK               IPRA.$monitor_wait_original+0x9b0 (0x09000000128ABD34 [libj9thr29.so+0x9d34])
4XENATIVESTACK               omrthread_monitor_wait+0x40 (0x09000000128AB104 [libj9thr29.so+0x9104])
4XENATIVESTACK               omrthread_rwmutex_enter_write+0x50 (0x09000000128B7B34 [libj9thr29.so+0x15b34])
4XENATIVESTACK               enterClassUnloadMutex__21MM_ClassLoaderManagerFP18MM_EnvironmentBase+0xd0 (0x0900000013EB7874 [libj9gc29.so+0x12874])
4XENATIVESTACK               unloadDeadClassLoaders__28MM_IncrementalGenerationalGCFP19MM_EnvironmentVLHGC+0x4cc (0x0900000014124DF0 [libj9gc29.so+0x27fdf0])
4XENATIVESTACK               globalMarkPhase__28MM_IncrementalGenerationalGCFP19MM_EnvironmentVLHGCb+0xd54 (0x0900000014123718 [libj9gc29.so+0x27e718])
4XENATIVESTACK               runGlobalMarkPhaseIncrement__28MM_IncrementalGenerationalGCFP19MM_EnvironmentVLHGC+0x950 (0x0900000014121694 [libj9gc29.so+0x27c694])
4XENATIVESTACK               mainThreadGarbageCollect__28MM_IncrementalGenerationalGCFP18MM_EnvironmentBaseP22MM_AllocateDescriptionbT3+0x284 (0x0900000014120768 [libj9gc29.so+0x27b768])
4XENATIVESTACK               mainThreadEntryPoint__15MM_MainGCThreadFv+0x30c (0x0900000014065730 [libj9gc29.so+0x1c0730])
4XENATIVESTACK               main_thread_proc2__15MM_MainGCThreadFP14OMRPortLibraryPv+0x18 (0x09000000140652DC [libj9gc29.so+0x1c02dc])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000001291FBCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               main_thread_proc__15MM_MainGCThreadFPv+0x54 (0x0900000014065A18 [libj9gc29.so+0x1c0a18])
4XENATIVESTACK               thread_wrapper+0x524 (0x09000000128A6808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])

This JIT Compilation thread is a problem:

3XMTHREADINFO      "JIT Compilation Thread-000" J9VMThread:0x0000000030010B00, omrthread_t:0x00000100100536D0, java/lang/Thread:0x00000000764180F0, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x2, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x5F60211, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 2119.650525000 secs, user: 2119.643982000 secs, system: 0.006543000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x09000000132DFA10 [libj9jit29.so+0x889a10])
4XENATIVESTACK               (0x0900000012E11228 [libj9jit29.so+0x3bb228])
4XENATIVESTACK               (0x0900000012E0EF24 [libj9jit29.so+0x3b8f24])
4XENATIVESTACK               (0x0900000012C72868 [libj9jit29.so+0x21c868])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000001291FBCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x0900000012C566B8 [libj9jit29.so+0x2006b8])
4XENATIVESTACK               (0x0900000013383110 [libj9jit29.so+0x92d110])
4XENATIVESTACK               (0x090000001338564C [libj9jit29.so+0x92f64c])
4XENATIVESTACK               (0x0900000013384918 [libj9jit29.so+0x92e918])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000001291FBCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x09000000133840F4 [libj9jit29.so+0x92e0f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x09000000128A6808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
pshipton commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/26 DaaLoadTest_daa1_special_17 variation: Mode555 JVM_OPTIONS: -XX:+UseCompressedOops -Xgcpolicy:balanced -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xcheck:gc:vmthreads:all:quiet

Diagnostics: https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/26/system_test_output.tar.gz

3XMTHREADINFO      "JIT Compilation Thread-005" J9VMThread:0x000000003001C200, omrthread_t:0x0000010021F9C8E8, java/lang/Thread:0x00000000765AFFB8, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x7, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x1F3038D, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 10775.432726000 secs, user: 10775.425218000 secs, system: 0.007508000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x09000000133F0A10 [libj9jit29.so+0x889a10])
4XENATIVESTACK               (0x0900000012F22228 [libj9jit29.so+0x3bb228])
4XENATIVESTACK               (0x0900000012F1FF24 [libj9jit29.so+0x3b8f24])
4XENATIVESTACK               (0x0900000012D83868 [libj9jit29.so+0x21c868])
4XENATIVESTACK               omrsig_protect+0x488 (0x0900000012A30BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x0900000012D676B8 [libj9jit29.so+0x2006b8])
4XENATIVESTACK               (0x0900000013494110 [libj9jit29.so+0x92d110])
4XENATIVESTACK               (0x090000001349664C [libj9jit29.so+0x92f64c])
4XENATIVESTACK               (0x0900000013495918 [libj9jit29.so+0x92e918])
4XENATIVESTACK               omrsig_protect+0x488 (0x0900000012A30BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x09000000134950F4 [libj9jit29.so+0x92e0f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x0900000012548808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
3XMTHREADINFO      "load-13" J9VMThread:0x0000000030467B00, omrthread_t:0x0000010024680F28, java/lang/Thread:0x0000000070F01820, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x33, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x5802E7, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 0.777730000 secs, user: 0.774231000 secs, system: 0.003499000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/util/Formatter$FormatSpecifier.<init>(Formatter.java:2708(Compiled Code))
4XESTACKTRACE                at java/util/Formatter.parse(Formatter.java:2560(Compiled Code))
4XESTACKTRACE                at java/util/Formatter.format(Formatter.java:2501(Compiled Code))
4XESTACKTRACE                at java/util/Formatter.format(Formatter.java:2455(Compiled Code))
4XESTACKTRACE                at java/lang/String.format(String.java:4242(Compiled Code))
4XESTACKTRACE                at net/openj9/test/PDMoveShifts/TestShiftsAndConvert.testShiftRight(TestShiftsAndConvert.java:3317(Compiled Code))
4XESTACKTRACE                at net/openj9/test/PDMoveShifts/TestShiftsAndConvert.testShiftRightSimpleRound(TestShiftsAndConvert.java:3386(Compiled Code))
pshipton commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/28 DaaLoadTest_daa1_special_17

gita-omr commented 3 years ago

I think we saw a few failures with classUnloadMutex being corrupted. One of them was happening with -Xint: https://github.com/eclipse/openj9/issues/11040 so I think it needs to be fixed first.

pshipton commented 3 years ago

@gita-omr note that #11040 only occurred when https://github.com/eclipse/openj9/pull/10930 was merged, however this change was reverted.

gita-omr commented 3 years ago

My impression was that something similar was occurring even without #10930 but less often...

pshipton commented 3 years ago

I'm not aware that we've seen the problem without #10930. Note this is a new problem, not seen before Dec 19th.

pshipton commented 3 years ago

Note this has only been seen in these modes.

VM_OPTIONS: -Xcompressedrefs -Xgcpolicy:gencon -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xmn512k -Xcheck:gc:vmthreads:all:quiet

JVM_OPTIONS: -XX:+UseCompressedOops -Xgcpolicy:balanced -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xcheck:gc:vmthreads:all:quiet

dmitripivkine commented 3 years ago

@gita-omr I am pretty sure this is different problem. We have seen similar problems in the past. It was related with (something like) infinite loop Compilation thread going in due wrong data (sorry again, I never knew low level details on JIT side regarding this). I do have one system core preserved if somebody wants to look at

gita-omr commented 3 years ago

Sure, we’ll take a look. Thanks.

pshipton commented 3 years ago

Perhaps related, MathLoadTest_all_special_17 hung, but there are no diagnostics. Opened https://github.com/AdoptOpenJDK/openjdk-systemtest/issues/389

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_4/34 23:40:11 variation: Mode555 23:40:11 JVM_OPTIONS: -XX:+UseCompressedOops -Xgcpolicy:balanced -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xcheck:gc:vmthreads:all:quiet

23:44:39  MLT 23:44:36.580 - Completed 10.0%. Number of tests started=523 (+16)
23:44:58  MLT 23:44:56.567 - Completed 12.5%. Number of tests started=658 (+135)
23:45:15  STF 23:45:14.023 - Heartbeat: Process MLT is still running
23:45:17  MLT 23:45:16.587 - Completed 15.0%. Number of tests started=787 (+129)
23:50:16  STF 23:50:14.149 - Heartbeat: Process MLT is still running
23:55:26  STF 23:55:14.208 - Heartbeat: Process MLT is still running
00:00:23  STF 00:00:14.296 - Heartbeat: Process MLT is still running
00:05:18  STF 00:05:14.400 - Heartbeat: Process MLT is still running
00:10:29  STF 00:10:14.453 - Heartbeat: Process MLT is still running
00:15:24  STF 00:15:14.498 - Heartbeat: Process MLT is still running
00:20:20  STF 00:20:14.034 - Heartbeat: Process MLT is still running
...

Runs like this until terminated by jenkins.

pshipton commented 3 years ago

Similar but different. https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/35/ LambdaLoadTest_OpenJ9_NonLinux_special_17

3XMTHREADINFO      "JIT Compilation Thread-006" J9VMThread:0x000000003001E600, omrthread_t:0x0000010021F9D210, java/lang/Thread:0x000000007026AF78, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x8, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x68701AD, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000081)
3XMCPUTIME               CPU usage total: 2184.193430000 secs, user: 2184.191159000 secs, system: 0.002271000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x0900000016FCBA10 [libj9jit29.so+0x889a10])
4XENATIVESTACK               (0x0900000016AFD228 [libj9jit29.so+0x3bb228])
4XENATIVESTACK               (0x0900000016AFAF24 [libj9jit29.so+0x3b8f24])
4XENATIVESTACK               (0x090000001695E868 [libj9jit29.so+0x21c868])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000001494EBCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x09000000169426B8 [libj9jit29.so+0x2006b8])
4XENATIVESTACK               (0x090000001706F110 [libj9jit29.so+0x92d110])
4XENATIVESTACK               (0x090000001707164C [libj9jit29.so+0x92f64c])
4XENATIVESTACK               (0x0900000017070918 [libj9jit29.so+0x92e918])
4XENATIVESTACK               omrsig_protect+0x488 (0x090000001494EBCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x09000000170700F4 [libj9jit29.so+0x92e0f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x09000000148D5968 [libj9thr29.so+0x4968])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
3XMTHREADINFO      "load-13" J9VMThread:0x0000000030463B00, omrthread_t:0x00000100249510C8, java/lang/Thread:0x0000000071415990, state:CW, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x33, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x2E90195, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000181)
3XMCPUTIME               CPU usage total: 0.007482000 secs, user: 0.006310000 secs, system: 0.001172000 secs, current category="Application"
3XMTHREADBLOCK     Waiting on: java/lang/J9VMInternals$ClassInitializationLock@0x00000000716EFB88 Owned by: <unowned>
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at org/junit/runners/ParentRunner.validateClassRules(ParentRunner.java:160)
4XESTACKTRACE                at org/junit/runners/ParentRunner.collectInitializationErrors(ParentRunner.java:127)
4XESTACKTRACE                at org/junit/runners/ParentRunner.validate(ParentRunner.java:416)
4XESTACKTRACE                at org/junit/runners/ParentRunner.<init>(ParentRunner.java:84)
4XESTACKTRACE                at org/junit/runners/Suite.<init>(Suite.java:112)
4XESTACKTRACE                at org/junit/runners/Suite.<init>(Suite.java:81)
4XESTACKTRACE                at org/junit/runner/Computer.getSuite(Computer.java:28)
4XESTACKTRACE                at org/junit/runner/Request.classes(Request.java:75)
4XESTACKTRACE                at org/junit/runner/Request.classes(Request.java:91)
4XESTACKTRACE                at net/adoptopenjdk/loadTest/adaptors/JUnitAdaptor.executeTest(JUnitAdaptor.java:129)
4XESTACKTRACE                at net/adoptopenjdk/loadTest/LoadTestRunner$2.run(LoadTestRunner.java:182)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:823(Compiled Code))
pshipton commented 3 years ago

In a 0.24.0 release build. https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Release_testList_0/8 MauveMultiThrdLoad_special_19 variation: Mode614 JVM_OPTIONS: -Xcompressedrefs -Xgcpolicy:gencon -Xjit:counts=- - - - - - 1 1 1 1000 250 250 - - - 10000 100000 10000,gcOnResolve,rtResolve,sampleInterval=2,scorchingSampleThreshold=10000,quickProfile -Xmn512k -Xcheck:gc:vmthreads:all:quiet

3XMTHREADINFO      "JIT Compilation Thread-003" J9VMThread:0x0000000030017900, omrthread_t:0x00000100208C2DF0, java/lang/Thread:0x0000000080037B58, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x5, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x69E00CD, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000000)
3XMCPUTIME               CPU usage total: 4259.764781000 secs, user: 4259.757142000 secs, system: 0.007639000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x09000000132E6110 [libj9jit29.so+0x88a110])
4XENATIVESTACK               (0x0900000012E18228 [libj9jit29.so+0x3bc228])
4XENATIVESTACK               (0x0900000012E15F24 [libj9jit29.so+0x3b9f24])
4XENATIVESTACK               (0x0900000012C78728 [libj9jit29.so+0x21c728])
4XENATIVESTACK               omrsig_protect+0x488 (0x0900000012925BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x0900000012C5C5B8 [libj9jit29.so+0x2005b8])
4XENATIVESTACK               (0x0900000013389610 [libj9jit29.so+0x92d610])
4XENATIVESTACK               (0x090000001338BB4C [libj9jit29.so+0x92fb4c])
4XENATIVESTACK               (0x090000001338AE18 [libj9jit29.so+0x92ee18])
4XENATIVESTACK               omrsig_protect+0x488 (0x0900000012925BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               (0x090000001338A5F4 [libj9jit29.so+0x92e5f4])
4XENATIVESTACK               thread_wrapper+0x524 (0x090000001243D808 [libj9thr29.so+0x4808])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
2LKREGMON          JIT-CompilationQueueMonitor lock (0x00000100105EFD50): <unowned>
3LKNOTIFYQ            Waiting to be notified:
3LKWAITNOTIFY            "main" (J9VMThread:0x000000003000E700)
3XMTHREADINFO      "main" J9VMThread:0x000000003000E700, omrthread_t:0x0000010010052D80, java/lang/Thread:0x0000000080013220, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x1, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0xA0351, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000000)
3XMCPUTIME               CPU usage total: 1.167528000 secs, user: 1.040110000 secs, system: 0.127418000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=81600 (0x13EC0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/Shutdown.halt0(Native Method)
4XESTACKTRACE                at java/lang/Shutdown.halt(Shutdown.java:142)
5XESTACKTRACE                   (entered lock: java/lang/Shutdown$Lock@0x0000000080037D48, entry count: 1)
4XESTACKTRACE                at java/lang/Shutdown.exit(Shutdown.java:217)
5XESTACKTRACE                   (entered lock: java/lang/Shutdown@0x0000000080050A20, entry count: 1)
4XESTACKTRACE                at java/lang/Runtime.exit(Runtime.java:109)
4XESTACKTRACE                at java/lang/System.exit(System.java:345)
4XESTACKTRACE                at net/adoptopenjdk/loadTest/LoadTest.main(LoadTest.java:130)
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               _event_wait+0x344 (0x09000000005843C8 [libpthreads.a+0x173c8])
4XENATIVESTACK               _cond_wait_local+0x4dc (0x0900000000592920 [libpthreads.a+0x25920])
4XENATIVESTACK               _cond_wait+0xc8 (0x0900000000592F0C [libpthreads.a+0x25f0c])
4XENATIVESTACK               pthread_cond_wait+0x19c (0x0900000000593B60 [libpthreads.a+0x26b60])
4XENATIVESTACK               IPRA.$monitor_wait_original+0x9b0 (0x0900000012442D34 [libj9thr29.so+0x9d34])
4XENATIVESTACK               omrthread_monitor_wait+0x40 (0x0900000012442104 [libj9thr29.so+0x9104])
4XENATIVESTACK               (0x0900000012C55794 [libj9jit29.so+0x1f9794])
4XENATIVESTACK               (0x090000001338E43C [libj9jit29.so+0x93243c])
4XENATIVESTACK               (0x090000001338D5A0 [libj9jit29.so+0x9315a0])
4XENATIVESTACK               exitJavaVM+0x330 (0x09000000126424F4 [libj9vm29.so+0x84f4])
4XENATIVESTACK               JVM_Halt_Impl+0xbc (0x09000000145E2CA0 [libjclse29.so+0x92ca0])
4XENATIVESTACK               JVM_Halt+0x44 (0x09000000125E8BE8 [libjvm.so+0xfbe8])
4XENATIVESTACK               JVM_Halt+0xc4 (0x09000000125A73C8 [libjvm.so+0x63c8])
4XENATIVESTACK               Java_java_lang_Shutdown_halt0+0x10 (0x09000000146273F4 [libjava.so+0x143f4])
4XENATIVESTACK               (0x090000001280D63C [libj9vm29.so+0x1d363c])
4XENATIVESTACK               ffi_call+0x98 (0x090000001280C91C [libj9vm29.so+0x1d291c])
4XENATIVESTACK               (0x0900000012811FE4 [libj9vm29.so+0x1d7fe4])
4XENATIVESTACK               (0x09000000126AA27C [libj9vm29.so+0x7027c])
4XENATIVESTACK               runCallInMethod+0x264 (0x09000000126A6D48 [libj9vm29.so+0x6cd48])
4XENATIVESTACK               gpProtectedRunCallInMethod__FPv+0x48 (0x090000001269744C [libj9vm29.so+0x5d44c])
4XENATIVESTACK               signalProtectAndRunGlue+0x28 (0x09000000126910AC [libj9vm29.so+0x570ac])
4XENATIVESTACK               omrsig_protect+0x488 (0x0900000012925BCC [libj9prt29.so+0x59bcc])
4XENATIVESTACK               gpProtectAndRun+0xf8 (0x09000000126911FC [libj9vm29.so+0x571fc])
4XENATIVESTACK               gpCheckCallin+0x114 (0x09000000126973B8 [libj9vm29.so+0x5d3b8])
4XENATIVESTACK               callStaticVoidMethod+0x48 (0x09000000126F130C [libj9vm29.so+0xb730c])
4XENATIVESTACK               (0x0000010000004838 [java+0x4838])
4XENATIVESTACK               _pthread_body+0xf0 (0x0900000000570E14 [libpthreads.a+0x3e14])
IBMJimmyk commented 3 years ago

I was looking at the problem in: https://github.com/eclipse/openj9/issues/11526#issuecomment-763320863

I downloaded the build from: https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Build_JDK8_ppc64_aix_Release/60/OpenJ9-JDK8-ppc64_aix-20210119-160857.tar.gz

Version output:

java -Xshareclasses:none -Xnoaot -version
openjdk version "1.8.0_282-internal"
OpenJDK Runtime Environment (build 1.8.0_282-internal-jenkins_2021_01_19_16_11-b00)
Eclipse OpenJ9 VM (build v0.24.0-release-345e1b09e2a, JRE 1.8.0 AIX ppc64-64-Bit Compressed References 20210119_60 (JIT enabled, AOT disabled)
OpenJ9   - 345e1b09e2a
OMR      - 741e94ea867
JCL      - 4989cf9d146 based on jdk8u282-b08)

However, I ran into an issue where it looks like the symbols have largely been stripped from libj9jit29.so. This is based both off trying to view backtrace information in a debugger and also using the dump -X any -c -T libj9jit29.so command to verify that symbols can not be found. As a result, it would be challenging to figure out what is going on from the information on hand.

Is it normal for the symbol information to be stripped? If possible, it would be helpful to get another reproduction of the failure with a build that does have symbol information. I am currently trying to see if I can do this myself.

pshipton commented 3 years ago

Downloading the debug image and overlaying it may help. https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Build_JDK8_ppc64_aix_Release/60/debug-image.tar.gz

pshipton commented 3 years ago

On AIX the debug image contains the non-stripped libraries.

IBMJimmyk commented 3 years ago

Overlaying the debug image looks like it helped. Thanks!

So the backtrace in the corefile for JIT Compilation Thread-003 looks like this:

(kca) where
0x0900000012f72594 {libj9jit29.so}{addProfileInfo__18TR_JProfilerThreadFP24TR_PersistentProfileInfo} [0x10020be18d0]
0x0900000012f72308 {libj9jit29.so}{findOrCreateProfileInfo__Q2_2J913RecompilationFv} [0x10020be18d0]
0x09000000132e6110 {libj9jit29.so}{beforeOptimization__Q2_2J913RecompilationFv} [0x10020be19a0]
0x0900000012e18228 {libj9jit29.so}{compile__Q2_3OMR11CompilationFv} [0x10020be1a60]
0x0900000012e15f24 {libj9jit29.so}{compile__Q2_2TR28CompilationInfoPerThreadBaseFP10J9VMThreadPQ2_2TR11CompilationP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRCQ2_2TR16SegmentAllocator} [0x10020be3370]
0x0900000012c78728 {libj9jit29.so}{wrappedCompile__Q2_2TR28CompilationInfoPerThreadBaseFP13J9PortLibraryPv} [0x10020be3550]
0x0900000012925bcc {libj9prt29.so}{omrsig_protect} [0x10020be3760]
0x0900000012c5c5b8 {libj9jit29.so}{compile__Q2_2TR28CompilationInfoPerThreadBaseFP10J9VMThreadP21TR_MethodToBeCompiledRQ2_2J917J9SegmentProvider} [0x10020be3b30]
0x0900000013389610 {libj9jit29.so}{processEntry__Q2_2TR24CompilationInfoPerThreadFR21TR_MethodToBeCompiledRQ2_2J917J9SegmentProvider} [0x10020be4f00]
0x090000001338bb4c {libj9jit29.so}{processEntries__Q2_2TR24CompilationInfoPerThreadFv} [0x10020be4fd0]
0x090000001338ae18 {libj9jit29.so}{protectedCompilationThreadProc__FP13J9PortLibraryPQ2_2TR24CompilationInfoPerThread} [0x10020be5130]
0x0900000012925bcc {libj9prt29.so}{omrsig_protect} [0x10020be51f0]
0x090000001338a5f4 {libj9jit29.so}{compilationThreadProc__FPv} [0x10020be55c0]
0x090000001243d808 {libj9thr29.so}{thread_wrapper} [0x10020be5670]
0x0900000000570e14 {libpthreads.a}{} [0x10020be5790]
IBMJimmyk commented 3 years ago

https://github.com/eclipse/openj9/blob/345e1b09e2a1f2cf6323b25edc901cce197f4365/runtime/compiler/runtime/J9Profiler.cpp#L2940-L2945

I think the following two lines in side the do block are getting pulled out of the loop when they shouldn't:

oldHead = _listHead;
oldPtr = (uintptr_t)oldHead;

The method starts with:

0x900000012f72580 {libj9jit29.so}{addProfileInfo...} +0        e8030000 ld        r0, 0(r3)    // oldPtr (r0) = _listHead (0(r3))
0x900000012f72584 {libj9jit29.so}{addProfileInfo...} +1        60000000 nop
0x900000012f72588 {libj9jit29.so}{addProfileInfo...} +2        60000000 nop
0x900000012f7258c {libj9jit29.so}{addProfileInfo...} +3        60210000 ori       r1, r1, 0

The hang is inside this sequence of assembly:

//Initial values. They don't change inside the small loop.
// $r0 = 0x10023ee8be0 (uintptr_t oldPtr)
// $r3 = 0x1001070fa30 (TR_JProfilerThread receiver object)
// $r4 = 0x10023eed740 (param1 : TR_PersistentProfileInfo *newHead)

0x900000012f72590 {libj9jit29.so}{addProfileInfo...} +4   >    f8040000 std       r0, 0(r4) <<< ^+11   //In the core file 0(r4) has 0x10023ee8be0 as expected
0x900000012f72594 {libj9jit29.so}{addProfileInfo...} +5   |>   7ca018a8 ldarx     r5, 0, r3 <<< ^+9    //$r5 = 0x10027233220, read from _listHead
0x900000012f72598 {libj9jit29.so}{addProfileInfo...} +6   ||   7c202840 cmpld     r0, r5               //compare $r5 to oldPtr ($r0), they are not the same. ($r0 = 0x10023ee8be0)
0x900000012f7259c {libj9jit29.so}{addProfileInfo...} +7   ||*  4082000c bne       0x900000012f725a8 C>> +10 //r5 is not equal to r0 so branch

0x900000012f725a8 {libj9jit29.so}{addProfileInfo...} +10  | >  7c250040 cmpld     r5, r0 <<< +7        //compare $r5 to $r0 again. They still are not the same.
0x900000012f725ac {libj9jit29.so}{addProfileInfo...} +11  *    4082ffe4 bne       0x900000012f72590 C>> ^+4 //r5 is not equal to r0 so branch back up

Each loop iteration is supposed to update oldPtr but it doesn't. $r0 never changes so the loop goes on forever.

IBMJimmyk commented 3 years ago

It looks like the problem is here: https://github.com/eclipse/openj9/blob/345e1b09e2a1f2cf6323b25edc901cce197f4365/runtime/compiler/runtime/J9Profiler.hpp#L865

_listHead can be updated by multiple threads so I think it needs to be marked as volatile.

IBMJimmyk commented 3 years ago

I was able to compile a build of my own with the problem where reading from _listHead was pulled out of the loop. I then tried chaning _listHead to be TR_PersistentProfileInfo * volatile and fixed up a few different code locations to deal with casting issues this caused. A build with this change no longer pulls the load out of the loop which fixes the issue.

I am currently going over nearby code to make sure they use volatile correctly. If no other problems arise, I should be able to make a PR soon.

pshipton commented 3 years ago

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_ppc64_aix_Nightly_testList_2/57 DaaLoadTest_daa2_special_5m_16

IBMJimmyk commented 3 years ago

https://github.com/eclipse/openj9/issues/11526#issuecomment-770213629

That failing test didn't create a corefile but it did come with a javacore. It looks to be the same issue. There's a hang in a compilation thread. Based off the offset into libj9jit29.so it is also in a similar location.

IBMJimmyk commented 3 years ago

I've opened a PR that should fix this Issue. _listHead now gets marked as being volatile and will no longer be pulled out of the loop.

vij-singh commented 3 years ago

@IBMJimmyk Any new updates on this one?

IBMJimmyk commented 3 years ago

The PR to fix the issue is open. I'm currently waiting on the changes to be approved and merged in.

pshipton commented 3 years ago

This was set as a blocker due to the frequency of failure, however the failure hasn't been seen in some time so I'll remove the blocker status.