eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

jdk20 OpenJDK java/util/concurrent/locks/Lock/OOMEInAQS OOM crash vmState=0x00040000 #16659

Closed pshipton closed 8 months ago

pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/1 jdk_util.0 java/util/concurrent/locks/Lock/OOMEInAQS.java

11:34:55  ACTION: main -- Error. Program `/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/openjdkbinary/j2sdk-image/bin/java' timed out (timeout set to 960000ms, elapsed time including timeout handling was 1702367ms).
11:34:55  REASON: User specified action: run main/othervm -XX:-UseGCOverheadLimit -Xmx24M -XX:-UseTLAB OOMEInAQS 
11:34:55  TIME:   1702.467 seconds
11:34:55  messages:
11:34:55  command: main -XX:-UseGCOverheadLimit -Xmx24M -XX:-UseTLAB OOMEInAQS
11:34:55  reason: User specified action: run main/othervm -XX:-UseGCOverheadLimit -Xmx24M -XX:-UseTLAB OOMEInAQS 
11:34:55  started: Thu Feb 02 11:06:28 EST 2023
11:34:55  Mode: othervm [/othervm specified]
11:34:55  Timeout information:
11:34:55  Running jcmd on process 18895
11:34:55  Timeout refired 300 times
11:34:55  Error getting data from 18895: Exception connecting to 18895
11:34:55  Usage : jcmd <vmid> <arguments>
11:34:55  
11:34:55     -J : supply arguments to the Java VM running jcmd
11:34:55     -l : list JVM processes on the local machine
11:34:55     -h : print this help message
11:34:55  
11:34:55     <vmid> : Attach API VM ID as shown in jps or other Attach API-based tools
11:34:55  
11:34:55     arguments:
11:34:55        help : print the list of diagnostic commands
11:34:55        help <command> : print help for the specific command
11:34:55        <command> [command arguments] : command from the list returned by "help"
11:34:55  
11:34:55  list JVM processes on the local machine. Default behavior when no options are specified.
11:34:55  
11:34:55  NOTE: this utility might significantly affect the performance of the target JVM.
11:34:55      The available diagnostic commands are determined by
11:34:55      the target VM and may vary between VMs.
11:34:55  Timeout handler interrupted: 
11:34:55  java.lang.InterruptedException
11:34:55    at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1691)
11:34:55    at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:447)
11:34:55    at jtreg.openj9.CoreDumpTimeoutHandler.runJcmdCommand(CoreDumpTimeoutHandler.java:117)
11:34:55    at jtreg.openj9.CoreDumpTimeoutHandler.runJcmd(CoreDumpTimeoutHandler.java:137)
11:34:55    at jtreg.openj9.CoreDumpTimeoutHandler.runActions(CoreDumpTimeoutHandler.java:55)
11:34:55    at com.sun.javatest.regtest.TimeoutHandler.handleTimeout(TimeoutHandler.java:132)
11:34:55    at com.sun.javatest.regtest.exec.ProcessCommand$2.run(ProcessCommand.java:317)
11:34:55  --- Timeout information end.
11:34:55  finished: Thu Feb 02 11:34:50 EST 2023
11:34:55  elapsed time (seconds): 1702.467
11:34:55  configuration:
11:34:55  STDOUT:
11:34:55  fillHeap time: 5817 millis, whole test time: 79838 millis
11:34:55  Timeout refired 960 times
11:34:55  STDERR:
11:34:55  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2023/02/02 11:06:28 - please wait.
11:34:55  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110628.18895.0001.txt' in response to an event
11:34:55  JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110628.18895.0001.txt
11:34:55  JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110628.18895.0002.trc' in response to an event
11:34:55  JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110628.18895.0002.trc
11:34:55  JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
11:34:55  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2023/02/02 11:06:28 - please wait.
11:34:55  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110628.18895.0003.txt' in response to an event
11:34:55  JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110628.18895.0003.txt
11:34:55  JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110628.18895.0004.trc' in response to an event
11:34:55  JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110628.18895.0004.trc
11:34:55  JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
11:34:55  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2023/02/02 11:06:29 - please wait.
11:34:55  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110629.18895.0005.txt' in response to an event
11:34:55  JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110629.18895.0005.txt
11:34:55  JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110629.18895.0006.trc' in response to an event
11:34:55  JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110629.18895.0006.trc
11:34:55  JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
11:34:55  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2023/02/02 11:06:29 - please wait.
11:34:55  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110629.18895.0007.txt' in response to an event
11:34:55  JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/javacore.20230202.110629.18895.0007.txt
11:34:55  JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110629.18895.0008.trc' in response to an event
11:34:55  JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk20_j9_sanity.openjdk_s390x_linux_OpenJDK20/aqa-tests/TKG/output_16753524039056/jdk_util_0/work/scratch/0/Snap.20230202.110629.18895.0008.trc
11:34:55  JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
11:34:55  Exception in thread "Attach API wait loop" java/lang/OutOfMemoryError: Java heap space
11:34:55    at java/lang/Throwable.printStackTrace (java.base@20-internal/Throwable.java:554)
11:34:55    at java/lang/Throwable.printStackTraceHelper (java.base@20-internal/Throwable.java:338)
11:34:55    at java/lang/Throwable.printStackTrace (java.base@20-internal/Throwable.java:307)
11:34:55    at java/lang/Throwable.printStackTrace (java.base@20-internal/Throwable.java:250)
11:34:55  STATUS:Passed.
11:34:55  Unhandled exception
11:34:55  Type=Segmentation error vmState=0x00040000
11:34:55  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=000000a9 Signal_Code=00000001
11:34:55  Handler1=000003FF9FD444B8 Handler2=000003FF9FC31CF0 InaccessibleAddress=0000000000000000
11:34:55  gpr0=000003FF00000002 gpr1=000003FFA468A1E0 gpr2=00000000000004A0 gpr3=0000000000000000
11:34:55  gpr4=000003FF9FB8C610 gpr5=0000000000000001 gpr6=000000000C957BD0 gpr7=000003FF97B56328
11:34:55  gpr8=00000000000004A0 gpr9=000003FF9FB98360 gpr10=000003FFA026EFA0 gpr11=0000000000000000
11:34:55  gpr12=000003FF24000A28 gpr13=00000000000004A0 gpr14=000003FF9FB8C61A gpr15=000003FF9F9FDC70
11:34:55  psw=000003FFA468A1EA mask=0705200180000000 fpc=0008fe00 bea=000003FF9FB845AC
11:34:55  fpr0 000000000000c010 (f: 49168.000000, d: 2.429222e-319)
11:34:55  fpr1 48dbdc2000000000 (f: 0.000000, d: 9.707746e+42)
11:34:55  fpr2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:34:55  fpr3 3674da7000100000 (f: 1048576.000000, d: 2.282942e-46)
11:34:55  fpr4 000003ff4547b7b0 (f: 1162327936.000000, d: 2.171376e-311)
11:34:55  fpr5 3e92492500000000 (f: 0.000000, d: 2.724785e-07)
11:34:55  fpr6 0000000000100000 (f: 1048576.000000, d: 5.180654e-318)
11:34:55  fpr7 3e3a332500000000 (f: 0.000000, d: 6.100112e-09)
11:34:55  fpr8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:34:55  fpr9 000000000053f3b8 (f: 5501880.000000, d: 2.718290e-317)
11:34:55  fpr10 000003ffa001ee60 (f: 2684481024.000000, d: 2.172128e-311)
11:34:55  fpr11 0000000000c92438 (f: 13182008.000000, d: 6.512777e-317)
11:34:55  fpr12 0005f3b9bf592484 (f: 3210290432.000000, d: 8.277349e-309)
11:34:55  fpr13 000003ff14006fa8 (f: 335572896.000000, d: 2.170967e-311)
11:34:55  fpr14 000000000053f3c0 (f: 5501888.000000, d: 2.718294e-317)
11:34:55  fpr15 000003ff1404dc88 (f: 335862912.000000, d: 2.170968e-311)
11:34:55  Module=/lib64/libpthread.so.0
11:34:55  Module_base_address=000003FFA4680000 Symbol=pthread_mutex_lock
11:34:55  Symbol_address=000003FFA468A1E0
11:34:55  Target=2_90_20230202_1 (Linux 3.10.0-1160.76.1.el7.s390x)
11:34:55  CPU=s390x (4 logical CPUs) (0x1ec1b1000 RAM)
11:34:55  ----------- Stack Backtrace -----------
11:34:55  pthread_mutex_lock+0xa (0x000003FFA468A1EA [libpthread.so.0+0xa1ea])
11:34:55  omrthread_interrupt+0x62 (0x000003FF9FB8C61A [libj9thr29.so+0xc61a])
11:34:55  Java_java_lang_Thread_interruptImpl+0x84 (0x000003FF97B563AC [libjclse29.so+0x563ac])
11:34:55  ffi_call_SYSV+0x54 (0x000003FF9FF17014 [libj9vm29.so+0x217014])
11:34:55  ffi_call+0xa2 (0x000003FF9FF16A5A [libj9vm29.so+0x216a5a])
11:34:55  _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x154c6 (0x000003FF9FDBDD36 [libj9vm29.so+0xbdd36])
11:34:55  bytecodeLoopCompressed+0xce (0x000003FF9FDA885E [libj9vm29.so+0xa885e])
11:34:55  c_cInterpreter+0x64 (0x000003FF9FE91694 [libj9vm29.so+0x191694])
11:34:55  ---------------------------------------
pshipton commented 1 year ago

Run a few more at the same time to increase the chances, this one is running on 3 additional machines. https://openj9-jenkins.osuosl.org/job/Grinder/2298/

pshipton commented 1 year ago

Also run a couple on the rhel8 machines, which aren't used much. We exclude them from testing jdk11+ because some zlib related testing still fails due to hardware de/compression. https://openj9-jenkins.osuosl.org/job/Grinder/2300/

pshipton commented 1 year ago

Recreation in https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/269

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_1/269/openjdk_test_output.tar.gz

pshipton commented 1 year ago

Since we got a recreation I'm stopping the other grinders so the machines can be used in the nightly builds.

tajila commented 1 year ago

Based on the latest results the WaitLoop thread has started.

19:07:49.174781281 *0x1345E00 j9trc.5              Event       Thread started VMthread = 0x1345e00, name = Attach API wait loop, nativeID = 0x3ff2c013d10
19:07:49.174781677  0x1345E00 j9vm.16              Entry      >runJavaThread
19:07:49.174782187  0x1345E00 mt.0                 Entry      >openj9/internal/tools/attach/target/WaitLoop.run()V bytecode method, this = 0xf11aab8
19:07:49.174782291  0x1345E00 mt.18                Event       this: openj9/internal/tools/attach/target/WaitLoop@0xf11aab8 method arguments: ()
19:07:49.174783706  0x1345E00 j9vm.97              Event       Allocated new profiling buffer at 0x3ff24000d10
19:07:49.174784746  0x1345E00 j9vm.682             Entry      >WalkStackFrames - walkThread=0x1345e00 flags=0x2c0000 sp=0x1345bd8 a0=0x1345c00 pc=0x3 literals=0x0 els=0x3ffb5044a10 j2i=0x0
19:07:49.174785101  0x1345E00 j9vm.683             Exit       <WalkStackFrames - walkThread=0x1345e00, rc=0
19:07:49.174792389  0x1345E00 mt.0                 Entry      >openj9/internal/tools/attach/target/WaitLoop.waitForNotification(Z)Lopenj9/internal/tools/attach/target/Attachment; bytecode method, this = 0xf11aab8
19:07:49.174792680  0x1345E00 mt.18                Event       this: openj9/internal/tools/attach/target/WaitLoop@0xf11aab8 method arguments: (true)
19:07:49.174803786  0x1345E00 j9jcl.487            Entry      >Java_com_ibm_tools_attach_javaSE_IPC_waitSemaphoreImpl waiting on semaphore (0x3ff40004400)
19:07:49.174804203  0x1345E00 j9prt.1337           Entry      >j9shsem_wait handle=0x3ff40004400, semset = 0 flag = 0, semid=15597595
19:07:50.541598359 *0x1345E00 j9vm.246             Entry      >dispatchAsyncEvents asyncEventFlags=0x1
19:07:50.541598453  0x1345E00 j9vm.247             Event       call event handler: handlerKey=0 eventHandler=0x3ffb4d1ff48 userData=0x3ffb00284e0
19:07:50.541599960  0x1345E00 j9vm.248             Exit       <dispatchAsyncEvents
19:07:50.541629411  0x1345E00 j9mm.69              Event       Concurrent kickoff, tracesizetarget=5796422 kickoffthreshold=800661 remainingfree=519608
19:07:50.541747301  0x1345E00 j9jcl.285            Event       lockFile exit status=8
19:07:50.541785347  0x1345E00 j9jcl.488            Entry      >Java_com_ibm_tools_attach_javaSE_IPC_closeSemaphoreImpl closing semaphore (0x3ff40004470)
19:07:50.541786732  0x1345E00 j9prt.1341           Entry      >j9shsem_close entered handle=0x3ff40004470, semid=15597595
19:07:50.541789858  0x1345E00 j9prt.70             Exit       <j9shsem_close exit
19:07:50.541789907  0x1345E00 j9jcl.261            Event       Java_com_ibm_tools_attach_javaSE_IPC_closeSemaphoreImpl
19:07:50.541791452  0x1345E00 j9jcl.280            Entry      >openSemaphore entry
19:07:50.541793976  0x1345E00 j9prt.61             Entry      >j9shsem_open entered, semname = _notifier, setSize = 1, permission = 438
19:07:50.541797102  0x1345E00 j9prt.138            Entry      >j9shsem createbaseFile entering with filename /tmp/.com_ibm_tools_attach/_notifier
19:07:50.258037193  0x1345E00 mt.9                 Exit       <openj9/internal/tools/attach/target/WaitLoop.checkReplyAndCreateAttachment()Lopenj9/internal/tools/attach/target/Attachment; bytecode static method
19:07:50.258039365  0x1345E00 mt.28                Event       return value: null
19:07:50.258041612  0x1345E00 mt.6                 Exit       <openj9/internal/tools/attach/target/WaitLoop.waitForNotification(Z)Lopenj9/internal/tools/attach/target/Attachment; bytecode method
19:07:50.258041763  0x1345E00 mt.28                Event       return value: null
19:07:50.258041846  0x1345E00 mt.6                 Exit       <openj9/internal/tools/attach/target/WaitLoop.waitForNotification(Z)Lopenj9/internal/tools/attach/target/Attachment; bytecode method
19:07:50.258041948  0x1345E00 mt.28                Event       return value: null
19:07:50.258066002  0x1345E00 mt.0                 Entry      >openj9/internal/tools/attach/target/WaitLoop.waitForNotification(Z)Lopenj9/internal/tools/attach/target/Attachment; bytecode method, this = 0xe22f288
19:07:50.258066459  0x1345E00 mt.18                Event       this: openj9/internal/tools/attach/target/WaitLoop@0xe22f288 method arguments: (true)
19:07:50.258743185  0x1345E00 j9jcl.487            Entry      >Java_com_ibm_tools_attach_javaSE_IPC_waitSemaphoreImpl waiting on semaphore (0x3ff40004470)
19:07:50.258744501  0x1345E00 j9prt.1337           Entry      >j9shsem_wait handle=0x3ff40004470, semset = 0 flag = 0, semid=15597595

And it doesnt seem the thread shutdown properly before the J9VMthread was freed

pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk20_j9_sanity.openjdk_x86-64_linux_Nightly/44/

tajila commented 1 year ago

New grinder with more VM diagnostics, https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/272/

JasonFengJ9 commented 1 year ago

A similar Segmentation error vmState=0x00040000 JDK8 x86-64_linux(ub-epyc7302p-1s16c-01)

openjdk version "1.8.0_372"
IBM Semeru Runtime Open Edition (build 1.8.0_372-b07)
Eclipse OpenJ9 VM (build master-d4443d666, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20230429_646 (JIT enabled, AOT enabled)
OpenJ9   - d4443d666
OMR      - 7ddb743ea
JCL      - 9837b5b6b based on jdk8u372-b07)

[2023-04-29T04:22:29.303Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode501
[2023-04-29T04:22:29.303Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -Xjit -Xgcpolicy:balanced -Xnocompressedrefs 

[2023-04-29T04:51:10.462Z] TEST: java/lang/ref/SoftReference/Pin.java

[2023-04-29T04:51:10.462Z] STDERR:
[2023-04-29T04:51:10.462Z] Filling array with 128 SoftReferences to blocks of 32768 bytes.
[2023-04-29T04:51:10.462Z] Allowing SoftReferences to be enqueued.
[2023-04-29T04:51:10.462Z] Invoking get() on SoftReferences.
[2023-04-29T04:51:10.462Z] Forcing desperate garbage collection...
[2023-04-29T04:51:10.462Z] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2023/04/29 00:22:46 - please wait.

[2023-04-29T04:51:10.463Z] Got OutOfMemoryError, as expected.
[2023-04-29T04:51:10.463Z] Examining contents of array: 128 empty, 0 full.
[2023-04-29T04:51:10.463Z] STATUS:Passed.
[2023-04-29T04:51:10.463Z] Unhandled exception
[2023-04-29T04:51:10.463Z] Type=Segmentation error vmState=0x00040000
[2023-04-29T04:51:10.463Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
[2023-04-29T04:51:10.463Z] Handler1=00007F6B7B8E2D60 Handler2=00007F6B7B63DB00 InaccessibleAddress=00000000000004B0
[2023-04-29T04:51:10.463Z] RDI=00000000000004A0 RSI=0000000000000001 RAX=0000000000000000 RBX=0000000000000000
[2023-04-29T04:51:10.463Z] RCX=0000000000000000 RDX=0000000000000000 R8=00007F6AA000E708 R9=00007F6B7B9F05AD
[2023-04-29T04:51:10.463Z] R10=0000000000000000 R11=00007F6B78532BE0 R12=00007F6AA000E708 R13=00000000000004A0
[2023-04-29T04:51:10.463Z] R14=00007F6B2CBFF640 R15=00007F6B7BA76374
[2023-04-29T04:51:10.463Z] RIP=00007F6B80BF9FC4 GS=0000 FS=0000 RSP=00007F6B2CBFF3E8
[2023-04-29T04:51:10.463Z] EFlags=0000000000010202 CS=0033 RBP=00007F6B7B611500 ERR=0000000000000004
[2023-04-29T04:51:10.463Z] TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=00000000000004B0
[2023-04-29T04:51:10.463Z] xmm0 00007f6b7c4dcb00 (f: 2085473024.000000, d: 6.921841e-310)
[2023-04-29T04:51:10.463Z] xmm1 00007f6b7c05b420 (f: 2080748544.000000, d: 6.921841e-310)
[2023-04-29T04:51:10.463Z] xmm2 00007f6b2cbff910 (f: 750778624.000000, d: 6.921775e-310)
[2023-04-29T04:51:10.463Z] xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm4 00007f6b7c3c6160 (f: 2084331904.000000, d: 6.921841e-310)
[2023-04-29T04:51:10.463Z] xmm5 00007f6a98000080 (f: 2550136832.000000, d: 6.921652e-310)
[2023-04-29T04:51:10.463Z] xmm6 00007f6b7bca1f80 (f: 2076843904.000000, d: 6.921841e-310)
[2023-04-29T04:51:10.463Z] xmm7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm8 6c636a62696c2f74 (f: 1768697728.000000, d: 1.307245e+214)
[2023-04-29T04:51:10.463Z] xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-04-29T04:51:10.463Z] Module=/lib/x86_64-linux-gnu/libpthread.so.0
[2023-04-29T04:51:10.463Z] Module_base_address=00007F6B80BEF000 Symbol=pthread_mutex_lock
[2023-04-29T04:51:10.463Z] Symbol_address=00007F6B80BF9FC0
[2023-04-29T04:51:10.463Z] Target=2_90_20230429_646 (Linux 5.4.0-146-generic)
[2023-04-29T04:51:10.463Z] CPU=amd64 (32 logical CPUs) (0x1f6f3ef000 RAM)
[2023-04-29T04:51:10.463Z] ----------- Stack Backtrace -----------
[2023-04-29T04:51:10.463Z] pthread_mutex_lock+0x4 (0x00007F6B80BF9FC4 [libpthread.so.0+0xafc4])
[2023-04-29T04:51:10.463Z] omrthread_interrupt+0x51 (0x00007F6B7B4071E1 [libj9thr29.so+0xf1e1])
[2023-04-29T04:51:10.463Z] Java_java_lang_Thread_interruptImpl+0x7d (0x00007F6B78532C5D [libjclse29.so+0x4ac5d])
[2023-04-29T04:51:10.463Z] ffi_call_unix64+0x52 (0x00007F6B7BA4D1BA [libj9vm29.so+0x1a91ba])
[2023-04-29T04:51:10.463Z] ffi_call_int+0x1fc (0x00007F6B7BA4B49C [libj9vm29.so+0x1a749c])
[2023-04-29T04:51:10.463Z] _ZN26VM_BytecodeInterpreterFull3runEP10J9VMThread+0xf547 (0x00007F6B7B95A047 [libj9vm29.so+0xb6047])
[2023-04-29T04:51:10.463Z] bytecodeLoopFull+0xc9 (0x00007F6B7B94AAE9 [libj9vm29.so+0xa6ae9])
[2023-04-29T04:51:10.463Z]  (0x00007F6B7B9DAFB2 [libj9vm29.so+0x136fb2])
[2023-04-29T04:51:10.463Z] ---------------------------------------
[2023-04-29T04:51:10.463Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2023/04/29 00:22:46 - please wait.

[2023-04-29T04:51:10.463Z] TEST RESULT: Error. Program `/home/jenkins/workspace/Test_openjdk8_j9_sanity.openjdk_x86-64_linux_testList_0/openjdkbinary/j2sdk-image/bin/java' timed out (timeout set to 960000ms, elapsed time including timeout handling was 1704892ms).
[2023-04-29T04:51:10.463Z] --------------------------------------------------
[2023-04-29T04:51:10.463Z] Test results: passed: 435; error: 1
[2023-04-29T04:51:12.635Z] Report written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.openjdk_x86-64_linux_testList_0/aqa-tests/TKG/output_16827421063967/jdk_lang_j9_0/report/html/report.html
[2023-04-29T04:51:12.635Z] Results written to /home/jenkins/workspace/Test_openjdk8_j9_sanity.openjdk_x86-64_linux_testList_0/aqa-tests/TKG/output_16827421063967/jdk_lang_j9_0/work
[2023-04-29T04:51:12.635Z] Error: Some tests failed or other problems occurred.
[2023-04-29T04:51:12.635Z] -----------------------------------
[2023-04-29T04:51:12.635Z] jdk_lang_j9_0_FAILED

50x grinder - all passed

tajila commented 1 year ago

rebuild of previous grinder https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/274/

tajila commented 1 year ago

rebuild of previous grinder, https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/278/

tajila commented 1 year ago

rebuild https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/280/

pshipton commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk20_j9_sanity.openjdk_ppc64_aix_OMR/1

tajila commented 1 year ago

Seems like it doesnt fail with the latest diagnostic changes.

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/283/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2325/

pshipton commented 1 year ago

Doesn't seem like a 20.0.1 blocker, moving forward.

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2326/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2327/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2330/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2334

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2350/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2352/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2383/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2386/

tajila commented 1 year ago

https://openj9-jenkins.osuosl.org/job/Grinder/2412/

pshipton commented 1 year ago

This continues to fail but I've stopped reporting the failures.

tajila commented 1 year ago

Attempting to reproduce it again, https://openj9-jenkins.osuosl.org/job/Grinder/2966/

pshipton commented 1 year ago

I still see it fairly often in the builds, I can report the next few if needed.

tajila commented 1 year ago

that would be great thanks

JasonFengJ9 commented 1 year ago

A recent failure JDK21 s390x_linux(sles15s390x-svl-rt7-1)

java version "21-beta" 2023-09-19
IBM Semeru Runtime Certified Edition 21+35-202310100139 (build 21-beta+35-202310100139)
Eclipse OpenJ9 VM 21+35-202310100139 (build master-4e1d1c690, JRE 21 Linux s390x-64-Bit Compressed References 20231010_8 (JIT enabled, AOT enabled)
OpenJ9   - 4e1d1c690
OMR      - 83cb59850
JCL      - 14ae7a4f3 based on jdk-21+35)

[2023-10-10T03:40:36.634Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache Mode650
[2023-10-10T03:40:36.634Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-UseCompressedOops 

[2023-10-10T04:12:01.275Z] TEST: java/util/concurrent/locks/Lock/OOMEInAQS.java

[2023-10-10T04:12:01.276Z] Unhandled exception
[2023-10-10T04:12:01.276Z] Type=Segmentation error vmState=0x00040000
[2023-10-10T04:12:01.276Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
[2023-10-10T04:12:01.276Z] Handler1=000003FFA6B49948 Handler2=000003FFA6A319D8 InaccessibleAddress=0000000000000000
[2023-10-10T04:12:01.276Z] gpr0=000003FFA687DC80 gpr1=000003FFA760C2F0 gpr2=00000000000004A0 gpr3=000003FF00000000
[2023-10-10T04:12:01.276Z] gpr4=000003FFA698CA60 gpr5=00000000000000A4 gpr6=000003FF9D926350 gpr7=000003FFA46D8548
[2023-10-10T04:12:01.276Z] gpr8=00000000000004A0 gpr9=000003FFA6997FA0 gpr10=000003FFA03F7FB0 gpr11=0000000000000000
[2023-10-10T04:12:01.276Z] gpr12=000003FF1C000CC8 gpr13=000003FFA6D9DA34 gpr14=000003FFA698CA6A gpr15=000003FFA687DA68
[2023-10-10T04:12:01.276Z] psw=000003FFA760C2F0 mask=0705200180000000 fpc=00080000 bea=000003FFA6984A64
[2023-10-10T04:12:01.276Z] fpr0 000000000000c010 (f: 49168.000000, d: 2.429222e-319)
[2023-10-10T04:12:01.276Z] fpr1 000003ff1c000ae8 (f: 469764832.000000, d: 2.171034e-311)
[2023-10-10T04:12:01.276Z] fpr2 00000000003c0018 (f: 3932184.000000, d: 1.942757e-317)
[2023-10-10T04:12:01.276Z] fpr3 000003ffa786ed18 (f: 2810637568.000000, d: 2.172190e-311)
[2023-10-10T04:12:01.276Z] fpr4 0000000100000000 (f: 0.000000, d: 2.121996e-314)
[2023-10-10T04:12:01.276Z] fpr5 3f336e21dbeabbfe (f: 3689593856.000000, d: 2.964814e-04)
[2023-10-10T04:12:01.276Z] fpr6 000003ff494597aa (f: 1229297536.000000, d: 2.171409e-311)
[2023-10-10T04:12:01.276Z] fpr7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-10T04:12:01.276Z] fpr8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-10T04:12:01.276Z] fpr9 00000000017f9228 (f: 25137704.000000, d: 1.241968e-316)
[2023-10-10T04:12:01.276Z] fpr10 000003ffa683f000 (f: 2793664512.000000, d: 2.172182e-311)
[2023-10-10T04:12:01.276Z] fpr11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-10T04:12:01.276Z] fpr12 00060754830a4230 (f: 2198487552.000000, d: 8.383846e-309)
[2023-10-10T04:12:01.276Z] fpr13 000003ff2003a948 (f: 537110848.000000, d: 2.171067e-311)
[2023-10-10T04:12:01.276Z] fpr14 000003fffc7fb314 (f: 4236227328.000000, d: 2.172895e-311)
[2023-10-10T04:12:01.276Z] fpr15 000003ff2007c868 (f: 537380992.000000, d: 2.171067e-311)
[2023-10-10T04:12:01.276Z] Module=/lib64/libpthread.so.0
[2023-10-10T04:12:01.276Z] Module_base_address=000003FFA7600000 Symbol=pthread_mutex_lock
[2023-10-10T04:12:01.276Z] Symbol_address=000003FFA760C2F0
[2023-10-10T04:12:01.276Z] Target=2_90_20231010_8 (Linux 5.14.21-150500.55.19-default)
[2023-10-10T04:12:01.276Z] CPU=s390x (4 logical CPUs) (0x1e7dc9000 RAM)
[2023-10-10T04:12:01.276Z] ----------- Stack Backtrace -----------
[2023-10-10T04:12:01.276Z] pthread_mutex_lock+0x0 (0x000003FFA760C2F0 [libpthread.so.0+0xc2f0])
[2023-10-10T04:12:01.276Z] omrthread_interrupt+0x62 (0x000003FFA698CA6A [libj9thr29.so+0xca6a])
[2023-10-10T04:12:01.276Z] Java_java_lang_Thread_interruptImpl+0xb6 (0x000003FFA46D85FE [libjclse29.so+0x585fe])
[2023-10-10T04:12:01.276Z] ffi_call_SYSV+0x54 (0x000003FFA6D9DA34 [libj9vm29.so+0x29da34])
[2023-10-10T04:12:01.276Z] ffi_call+0xa2 (0x000003FFA6D9D48A [libj9vm29.so+0x29d48a])
[2023-10-10T04:12:01.276Z] _ZN26VM_BytecodeInterpreterFull3runEP10J9VMThread+0x13864 (0x000003FFA6BF950C [libj9vm29.so+0xf950c])
[2023-10-10T04:12:01.276Z] bytecodeLoopFull+0xee (0x000003FFA6BE5C96 [libj9vm29.so+0xe5c96])
[2023-10-10T04:12:01.276Z] c_cInterpreter+0x64 (0x000003FFA6D16C24 [libj9vm29.so+0x216c24])
[2023-10-10T04:12:01.276Z] ---------------------------------------
[2023-10-10T04:12:01.276Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2023/10/09 20:45:47 - please wait.

[2023-10-10T04:12:01.276Z] Test results: passed: 947; error: 1
[2023-10-10T04:12:07.665Z] Report written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_testList_1/aqa-tests/TKG/output_16969066497615/jdk_util_1/report/html/report.html
[2023-10-10T04:12:07.665Z] Results written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_testList_1/aqa-tests/TKG/output_16969066497615/jdk_util_1/work
[2023-10-10T04:12:07.665Z] Error: Some tests failed or other problems occurred.
[2023-10-10T04:12:07.665Z] -----------------------------------
[2023-10-10T04:12:07.665Z] jdk_util_1_FAILED
tajila commented 1 year ago

re-run https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/35413/

pshipton commented 1 year ago

One here https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_x86-64_windows_Nightly_testList_2/79/ which is an OOM.

tajila commented 1 year ago

The JDK21 changes have been merged. I can't confirm that it fixes the issues given that it is quite intermittent.

pshipton commented 1 year ago

I'll close it and then we can reopen if it's seen again.

pshipton commented 10 months ago

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/133 jdk_util_0 java/util/concurrent/locks/Lock/OOMEInAQS.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/133/openjdk_test_output.tar.gz

21:04:13  ACTION: main -- Error. Program `/home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/jdkbinary/j2sdk-image/bin/java' timed out: timeout handler did not complete within its own timeout. (timeout set to 960000ms, elapsed time including timeout handling was 1447340ms).
21:04:13  REASON: User specified action: run main/othervm -XX:-UseGCOverheadLimit -Xmx48M -XX:-UseTLAB OOMEInAQS 
21:04:13  TIME:   1447.382 seconds
21:04:13  messages:
21:04:13  command: main -XX:-UseGCOverheadLimit -Xmx48M -XX:-UseTLAB OOMEInAQS
21:04:13  reason: User specified action: run main/othervm -XX:-UseGCOverheadLimit -Xmx48M -XX:-UseTLAB OOMEInAQS 
21:04:13  started: Fri Jan 05 01:40:00 UTC 2024
21:04:13  Mode: othervm [/othervm specified]
21:04:13  Timeout information:
21:04:13  Running jcmd on process 1588770
21:04:13  Timeout signalled after 300 seconds
21:04:13  Error getting data from 1588770: Exception connecting to 1588770
21:04:13  Usage : jcmd <vmid> <arguments>
21:04:13  
21:04:13     -J : supply arguments to the Java VM running jcmd
21:04:13     -l : list JVM processes on the local machine
21:04:13     -h : print this help message
21:04:13  
21:04:13     <vmid> : Attach API VM ID as shown in jps or other Attach API-based tools
21:04:13  
21:04:13     arguments:
21:04:13        help : print the list of diagnostic commands
21:04:13        help <command> : print help for the specific command
21:04:13        <command> [command arguments] : command from the list returned by "help"
21:04:13  
21:04:13  list JVM processes on the local machine. Default behavior when no options are specified.
21:04:13  
21:04:13  NOTE: this utility might significantly affect the performance of the target JVM.
21:04:13      The available diagnostic commands are determined by
21:04:13      the target VM and may vary between VMs.
21:04:13  Error getting data from 1588770: Exception connecting to 1588770
21:04:13  Usage : jcmd <vmid> <arguments>
21:04:13  
21:04:13     -J : supply arguments to the Java VM running jcmd
21:04:13     -l : list JVM processes on the local machine
21:04:13     -h : print this help message
21:04:13  
21:04:13     <vmid> : Attach API VM ID as shown in jps or other Attach API-based tools
21:04:13  
21:04:13     arguments:
21:04:13        help : print the list of diagnostic commands
21:04:13        help <command> : print help for the specific command
21:04:13        <command> [command arguments] : command from the list returned by "help"
21:04:13  
21:04:13  list JVM processes on the local machine. Default behavior when no options are specified.
21:04:13  
21:04:13  NOTE: this utility might significantly affect the performance of the target JVM.
21:04:13      The available diagnostic commands are determined by
21:04:13      the target VM and may vary between VMs.
21:04:13  Timeout handler interrupted: 
21:04:13  java.lang.InterruptedException
21:04:13    at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1691)
21:04:13    at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:425)
21:04:13    at jtreg.openj9.CoreDumpTimeoutHandler.runJcmdCommand(CoreDumpTimeoutHandler.java:117)
21:04:13    at jtreg.openj9.CoreDumpTimeoutHandler.runJcmd(CoreDumpTimeoutHandler.java:138)
21:04:13    at jtreg.openj9.CoreDumpTimeoutHandler.runActions(CoreDumpTimeoutHandler.java:55)
21:04:13    at com.sun.javatest.regtest.TimeoutHandler.handleTimeout(TimeoutHandler.java:132)
21:04:13    at com.sun.javatest.regtest.exec.ProcessCommand$2.run(ProcessCommand.java:317)
21:04:13  --- Timeout information end.
21:04:13  finished: Fri Jan 05 02:04:07 UTC 2024
21:04:13  elapsed time (seconds): 1447.382
21:04:13  configuration:
21:04:13  STDOUT:
21:04:13  Timeout signalled after 960 seconds
21:04:13  fillHeap time: 72683 millis, whole test time: 1033768 millis
21:04:13  STDERR:
21:04:13  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2024/01/05 01:40:03 - please wait.
21:04:13  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/aqa-tests/TKG/output_17044133151937/jdk_util_0/work/scratch/2/javacore.20240105.014003.1588770.0001.txt' in response to an event
pshipton commented 10 months ago

In the 0.43 release build https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Release_testList_1/10/

tajila commented 10 months ago

The test forces OOME's so the following is expected:

19:24:18  JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Release_testList_1/aqa-tests/TKG/output_17055286614125/jdk_util_0/work/scratch/0/javacore.20240118.000034.2988659.0001.txt' in response to an event
19:24:18  JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Release_testList_1/aqa-tests/TKG/output_17055286614125/jdk_util_0/work/scratch/0/javacore.20240118.000034.2988659.0001.txt
19:24:18  JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Release_testList_1/aqa-tests/TKG/output_17055286614125/jdk_util_0/work/scratch/0/Snap.20240118.000034.2988659.0002.trc' in response to an event
19:24:18  JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Release_testList_1/aqa-tests/TKG/output_17055286614125/jdk_util_0/work/scratch/0/Snap.20240118.000034.2988659.0002.trc
19:24:18  JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
19:24:18  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2024/01/18 00:00:41 - please wait.

I think the only issue is that it is running slow, likely due to excessive GC's, which causes the timeout. The timeout handler is also slow for the same reasons.

This is an extreme test case, so I dont think it should block the release.

pshipton commented 9 months ago

@TobiAjila we can't have this test intermittently failing all the time. Should we exclude it or can it be fixed to be less aggressive?

tajila commented 9 months ago

The test is designed to stress OOM paths. I think now that we've fixed the segfault we can disable this test. Im not sure how to modify it to be less agressive.

tajila commented 9 months ago

Ill open a PR to exclude it

pshipton commented 8 months ago

I suppose this is perm excluded. Changing the label and closing.