eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 721 forks source link

openJcePlusTests_0_FAILED ibm.jceplus.junit.openjceplus.multithread.TestAES_128 detail "java/lang/OutOfMemoryError", exception "native memory exhausted" #19887

Open JasonFengJ9 opened 1 month ago

JasonFengJ9 commented 1 month ago

Failure link

From an internal build(paix825):

java version "21.0.4-beta" 2024-07-16
IBM Semeru Runtime Certified Edition 21.0.4+6-202407180209 (build 21.0.4-beta+6-202407180209)
Eclipse OpenJ9 VM 21.0.4+6-202407180209 (build master-0e944e75c8, JRE 21 AIX ppc64-64-Bit Compressed References 20240718_216 (JIT enabled, AOT enabled)
OpenJ9   - 0e944e75c8
OMR      - de311dd2b
JCL      - 4f7da7b3b based on jdk-21.0.4+6)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2024-07-18T19:24:08.242Z] variation: NoOptions
[2024-07-18T19:24:08.242Z] JVM_OPTIONS:  

[2024-07-18T19:25:07.405Z]      [test]     [junit] Test calling: ibm.jceplus.junit.openjceplus.multithread.TestAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:07.405Z]      [test]     [junit] executing testAES_128
[2024-07-18T19:25:20.734Z]      [test]     [junit] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2024/07/18 15:25:18 - please wait.

[2024-07-18T19:25:44.697Z]      [test]     [junit] Test finished: ibm.jceplus.junit.openjceplus.multithread.TestAES_128

[2024-07-18T19:25:47.016Z] openJcePlusTests_0_FAILED

50x internal grinder - 5 failures reproduced along with a few others.

JasonFengJ9 commented 1 month ago

Also seen at JDK22 ppc64_aix 0.46 rc1

openjdk version "22.0.2" 2024-07-16
IBM Semeru Runtime Open Edition 22.0.2.0-rc1 (build 22.0.2+9)
Eclipse OpenJ9 VM 22.0.2.0-rc1 (build v0.46.0-release-6c99fa94be, JRE 22 AIX ppc64-64-Bit Compressed References 20240716_17 (JIT enabled, AOT enabled)
OpenJ9   - 6c99fa94be
OMR      - 840a9adba
JCL      - cfe2b5689d1 based on jdk-22.0.2+9)
[2024-07-22T15:14:18.640Z]      [test]     [junit] executing testAES_128
[2024-07-22T15:15:14.231Z]      [test]     [junit] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2024/07/22 11:15:07 - please wait.

[2024-07-22T15:15:53.579Z] openJcePlusTests_0_FAILED
pshipton commented 1 month ago

@KostasTsiounis pls help find someone to take a look at this failure, we need to determine if a fix is required in 0.46.

pshipton commented 1 month ago

Related to https://github.com/eclipse-openj9/openj9/issues/19642?

KostasTsiounis commented 1 month ago

This is the same test, but a different error. https://github.com/eclipse-openj9/openj9/issues/19642 was a segmentation fault, while this is an out of memory, more specifically java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11.

We have observed similar failures in a memory issue that was originally reported in Java 8 (that involves the test that was running MessageDigest.getInstance() by multiple threads in a tight loop to which we added a explicit call to the GC). It looks like the fact that we are running so many threads leads to the GC being overwhelmed and eventually running out of memory. AIX seems to be affected more.

I don't think this is a blocker for 0.46. I am already working on a solution to replace finalizers with Cleaners, but it will most likely be ready for the next release and I am not exactly sure how much it will help eventually.

I will look further look into it, but I don't think this a common use case where that many threads run so many iterations of algorithms. Plus, we haven't encountered in other platforms, unless we crank up the threads to 20 or more per test.

jasonkatonica commented 1 month ago

Most likely this should be fixed due to a overlay that was found in https://github.com/eclipse-openj9/openj9/issues/19642. I believe that this issue could be closed until we see this occur again.

JasonFengJ9 commented 1 month ago

Closing as per https://github.com/eclipse-openj9/openj9/issues/19887#issuecomment-2264000945

JasonFengJ9 commented 1 month ago

JDK11 ppc64_aix(pll011)

java version "11.0.25-beta" 2024-10-15
IBM Semeru Runtime Certified Edition 11.0.25+1-202408052207 (build 11.0.25-beta+1-202408052207)
Eclipse OpenJ9 VM 11.0.25+1-202408052207 (build master-27dac39d2c, JRE 11 AIX ppc64-64-Bit Compressed References 20240805_902 (JIT enabled, AOT enabled)
OpenJ9   - 27dac39d2c
OMR      - 9ccff3cf2
JCL      - f10542c9cc based on jdk-11.0.25+1)

[2024-08-06T00:19:44.073Z] variation: NoOptions
[2024-08-06T00:19:44.073Z] JVM_OPTIONS:  

[2024-08-06T00:20:08.817Z]      [test]     [junit] executing testAES_128
[2024-08-06T00:20:20.276Z]      [test]     [junit] JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2024/08/05 20:20:19 - please wait.
[2024-08-06T00:20:20.276Z]      [test]     [junit] JVMDUMP032I JVM requested System dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/core.20240805.202019.20971788.0001.dmp' in response to an event
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP010I System dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/core.20240805.202019.20971788.0001.dmp
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP032I JVM requested Heap dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/heapdump.20240805.202019.20971788.0002.phd' in response to an event
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP016I Aborting: Cannot create file (/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/heapdump.20240805.202019.20971788.0002.phd)
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/javacore.20240805.202019.20971788.0003.txt' in response to an event
[2024-08-06T00:21:12.571Z]      [test]     [junit] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 74.347 sec
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/javacore.20240805.202019.20971788.0003.txt
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP032I JVM requested Snap dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/Snap.20240805.202019.20971788.0004.trc' in response to an event
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP010I Snap dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/Snap.20240805.202019.20971788.0004.trc
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMTRCE010W Native memory allocation failure, falling back to nodynamic trace settings.
[2024-08-06T00:21:12.571Z]      [test]     [junit] JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
[2024-08-06T00:21:12.571Z]      [test] 
[2024-08-06T00:21:12.571Z]      [test] BUILD FAILED
[2024-08-06T00:21:12.571Z]      [test] /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/test.xml:69: Test ibm.jceplus.junit.TestMultithread failed
[2024-08-06T00:21:12.571Z]      [test] 
[2024-08-06T00:21:12.571Z]      [test] Total time: 1 minute 25 seconds
[2024-08-06T00:21:12.571Z] 
[2024-08-06T00:21:12.571Z] BUILD FAILED
[2024-08-06T00:21:12.571Z] /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_ppc64_aix_testList_0/aqa-tests/TKG/output_17229026382186/openJcePlusTests_0/test.xml:35: Java returned: 1
[2024-08-06T00:21:12.571Z] 
[2024-08-06T00:21:12.571Z] Total time: 1 minute 27 seconds
[2024-08-06T00:21:12.571Z] -----------------------------------
[2024-08-06T00:21:12.571Z] openJcePlusTests_0_FAILED

50x grinder - reproduced once, other failures are https://github.com/eclipse-openj9/openj9/issues/19981.

This failure appeared again in yesterday's build, and re-open.