eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.29k stars 723 forks source link

testSoftMxDisclaimMemory [ERROR] FAIL: Heap can't shrink to the reset max heap size within 5 minutes! #15672

Open JasonFengJ9 opened 2 years ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build(rhel7s390x-3-1):

java version "11.0.16" 2022-07-19
IBM Semeru Runtime Certified Edition 11.0.16.0 (build 11.0.16+8)
Eclipse OpenJ9 VM 11.0.16.0 (build openj9-0.33.0, JRE 11 Linux s390x-64-Bit Compressed References 20220804_350 (JIT enabled, AOT enabled)
OpenJ9   - 04a55b45b
OMR      - b58aa2708
JCL      - 072c5dcdb7 based on jdk-11.0.16+8)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2022-08-05T00:26:55.823Z] variation: Mode501
[2022-08-05T00:26:55.823Z] JVM_OPTIONS:  -Xjit -Xgcpolicy:balanced -Xnocompressedrefs 

[2022-08-05T00:32:10.705Z] [SoftmxAdvanceTest] [ERROR]  FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
[2022-08-05T00:37:10.215Z] [SoftmxAdvanceTest] [ERROR]  FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
[2022-08-05T00:37:11.793Z] [SoftmxAdvanceTest] [ERROR]  FAIL: Memory shouldn't release back to OS! Post memory size 123891712 is greater than previous memory size 76189696
[2022-08-05T00:37:18.971Z] [SoftmxAdvanceTest] [ERROR]  FAIL: Memory shouldn't release back to OS! Post memory size 128049152 is greater than previous memory size 120745984

[2022-08-05T00:42:22.344Z] [SoftmxAdvanceTest] [ERROR]  FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
[2022-08-05T00:42:22.807Z] FAILED: testDisclaimMemoryEffects(5)
[2022-08-05T00:42:22.807Z] java.lang.AssertionError: Test has failed after 5 attempts
[2022-08-05T00:42:22.807Z]  at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
[2022-08-05T00:42:22.807Z]  at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24)
[2022-08-05T00:42:22.807Z]  at j9vm.test.softmx.SoftmxAdvanceTest.testDisclaimMemoryEffects(SoftmxAdvanceTest.java:70)
[2022-08-05T00:42:22.807Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2022-08-05T00:42:22.807Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2022-08-05T00:42:22.807Z]  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2022-08-05T00:42:22.807Z]  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.Invoker.invokeMethod(Invoker.java:580)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:988)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
[2022-08-05T00:42:22.807Z]  at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
[2022-08-05T00:42:22.807Z]  at org.testng.TestRunner.privateRun(TestRunner.java:648)
[2022-08-05T00:42:22.807Z]  at org.testng.TestRunner.run(TestRunner.java:505)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunner.runTest(SuiteRunner.java:455)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:450)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunner.privateRun(SuiteRunner.java:415)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunner.run(SuiteRunner.java:364)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
[2022-08-05T00:42:22.807Z]  at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.runSuitesSequentially(TestNG.java:1208)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.runSuitesLocally(TestNG.java:1137)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.runSuites(TestNG.java:1049)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.run(TestNG.java:1017)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.privateMain(TestNG.java:1354)
[2022-08-05T00:42:22.807Z]  at org.testng.TestNG.main(TestNG.java:1323)
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] ===============================================
[2022-08-05T00:42:22.807Z]     testSoftMxDisclaimMemory
[2022-08-05T00:42:22.807Z]     Tests run: 1, Failures: 1, Skips: 0
[2022-08-05T00:42:22.807Z] ===============================================
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] ===============================================
[2022-08-05T00:42:22.807Z] JLM_Tests_Suite
[2022-08-05T00:42:22.807Z] Total tests run: 1, Failures: 1, Skips: 0
[2022-08-05T00:42:22.807Z] ===============================================
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] 
[2022-08-05T00:42:22.807Z] testSoftMxNotDisclaimMemory_zlinux_64_3_FAILED

Related to https://github.com/eclipse-openj9/openj9/issues/4475

cjjdespres commented 4 months ago

https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal/1075/

# DETECTED_JAVA_VERSION=openjdk version "11.0.24-internal" 2024-07-16
# OpenJDK Runtime Environment (build 11.0.24-internal+0-adhoc..BuildJDK11s390xlinuxjitPersonal)
# Eclipse OpenJ9 VM (build master-abc6fa4ae38, JRE 11 Linux s390x-64-Bit Compressed References 20240711_1485 (JIT enabled, AOT enabled)
# OpenJ9   - abc6fa4ae38
# OMR      - 5e425ebfc86
# JCL      - fa2af551c3d based on jdk-11.0.24+7)
# DETECTED_RELEASE_INFO=IMPLEMENTOR="Eclipse OpenJ9"
# JAVA_RUNTIME_VERSION="11.0.24-internal+0-adhoc..BuildJDK11s390xlinuxjitPersonal"
# JAVA_VERSION="11.0.24"
# JAVA_VERSION_DATE="2024-07-16"
# LIBC="gnu"

# OS_ARCH="s390x"
# OS_NAME="Linux"
# SOURCE="OpenJDK:fa2af551c3d OpenJ9:abc6fa4ae38 OMR:5e425ebfc86"
# 
# Hostname: rhel9s390x-svl-rt10-1.fyre.ibm.com

not ok 165 - testSoftMxNotDisclaimMemory_zlinux_64_3
  ---
    output:
      |
        ===============================================
        ===============================================
        testSoftMxNotDisclaimMemory_zlinux_64_3 Start Time: Thu Jul 11 02:11:17 2024 Epoch Time (ms): 1720689077502
        variation: Mode501
        JVM_OPTIONS: -XX:+UseJITServer -Xjit -Xgcpolicy:balanced -Xnocompressedrefs 

        TEST SETUP:
        JVMSHRC005I No shared class caches available
        JVMSHRC005I No shared class caches available
        cache cleanup done

        TESTING:
        [IncludeExcludeTestAnnotationTransformer] [INFO] EXCLUDE_FILE environment variable: /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/../TestConfig/resources/excludes/latest_exclude_11.txt
        [IncludeExcludeTestAnnotationTransformer] [INFO] Processing exclude file: /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/../TestConfig/resources/excludes/latest_exclude_11.txt
        ...
        ... TestNG 6.14.2 by Cédric Beust (cedric@beust.com)
        ...

        [SoftmxAdvanceTest] [WARN]  Generate Java core dump after forcing GC.
        JVMDUMP034I User requested Java dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.021622.3274543.0001.txt' through com.ibm.jvm.Dump.JavaDump
        JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.021622.3274543.0001.txt
        JVMDUMP034I User requested Heap dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.021622.3274543.0002.phd' through com.ibm.jvm.Dump.HeapDump
        JVMDUMP010I Heap dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.021622.3274543.0002.phd
        JVMDUMP034I User requested System dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.021623.3274543.0003.dmp' through com.ibm.jvm.Dump.SystemDump
        JVMDUMP010I System dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.021623.3274543.0003.dmp
        [SoftmxAdvanceTest] [ERROR]     FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
        [SoftmxAdvanceTest] [WARN]  Generate Java core dump after forcing GC.
        JVMDUMP034I User requested Java dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.022125.3274543.0004.txt' through com.ibm.jvm.Dump.JavaDump
        JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.022125.3274543.0004.txt
        JVMDUMP034I User requested Heap dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.022125.3274543.0005.phd' through com.ibm.jvm.Dump.HeapDump
        JVMDUMP010I Heap dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.022125.3274543.0005.phd
        JVMDUMP034I User requested System dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.022125.3274543.0006.dmp' through com.ibm.jvm.Dump.SystemDump
        JVMDUMP010I System dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.022125.3274543.0006.dmp
        [SoftmxAdvanceTest] [ERROR]     FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
        [SoftmxAdvanceTest] [ERROR]     FAIL: Memory shouldn't release back to OS! Post memory size 1695461376 is greater than previous memory size 1607172096
        [SoftmxAdvanceTest] [ERROR]     FAIL: Memory shouldn't release back to OS! Post memory size 1734246400 is greater than previous memory size 1694412800
        [SoftmxAdvanceTest] [WARN]  Generate Java core dump after forcing GC.
        JVMDUMP034I User requested Java dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.022642.3274543.0007.txt' through com.ibm.jvm.Dump.JavaDump
        JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/javacore.20240711.022642.3274543.0007.txt
        JVMDUMP034I User requested Heap dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.022642.3274543.0008.phd' through com.ibm.jvm.Dump.HeapDump
        JVMDUMP010I Heap dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/heapdump.20240711.022642.3274543.0008.phd
        JVMDUMP034I User requested System dump using '/home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.022642.3274543.0009.dmp' through com.ibm.jvm.Dump.SystemDump
        JVMDUMP010I System dump written to /home/jenkins/workspace/Test_openjdk11_j9_extended.functional_s390x_linux_jit_Personal_testList_1/aqa-tests/TKG/output_17206828973948/testSoftMxNotDisclaimMemory_zlinux_64_3/core.20240711.022642.3274543.0009.dmp
        [SoftmxAdvanceTest] [ERROR]     FAIL: Heap can't shrink to the reset max heap size within 5 minutes!
        FAILED: testDisclaimMemoryEffects(5)
        java.lang.AssertionError: Test has failed after 5 attempts
cjjdespres commented 3 months ago

This has started failing somewhat frequently in the JITServer nightly tests again, e.g. here: https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk8_j9_extended.functional_s390x_linux_jit_Personal_testList_1/1084/ on rhel8s390x-svl-rt5-1.fyre.ibm.com. Normally I'd think that maybe the use of JITServer is making the failure happen more frequently, but this test runs with -Xnocompressedrefs, so the JITServer shouldn't have been in use during these failures.

dmitripivkine commented 3 months ago

This test was unreliable from the very beginning. Please note there are a few aspects of this test:

Another (completely orthogonal) point: please note that there are two Indexable objects headers types selected dynamically run-time now. Standard collectors (Gencon, Optthruput and Opeavgpause) and Metronome use one of them (default) and Balanced uses another one (extended by dataAddr field). It makes code generated for Gencon be incompatible for Balanced. I don't know can this fact be related to these failures.

dmitripivkine commented 3 months ago

And GC verbose log from your build shows requested heap contraction instantly:

  <mem-info id="3711" free="30408704" total="444596224" percent="6">
  <mem-info id="3718" free="30283464" total="444596224" percent="6">
  <mem-info id="3725" free="266338304" total="268435456" percent="99"> <---
  <mem-info id="3733" free="266292280" total="268435456" percent="99">

So, GC works as expected. There are many factors can make this test fail. I think non of them relates to JVM directly (but test running environment).