eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 722 forks source link

AssertionError with testOSMXBeanLocal #12038

Open knn-k opened 3 years ago

knn-k commented 3 years ago

AssertionError "Free Physical Memory size cannot be greater than total Physical Memory Size." with OSMXBeanLocal tests.

Failure link

https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_extended.functional_aarch64_linux/119/ (on build-docker-fedora33-armv8-4) https://ci.adoptopenjdk.net/view/Test_functional/job/Test_openjdk11_j9_extended.functional_aarch64_linux_xl/5/ (on build-docker-ubuntu1804-armv8-4) https://ci.adoptopenjdk.net/view/Test_functional/job/Test_openjdk11_j9_extended.functional_aarch64_linux_xl/4/ (on build-docker-ubuntu1804-armv8-5) https://ci.adoptopenjdk.net/view/Test_functional/job/Test_openjdk11_j9_extended.functional_aarch64_linux_xl/3/ (on build-docker-fedora33-armv8-5) https://ci.adoptopenjdk.net/view/Test_functional/job/Test_openjdk11_j9_extended.functional_aarch64_linux_xl/1/ (on build-docker-ubuntu1804-armv8-5)

Optional info

Failure output (captured from console output)

[TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
[TestOperatingSystemMXBean] [INFO]  Starting the OperatingSystemMXBean usage API tests........
[TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
[TestOperatingSystemMXBean] [INFO] Tests starting at: 1614111474129 ms.
[TestOperatingSystemMXBean] [INFO] Testing getTotalPhysicalMemory() & getFreePhysicalMemorySize() APIs
[TestOperatingSystemMXBean] [INFO] Tests finished at: 1614111474145 ms.
FAILED: runTestOSMXBean
java.lang.AssertionError: Free Physical Memory size cannot be greater than total Physical Memory Size. expected [false] but found [true]
    at org.testng.Assert.fail(Assert.java:96)
    at org.testng.Assert.failNotEquals(Assert.java:776)
    at org.testng.Assert.assertFalse(Assert.java:65)
    at org.openj9.test.management.TestOperatingSystemMXBean.test_PhysicalMemoryAPIs(TestOperatingSystemMXBean.java:263)
    at org.openj9.test.management.TestOperatingSystemMXBean.runTestOSMXBean(TestOperatingSystemMXBean.java:170)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
knn-k commented 3 years ago

The failure was reproduced in a Grinder run on build-docker-ubuntu1804-armv8-3 https://ci.adoptopenjdk.net/job/Grinder/7219/:

[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO]  Starting the OperatingSystemMXBean usage API tests........
[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO] Tests starting at: 1614146376557 ms.
[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO] Testing getTotalPhysicalMemory() & getFreePhysicalMemorySize() APIs
[2021-02-24T05:59:36.737Z] [TestOperatingSystemMXBean] [INFO] Tests finished at: 1614146376571 ms.
[2021-02-24T05:59:37.363Z] FAILED: runTestOSMXBean
[2021-02-24T05:59:37.363Z] java.lang.AssertionError: Free Physical Memory size cannot be greater than total Physical Memory Size. expected [false] but found [true]
[2021-02-24T05:59:37.363Z]  at org.testng.Assert.fail(Assert.java:96)
[2021-02-24T05:59:37.363Z]  at org.testng.Assert.failNotEquals(Assert.java:776)
[2021-02-24T05:59:37.363Z]  at org.testng.Assert.assertFalse(Assert.java:65)
[2021-02-24T05:59:37.363Z]  at org.openj9.test.management.TestOperatingSystemMXBean.test_PhysicalMemoryAPIs(TestOperatingSystemMXBean.java:263)
[2021-02-24T05:59:37.363Z]  at org.openj9.test.management.TestOperatingSystemMXBean.runTestOSMXBean(TestOperatingSystemMXBean.java:170)

I cannot reproduce the failure on my machine using the binary OpenJDK11U-jdk_aarch64_linux_openj9_2021-02-23-18-05.tar.gz from https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-linux-aarch64-openj9/673/.

Environment-dependent problem?

knn-k commented 3 years ago

A run on test-alibaba-ubuntu1804-armv8-1 was successful. https://ci.adoptopenjdk.net/view/Test_functional/job/Test_openjdk11_j9_extended.functional_aarch64_linux_xl/2/

knn-k commented 3 years ago

https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_extended.functional_aarch64_linux/117/ on build-docker-fedora33-armv8-5 was a successful run, but that was before the mixed refs build became the default.

sxa commented 3 years ago

Probably worth us testing a standalone java app that calls those two system calls and test that on various systems - unfortunatley the test isn't printing the values it has receieved so can't be debugged directly from that output. (Are those J9-specific methods being called?)

The question may become "What should it return"? particularly in the case where a docker image is capped on a host system.

knn-k commented 3 years ago

The location of the assertion is https://github.com/eclipse/openj9/blob/4913c6500c08b8e933756f9a05807c7b26c63999/test/functional/JLM_Tests/src/org/openj9/test/management/TestOperatingSystemMXBean.java#L263-L264

The test obtains the values of totalMemory and freeMemory from com.ibm.lang.management.OperatingSystemMXBean.

sxa commented 3 years ago

I'm not at all familiar with the mxbean stuff - is there a simple way to instantiate an OperatingSystemMXBean to call those functions on it?

knn-k commented 3 years ago

I am not familiar, either. Maybe I can try writing code by looking at the testcase.

sxa commented 3 years ago

There are debug statements in the test that print the values out so if we have an easy way to run it in debug mode that might do (if I knew where the debug stuff ends up!)

knn-k commented 3 years ago

Simple code to run:

import java.lang.management.ManagementFactory;

class Issue12038 {
    public static void main(String[] args) {
        com.ibm.lang.management.OperatingSystemMXBean osmxbean =
            (com.ibm.lang.management.OperatingSystemMXBean)ManagementFactory.getOperatingSystemMXBean();
        long totalMemory = osmxbean.getTotalPhysicalMemory();
        long freeMemory = osmxbean.getFreePhysicalMemorySize();
        System.out.println("totalMemory = " + totalMemory);
        System.out.println("freeMemory  = " + freeMemory);
        if (freeMemory > totalMemory) {
            System.out.println("*** Unexpected result ***");
        }
    }
}
sxa commented 3 years ago

Thanks :-) That shows the problem - showing 8Gb total, just under 100Gb free

totalMemory = 8589934592
freeMemory  = 99051855872

The data from /proc/meminfo says:

MemTotal:       131753604 kB
MemFree:        96757992 kB
MemAvailable:   123620564 kB

free says:

[jenkins@da9180a41697 sxa]$ free
              total        used        free      shared  buff/cache   available
Mem:      131753604     6024264    96757196     1137172    28972144   123619844
Swap:       4392300      786280     3606020

Does sound like however it's doing the querying it's getting an inconsistent set of numbers - one based on the size of the container (total memory) and the other not (free memory) - two numbers which are incompatible. I'd suggest this is probably an OpenJ9/OMR bug. Should be reproducible if you run the container with --memory=2G (Example figures above were from it set to 8G as that's what the particular agent was set to but depends how much you want to give to it. I haven't tried on an x64 system yet but I'd imagine it's do the same.

sxa commented 3 years ago

@dinogun I think you did some work on container optimisation for OpenJ9 - would you expect the function calls above to be returning the numbers they are ain restricted container?

lumpfish commented 3 years ago

Seen on the x86-64_linux jdk16+36_openj9-0.25.0-m2 build: https://ci.adoptopenjdk.net/job/Test_openjdk16_j9_extended.functional_x86-64_linux/39/consoleFull

11:28:05  openjdk version "16" 2021-03-16
11:28:05  OpenJDK Runtime Environment AdoptOpenJDK (build 16+36-202103011042)
11:28:05  Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.25.0-m2, JRE 16 Linux amd64-64-Bit Compressed References 20210301_60 (JIT enabled, AOT enabled)
11:28:05  OpenJ9   - 212eb1038
11:28:05  OMR      - 09514431e
11:28:05  JCL      - 27a256c845f based on jdk-16+36)
14:47:04  [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
14:47:04  [TestOperatingSystemMXBean] [INFO]  Starting the OperatingSystemMXBean usage API tests........
14:47:04  [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
14:47:04  [TestOperatingSystemMXBean] [INFO] Tests starting at: 1614610024051 ms.
14:47:04  [TestOperatingSystemMXBean] [INFO] Testing getTotalPhysicalMemory() & getFreePhysicalMemorySize() APIs
14:47:04  [TestOperatingSystemMXBean] [INFO] Tests finished at: 1614610024056 ms.
14:47:04  FAILED: runTestOSMXBean
14:47:04  java.lang.AssertionError: Free Physical Memory size cannot be greater than total Physical Memory Size. expected [false] but found [true]
14:47:04    at org.testng.Assert.fail(Assert.java:96)
14:47:04    at org.testng.Assert.failNotEquals(Assert.java:776)
14:47:04    at org.testng.Assert.assertFalse(Assert.java:65)
14:47:04    at org.openj9.test.management.TestOperatingSystemMXBean.test_PhysicalMemoryAPIs(TestOperatingSystemMXBean.java:263)
14:47:04    at org.openj9.test.management.TestOperatingSystemMXBean.runTestOSMXBean(TestOperatingSystemMXBean.java:170)
14:47:04    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
14:47:04    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
14:47:04    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
14:47:04    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
14:47:04    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
14:47:04    at org.testng.internal.Invoker.invokeMethod(Invoker.java:580)
14:47:04    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716)
14:47:04    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:988)
14:47:04    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
14:47:04    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
14:47:04    at org.testng.TestRunner.privateRun(TestRunner.java:648)
14:47:04    at org.testng.TestRunner.run(TestRunner.java:505)
14:47:04    at org.testng.SuiteRunner.runTest(SuiteRunner.java:455)
14:47:04    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:450)
14:47:04    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:415)
14:47:04    at org.testng.SuiteRunner.run(SuiteRunner.java:364)
14:47:04    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
14:47:04    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
14:47:04    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1208)
14:47:04    at org.testng.TestNG.runSuitesLocally(TestNG.java:1137)
14:47:04    at org.testng.TestNG.runSuites(TestNG.java:1049)
14:47:04    at org.testng.TestNG.run(TestNG.java:1017)
14:47:04    at org.testng.TestNG.privateMain(TestNG.java:1354)
14:47:04    at org.testng.TestNG.main(TestNG.java:1323)
knn-k commented 3 years ago

Removed "AArch64:" from the title.

pshipton commented 3 years ago

@tajila fyi

tajila commented 3 years ago

We use omrvmem_get_available_physical_memory to return osmxbean.getFreePhysicalMemorySize() This uses availiablePages (_SC_AVPHYS_PAGES).

omrsysinfo_get_physical_memory to return osmxbean.getTotalPhysicalMemory(). This looks at cgroups memlimits first (if available), if not it looks at total pages (_SC_PHYS_PAGES).

This confirms @sxa analysis above.

We will need to add a container aware version of omrvmem_get_available_physical_memory to fix this issue.

lumpfish commented 3 years ago

Seen on the x86-64_linux jdk-11.0.11+5_openj9-0.26.0-m1 build.

openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.11+5-202103100202)
Eclipse OpenJ9 VM AdoptOpenJDK (build master-c11f57cfb, JRE 11 AIX ppc64-64-Bit Compressed References 20210309_941 (JIT enabled, AOT enabled)
OpenJ9   - c11f57cfb
OMR      - 631fc1da2
JCL      - 8d7fbfd341 based on jdk-11.0.11+5)
15:49:38  [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
15:49:38  [TestOperatingSystemMXBean] [INFO]  Starting the OperatingSystemMXBean usage API tests........
15:49:38  [TestOperatingSystemMXBean] [INFO]  ----------------------------------------------------------
15:49:38  [TestOperatingSystemMXBean] [INFO] Tests starting at: 1615304978563 ms.
15:49:38  [TestOperatingSystemMXBean] [INFO] Testing getTotalPhysicalMemory() & getFreePhysicalMemorySize() APIs
15:49:38  [TestOperatingSystemMXBean] [INFO] Tests finished at: 1615304978569 ms.
15:49:39  FAILED: runTestOSMXBean
15:49:39  java.lang.AssertionError: Free Physical Memory size cannot be greater than total Physical Memory Size. expected [false] but found [true]
15:49:39    at org.testng.Assert.fail(Assert.java:96)
15:49:39    at org.testng.Assert.failNotEquals(Assert.java:776)
15:49:39    at org.testng.Assert.assertFalse(Assert.java:65)
15:49:39    at org.openj9.test.management.TestOperatingSystemMXBean.test_PhysicalMemoryAPIs(TestOperatingSystemMXBean.java:263)
15:49:39    at org.openj9.test.management.TestOperatingSystemMXBean.runTestOSMXBean(TestOperatingSystemMXBean.java:170)
15:49:39    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
15:49:39    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
15:49:39    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
15:49:39    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
15:49:39    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
15:49:39    at org.testng.internal.Invoker.invokeMethod(Invoker.java:580)
15:49:39    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716)
15:49:39    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:988)
15:49:39    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
15:49:39    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
15:49:39    at org.testng.TestRunner.privateRun(TestRunner.java:648)
15:49:39    at org.testng.TestRunner.run(TestRunner.java:505)
15:49:39    at org.testng.SuiteRunner.runTest(SuiteRunner.java:455)
15:49:39    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:450)
15:49:39    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:415)
15:49:39    at org.testng.SuiteRunner.run(SuiteRunner.java:364)
15:49:39    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
15:49:39    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
15:49:39    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1208)
15:49:39    at org.testng.TestNG.runSuitesLocally(TestNG.java:1137)
15:49:39    at org.testng.TestNG.runSuites(TestNG.java:1049)
15:49:39    at org.testng.TestNG.run(TestNG.java:1017)
15:49:39    at org.testng.TestNG.privateMain(TestNG.java:1354)
15:49:39    at org.testng.TestNG.main(TestNG.java:1323)
15:49:39  
tajila commented 3 years ago

@dinogun Are you able to look at this?

andrew-m-leonard commented 3 years ago

@pshipton we're seeing this failure repeatedly at Adopt, can we get this excluded for the moment?

pshipton commented 3 years ago

Created https://github.com/eclipse/openj9/pull/12420 to exclude the test on xlinux and aarch64. If there are other platforms pls let me know.

andrew-m-leonard commented 3 years ago

@pshipton Seen on pLinux as well: https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_extended.functional_ppc64le_linux/117/testReport/junit/org.openj9.test.management/TestOperatingSystemMXBean/runTestOSMXBean/ Which is a docker machine as well.

pshipton commented 3 years ago

The test is excluded when running on docker. Also in the 0.26.0 release branch.