eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Memory management is not correct when using cgroup v2 #14190

Closed pshipton closed 2 years ago

pshipton commented 2 years ago

See https://github.com/ibmruntimes/ci.docker/issues/124

pshipton commented 2 years ago

@tajila fyi

tajila commented 2 years ago

@babsingh Can you take a look at this one?

sharanpatil123 commented 2 years ago

@babsingh I have an internal sev1 case with the same issue. Can you please have a look at it?

babsingh commented 2 years ago

Yes, I will take a look.

babsingh commented 2 years ago

As per https://github.com/eclipse/omr/issues/1281, we only added support for cgroup v1 in OMR: https://github.com/eclipse/omr/pull/1310. Support for cgroup v2 is still pending. @ashu-mehra Can you point us to any resources (design docs, unmerged code, etc.) which can be used to support cgroup v2 in OMR?

ashu-mehra commented 2 years ago

@babsingh I didn't get a chance to explore the cgroup v2 mechanism. I did create an issue for it, but due to time constraints it never got worked on. The best resource I can point to is the man page for cgroups, which describes both versions; you can take it from there.
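
As a starting point for the v1/v2 distinction described in the cgroups man page, here is a minimal sketch (in Python for brevity, not OMR's C code) that classifies a system from the text of /proc/self/mounts. The function name `detect_cgroup_version` is hypothetical; the only fact relied on is that v1 hierarchies are mounted with filesystem type `cgroup` and the v2 unified hierarchy with type `cgroup2`.

```python
def detect_cgroup_version(mounts_text):
    """Classify the cgroup setup from the contents of /proc/self/mounts.

    Returns "v1", "v2", "hybrid" (both mounted), or None (no cgroup mounts).
    """
    fs_types = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device mountpoint fstype options dump pass
        if len(fields) >= 3:
            fs_types.add(fields[2])

    has_v1 = "cgroup" in fs_types
    has_v2 = "cgroup2" in fs_types
    if has_v1 and has_v2:
        return "hybrid"
    if has_v2:
        return "v2"
    if has_v1:
        return "v1"
    return None
```

On a Linux machine this could be called as `detect_cgroup_version(open("/proc/self/mounts").read())`. A native implementation would need the same three-way answer, since some hosts mount both hierarchies at once.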

DanHeidinga commented 2 years ago

@babsingh can you make a clear statement about the workaround?

babsingh commented 2 years ago

In the JDK17+ extensions repo, there is a Java API which supports both Cgroup V1 and V2: https://github.com/ibmruntimes/openj9-openjdk-jdk17/tree/openj9/src/java.base/linux/classes/jdk/internal/platform. It may be possible to backport this API to Java 8 and Java 11 if it can replace the uses of the OMR Cgroup native API within OpenJ9.

This Java API has only one native dependency: CgroupMetrics.isUseContainerSupport -> JVM_IsUseContainerSupport.

OpenJ9 should be able to use this API since it has its own implementation of JVM_IsUseContainerSupport.

OpenJ9 uses the OMR Cgroup native API in the following locations:

1. runtime/rasdump/javadump.cpp (writeCgroupMetrics)
    OMR Cgroup functions used:
        - omrsysinfo_cgroup_get_enabled_subsystems
        - omrsysinfo_cgroup_is_system_available
        - omrsysinfo_get_cgroup_subsystem_list
        - omrsysinfo_cgroup_subsystem_iterator_init
        - omrsysinfo_cgroup_subsystem_iterator_hasNext
        - omrsysinfo_cgroup_subsystem_iterator_metricKey
        - omrsysinfo_cgroup_subsystem_iterator_destroy

    OMR Cgroup data structs used:
        - OMRCgroupEntry
        - OMRCgroupMetricIteratorState
        - OMRCgroupMetricElement

2. runtime/gc_base/GCExtensions.cpp
    OMR Cgroup functions used:
        - omrsysinfo_cgroup_are_subsystems_enabled
        - omrsysinfo_cgroup_is_memlimit_set

3. runtime/vm/jvminit.c
    OMR Cgroup functions used:
        - omrsysinfo_cgroup_enable_subsystems
        - omrsysinfo_cgroup_get_available_subsystems

4. runtime/compiler/control/CompilationThread.cpp
    OMR Cgroup functions used:
        - omrsysinfo_cgroup_are_subsystems_enabled (OMR_CGROUP_SUBSYSTEM_MEMORY)

Can we replace the OMR Cgroup native API with the extensions repo Cgroup Java API?

Invoking the extensions repo Cgroup Java API in the above OpenJ9 locations may not be possible because the above locations won't allow execution of Java code through call-ins. Also, it may not be feasible to translate from the OMR Cgroup native API to the extensions repo Cgroup Java API (missing functionality). So, our best bet is to update the OMR Cgroup native API to support Cgroup v2.

pshipton commented 2 years ago

Not sure whether the previous comment is actually a question, but it arrived at the right conclusion. The usage in jvminit.c occurs during option processing, before Java code is running. The same applies to GC and probably to the JIT as well.

babsingh commented 2 years ago

From @mpirvu:

I discovered that published Semeru container images do not load the embedded AOT code because of various AOT incompatibility issues. This happens because the AOT code is not in "portable" format. This is most likely because the Semeru images were built in an environment with cgroups v2, where OpenJ9 is not able to detect that it is running in a container. I used a machine with cgroups v2 and generated a javacore while running in a container. The javacore showed:

1CICONTINFO    Running in container : FALSE
1CICGRPINFO    JVM support for cgroups enabled : FALSE

I am raising the priority of this issue since many customers running OpenJ9 in containers will not be able to use AOT and will experience a start-up slowdown. As I understand it, we do not have much control over the dockerhub process/environment that builds the images.

There is a temporary workaround for this issue: https://github.com/eclipse-openj9/openj9/issues/14190#issuecomment-1010122701. Docker containers can be switched back to cgroup v1 since both cgroup v1 and v2 should be available in newer containers.

The problem is that we don't control the process of building the Semeru images. Once those images have been produced, they contain non-portable AOT.
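
To check an existing javacore for the symptom shown above, a small sketch like the following may help. It is an illustration only: `container_flags` is a hypothetical helper, and it assumes the `1CICONTINFO` and `1CICGRPINFO` lines have exactly the `label : TRUE/FALSE` shape quoted in this comment.

```python
def container_flags(javacore_text):
    """Extract the container/cgroup TRUE/FALSE flags from javacore text.

    Returns a dict mapping each label to a bool.
    """
    flags = {}
    for line in javacore_text.splitlines():
        if line.startswith(("1CICONTINFO", "1CICGRPINFO")):
            # Drop the tag, then split "label : VALUE" on the last colon.
            _, _, rest = line.partition(" ")
            label, sep, value = rest.rpartition(":")
            if sep:
                flags[label.strip()] = value.strip().upper() == "TRUE"
    return flags
```

For the two lines quoted above, both flags come back as `False`, which is the failure mode being reported.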

babsingh commented 2 years ago

@mpirvu This work is being given high priority. We should have minimal support by April. This should address customer issues.

Global tracker for the OMR implementation

https://github.com/eclipse/omr/issues/1281

EricYangIBM commented 2 years ago

For the linked Docker issue, is the reported Max. Heap Size (Estimated): 512.00M calculated by j9gc_get_maximum_heap_size? As I understand it, memoryMax is calculated at https://github.com/eclipse/omr/blob/644b9078b90a441a73da0ab9da66dc60d283033a/gc/base/GCExtensionsBase.cpp#L305-L324 and is capped at 512M, and then MM_GCExtensions::computeDefaultMaxHeapForJava expands it if cgroup is enabled: https://github.com/eclipse-openj9/openj9/blob/cc586f03bd5359157f99fb342015f24f4e064755/runtime/gc_base/GCExtensions.cpp#L254-L255 Is this correct?

dmitripivkine commented 2 years ago

As you correctly pointed out, in MM_GCExtensions::computeDefaultMaxHeapForJava() we set memoryMax for cgroup and later adjust it to 25% of available RAM if it is larger (for Java 11 and up only; Java 8 has a hardcoded 512m limit). I guess you should not touch this logic, but instead make omrsysinfo_cgroup_are_subsystems_enabled() and omrsysinfo_cgroup_is_memlimit_set() work for cgroup v2.

EricYangIBM commented 2 years ago

In omrsysinfo_cgroup_is_memlimit_set: https://github.com/eclipse/omr/blob/644b9078b90a441a73da0ab9da66dc60d283033a/port/unix/omrsysinfo.c#L5967-L5974 My cgroup v1 machine reported a memory.limit_in_bytes of 9223372036854771712 for a simple Java test process, which would trigger this error. So basically, if the cgroup memory limit is not set, we don't expand the heap size (usablePhysicalMemory was not obtained from the cgroup memory limit file).

The following is from a Docker container with cgroup v1:

root@658e20312005:~/hostdir/openj9-openjdk-jdk8# build/linux-x86_64-normal-server-release/images/j2sdk-image/bin/java -XshowSettings:vm -XX:+OriginalJDK8HeapSizeCompatibilityMode -version
VM settings:
    Max. Heap Size (Estimated): 512.00M
...

To confirm, the memory limit was not set for this process (125):

root@658e20312005:/# cat /sys/fs/cgroup/memory/cgroup.procs 
1
115
125
153
root@658e20312005:/# cat /sys/fs/cgroup/memory/memory.limit_in_bytes 
9223372036854771712

So even with cgroup v1 the max heap size is 512M. I'm not sure this is because cgroups v2 is not implemented.
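
For reference, the value 9223372036854771712 is the largest signed 64-bit integer rounded down to a 4 KiB page boundary, which is how cgroup v1 reports "no limit"; cgroup v2 writes the literal string `max` to memory.max instead. The sketch below shows one way to treat both as "limit not set". It does not reproduce the OMR code at the linked lines; `parse_memory_limit` and the comparison against host physical memory are assumptions made for illustration.

```python
# Largest signed 64-bit value rounded down to a 4 KiB page (assumes 4 KiB pages).
V1_UNLIMITED = (2**63 - 1) // 4096 * 4096


def parse_memory_limit(raw, host_physical_bytes):
    """Return the cgroup memory limit in bytes, or None if no limit is set.

    raw is the content of memory.limit_in_bytes (v1) or memory.max (v2).
    """
    text = raw.strip()
    if text == "max":
        # cgroup v2 spelling of "unlimited"
        return None
    limit = int(text)
    # Assumption: a limit at or above host memory constrains nothing,
    # which also covers the huge cgroup v1 "unlimited" value.
    if limit >= host_physical_bytes:
        return None
    return limit
```

With this reading, the container above has no effective limit, so nothing prompts the heap to grow beyond the 512M default.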

babsingh commented 2 years ago

> So even with cgroup v1 the max heap size is 512M. I'm not sure this is because cgroups v2 is not implemented.

@EricYangIBM Study the behaviour reported in https://github.com/ibmruntimes/ci.docker/issues/124#issue-1084626215.

@dmitripivkine In https://github.com/ibmruntimes/ci.docker/issues/124#issue-1084626215, Max. Heap Size (Estimated): 3.00G is shown for J9 Java8. The hardcoded 512m limit is not being enforced for Java8.

EricYangIBM commented 2 years ago

Maybe that container has a cgroup limit enforced? That would explain the heap extended by https://github.com/eclipse-openj9/openj9/blob/cc586f03bd5359157f99fb342015f24f4e064755/runtime/gc_base/GCExtensions.cpp#L254-L255

babsingh commented 2 years ago

> Maybe that container has a cgroup limit enforced?

Yes, I think it is enforced via docker run -m 4GB.
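
The 3.00G figure is consistent with a 4 GB limit if the container-aware default is 75% of the limit. The sketch below encodes the tiers as I recall them from the OpenJ9 user documentation for -XX:+UseContainerSupport; treat the thresholds as an assumption, since the only data point confirmed in this thread is 4 GB giving 3.00G. The function name is hypothetical and this is not the GCExtensions.cpp code.

```python
MB = 1024 * 1024
GB = 1024 * MB


def container_default_max_heap(limit_bytes):
    """Default max heap for a given container memory limit (assumed tiers)."""
    if limit_bytes < 1 * GB:
        # Small containers: half of the limit
        return limit_bytes // 2
    if limit_bytes <= 2 * GB:
        # Mid-sized containers: leave 512 MB for everything else
        return limit_bytes - 512 * MB
    # Larger containers: three quarters of the limit
    return limit_bytes * 3 // 4
```

For example, `container_default_max_heap(4 * GB)` equals `3 * GB`, matching the value reported in the linked ci.docker issue.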

babsingh commented 2 years ago

Task 6 for cgroup v2 is complete (reference: https://github.com/eclipse/omr/issues/1281#issuecomment-1072796875). This means that cgroup v2 has the same functionality as cgroup v1. We should be able to verify if customer issues are fixed.

re https://github.com/eclipse-openj9/openj9/issues/14190#issuecomment-1068441339: @mpirvu Can you please confirm whether the "not be able to use AOT and experience a start-up slowdown" issue is resolved?
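
For anyone verifying that cgroup v2 matches cgroup v1 by hand, the interface files have different names and locations. The sketch below maps a few v1 files to their v2 counterparts as documented in the kernel's cgroup documentation. `metric_path` is a hypothetical helper, and the paths assume the process sits at the cgroup root as seen from inside the container, as in the /sys/fs/cgroup/memory/memory.limit_in_bytes example earlier in this thread.

```python
# (v1 subsystem, v1 file) -> v2 file in the unified hierarchy
V1_TO_V2 = {
    ("memory", "memory.limit_in_bytes"): "memory.max",
    ("memory", "memory.usage_in_bytes"): "memory.current",
    ("cpu", "cpu.cfs_quota_us"): "cpu.max",   # v2 stores "quota period" together
    ("cpu", "cpu.cfs_period_us"): "cpu.max",
    ("cpu", "cpu.shares"): "cpu.weight",      # same role, different scale
    ("cpuset", "cpuset.cpus"): "cpuset.cpus",
}


def metric_path(version, subsystem, v1_name, root="/sys/fs/cgroup"):
    """Return the file to read for a metric under cgroup "v1" or "v2"."""
    if version == "v1":
        # v1 mounts one hierarchy per subsystem
        return f"{root}/{subsystem}/{v1_name}"
    # v2 has a single unified hierarchy
    return f"{root}/{V1_TO_V2[(subsystem, v1_name)]}"
```

Note that the values are not always interchangeable: cpu.weight uses a different range than cpu.shares, so a reader has to convert rather than copy.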

mpirvu commented 2 years ago

> Can you please confirm if the not be able to use AOT and experience a start-up slowdown issue is resolved?

I will once nightly builds are available with the PR that was merged today

babsingh commented 2 years ago

> I will once nightly builds are available with the PR that was merged today

Referring to https://github.com/eclipse/omr/issues/1281#issuecomment-1072796875, the AOT issue should have been resolved by Tasks 1 and 2. Today's PR only updates the cgroup stats in the javacore.

babsingh commented 2 years ago

For the original issue, @EricYangIBM had verified that https://github.com/ibmruntimes/ci.docker/issues/124 was resolved in https://github.com/eclipse/omr/pull/6432#issuecomment-1096810188. So, it should also have been resolved by Tasks 1 and 2 in https://github.com/eclipse/omr/issues/1281#issuecomment-1072796875.

fyi @lgrateau

mpirvu commented 2 years ago

I verified on an Ubuntu 22.04 machine that the latest nightly build of OpenJ9 correctly determines that it is running in a container and sees the correct CPU and memory limits (on the same machine, the existing OpenJ9 0.32.0 does not read these limits correctly).
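
To cross-check the CPU limit the JVM reports on a cgroup v2 machine, the raw value can be read from cpu.max, which holds a quota and a period in microseconds (or `max` for no quota). A minimal sketch, with `cpu_limit_from_cpu_max` as a hypothetical helper name:

```python
def cpu_limit_from_cpu_max(raw):
    """Convert the contents of cgroup v2 cpu.max into a CPU count.

    Returns None when no quota is set, otherwise quota / period.
    """
    quota, period = raw.split()
    if quota == "max":
        return None
    return int(quota) / int(period)
```

A container started with a two-CPU limit would typically show `200000 100000` in /sys/fs/cgroup/cpu.max, which this converts to 2.0.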

babsingh commented 2 years ago

@tajila Why did we move this issue from the 0.33 to 0.34 release? As of now, cgroup v2 support is identical to cgroup v1 support. So, this issue can be closed once the end users verify the fix.

tajila commented 2 years ago

Keeping it open to track the test changes. If you have another item for that, we can close this.

babsingh commented 2 years ago

> If you have another item for that we can close this.

Those tests are OMR specific. In addition to those tests, all pending cgroup related OMR work is being tracked in https://github.com/eclipse/omr/issues/1281#issuecomment-1072796875.

This OpenJ9 issue was primarily opened to address cgroup related user issues. Since those issues have been resolved, we should close this issue.