OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

CpuInfo needs to handle "cgroups v2" filesystem structure #24588

Open gjdeval opened 1 year ago

gjdeval commented 1 year ago

CpuInfo scrapes the filesystem for info about cgroup cpu limits, which are used by Linux containers.

The Linux 4.5 kernel introduced a new filesystem structure as part of "cgroups v2". On systems that use it, CpuInfo does not find the cpu limits data. CpuInfo needs to be updated so that it can read cpu limit data in "cgroups v2" environments as well as in the original cgroup structure.
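For illustration, here is a minimal sketch of parsing the cgroups v2 limit. It assumes the limit is exposed in a `cpu.max` file holding two fields, quota and period, where the quota may be the literal `max` when no limit is set (as described in the kernel's cgroup v2 guide). The class and method names are hypothetical and are not part of CpuInfo.

```java
// Hypothetical helper, not part of Liberty: parses cgroup v2 "cpu.max" content.
final class CgroupV2CpuMax {

    /**
     * Parses the content of a cpu.max file ("<quota> <period>").
     * Returns quota/period as a fractional cpu count, or -1 when no limit
     * is set ("max") or the content cannot be parsed.
     */
    static float parseCpuMax(String content) {
        if (content == null) {
            return -1f;
        }
        String[] parts = content.trim().split("\\s+");
        if (parts.length != 2 || "max".equals(parts[0])) {
            return -1f; // unlimited or unexpected layout
        }
        try {
            long quota = Long.parseLong(parts[0]);
            long period = Long.parseLong(parts[1]);
            if (quota <= 0 || period <= 0) {
                return -1f;
            }
            return (float) quota / period;
        } catch (NumberFormatException e) {
            return -1f;
        }
    }
}
```

With this sketch, the content `50000 100000` would yield 0.5 cpus, and `max 100000` would yield -1 (no limit found), matching the -1 sentinel that the existing v1 code appears to start from.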

Until this issue is resolved, when Liberty is deployed in a container or Kube pod on a 4.5+ Linux kernel, Liberty functions that depend on the number of cpus available will depend on the JDK for that info. If "cpusAvailable" is not provided by the JDK, those Liberty functions will not operate as expected. In particular, java.lang.Runtime.availableProcessors() returns an integer, so when a pod is deployed with a fractional cpu allocation, CpuInfo will report cpuUsage incorrectly: it will not have the allocated cpu fraction available to use as the denominator of the cpuUsage calculation.
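To make the denominator problem concrete, here is a small arithmetic sketch. It assumes cpuUsage is computed as cpu time consumed divided by (elapsed time × available cpus). That formula is an assumption for illustration, not a copy of the CpuInfo calculation, and the names are hypothetical.

```java
// Hypothetical illustration of how the cpu-count denominator affects cpuUsage.
final class CpuUsageExample {

    /** Percentage of the available cpu capacity consumed during the interval. */
    static double cpuUsagePercent(double cpuSecondsUsed, double wallSeconds, double cpusAvailable) {
        return 100.0 * cpuSecondsUsed / (wallSeconds * cpusAvailable);
    }
}
```

For a pod limited to 0.5 cpus that consumes 0.25 cpu-seconds in one second, `cpuUsagePercent(0.25, 1.0, 0.5)` gives 50.0. If the only figure available is an integer count of 1, the same workload gives `cpuUsagePercent(0.25, 1.0, 1.0)`, which is 25.0, so usage is under-reported by half.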

gjdeval commented 1 year ago

It is also possible for cgroups to be configured with both v1 and v2 elements. Liberty's implementation should be able to handle this case, which is apparently common in AKS 1.25.
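One way to cope with mixed setups is to probe each layout in turn rather than decide on a single version up front. The sketch below assumes the v2 limit lives in `cpu.max` directly under the cgroup root and the v1 limit in `cpu/cpu.cfs_quota_us` and `cpu/cpu.cfs_period_us`. Real hybrid mounts may place the v2 hierarchy elsewhere, so the root is passed in as a parameter. All names here are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical probe, not part of Liberty: tries the v2 file first, then the v1 files.
final class CgroupCpuLimitProbe {

    /** Returns the fractional cpu limit found under cgroupRoot, or -1 if none is found. */
    static float readCpuLimit(Path cgroupRoot) {
        float limit = readV2(cgroupRoot.resolve("cpu.max"));
        if (limit > 0) {
            return limit;
        }
        Path cpuDir = cgroupRoot.resolve("cpu");
        return readV1(cpuDir.resolve("cpu.cfs_quota_us"), cpuDir.resolve("cpu.cfs_period_us"));
    }

    // v2: a single file holding "<quota> <period>", where quota may be "max".
    private static float readV2(Path cpuMax) {
        try {
            String[] parts = readTrimmed(cpuMax).split("\\s+");
            if (parts.length == 2 && !"max".equals(parts[0])) {
                return ratio(Long.parseLong(parts[0]), Long.parseLong(parts[1]));
            }
        } catch (IOException | NumberFormatException e) {
            // file missing, unreadable, or malformed: treat as "no limit found"
        }
        return -1f;
    }

    // v1: quota and period in separate files; a quota of -1 means unlimited.
    private static float readV1(Path quotaFile, Path periodFile) {
        try {
            return ratio(Long.parseLong(readTrimmed(quotaFile)), Long.parseLong(readTrimmed(periodFile)));
        } catch (IOException | NumberFormatException e) {
            return -1f;
        }
    }

    private static String readTrimmed(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    }

    private static float ratio(long quota, long period) {
        return (quota > 0 && period > 0) ? (float) quota / period : -1f;
    }
}
```

Falling through to the v1 files when the v2 file is absent or reports no limit keeps the existing behavior intact on v1-only systems. Whether v2 should take precedence when both are present is a design choice this sketch makes arbitrarily.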

NottyCode commented 1 year ago

I see this is in the Open Liberty roadmap for triage, but it is also a release bug, and release bugs don't usually get prioritized. Should it be prioritized or removed from the roadmap?

tevans78 commented 1 year ago

Discussed this with Gary. This issue is just a bug report. Will leave it to Chuck to decide if a feature is required to fix it.

tbitonti commented 1 year ago

Refs:
"cgroups v2": see https://docs.kernel.org/admin-guide/cgroup-v2.html
"CpuInfo.java": see https://github.com/OpenLiberty/open-liberty/blob/integration/dev/com.ibm.ws.kernel.service/src/com/ibm/ws/kernel/service/util/CpuInfo.java

tbitonti commented 1 year ago

The relevant code seems to be from CpuInfo:

    // utility below parses cpu limits info from Docker files
    private static float getAvailableProcessorsFromFilesystemFloat() {
        boolean isTraceOn = TraceComponent.isAnyTracingEnabled();

        float availableProcessorsFloat = -1;

        //Check for docker files
        String periodFileLocation = File.separator + "sys" + File.separator + "fs" + File.separator + "cgroup" + File.separator + "cpu" + File.separator + "cpu.cfs_period_us";
        String quotaFileLocation = File.separator + "sys" + File.separator + "fs" + File.separator + "cgroup" + File.separator + "cpu" + File.separator + "cpu.cfs_quota_us";
        File cfsPeriod = new File(periodFileLocation);
        File cfsQuota = new File(quotaFileLocation);
        if (cfsPeriod.exists() && cfsQuota.exists()) { //Found docker files

tbitonti commented 1 year ago

How do I know if my cgroup is v1 or v2? If /sys/fs/cgroup/cgroup.controllers is present on your system, you are using v2; otherwise you are using v1. The following distributions are known to use cgroup v2 by default: Fedora (since 31)
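The check quoted above could be sketched in Java as follows. The class and method names are hypothetical, and the cgroup root is a parameter so the check can be pointed at `/sys/fs/cgroup` or at a test directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Liberty: applies the cgroup.controllers rule of thumb.
final class CgroupVersionCheck {

    /** Returns true when cgroup.controllers exists under the given cgroup root. */
    static boolean isCgroupV2(Path cgroupRoot) {
        return Files.exists(cgroupRoot.resolve("cgroup.controllers"));
    }
}
```

A caller would typically pass `Paths.get("/sys/fs/cgroup")`. This only reports that a v2 hierarchy is mounted at that root. It does not by itself say which hierarchy the cpu controller is attached to.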

tbitonti commented 1 year ago

What is the difference between cgroups 1 and 2? From "Understanding the new cgroups v2 API" by Rami Rosen: In cgroups v1, a process can belong to many subgroups, if those subgroups are in different hierarchies with different controllers attached. But, because belonging to more than one subgroup made it difficult to disambiguate subgroup membership, in cgroups v2, a process can belong only to a single subgroup.

tbitonti commented 1 year ago

Ref to an OpenJ9 fix for this change: https://github.com/eclipse/omr/pull/6432