What does this PR do?
This PR fixes a bug in the cgroup module of process metrics collection that occurs when both cgroup V1 and V2 controllers exist but `/sys/fs/cgroup/unified` is not mounted.
This can happen on Kubernetes nodes where systemd is configured to use the cgroup V1 hierarchy but a service mounts the cgroup V2 hierarchy for its own use (Cilium does this, for example). In that case, the `proc` tree contains a reference to an empty cgroup V2 controller (`0::/`), which confuses metricbeat and makes it report the following error:
`PID XXX contains a cgroups V2 path (0::/) but no V2 mountpoint was found.\nThis may be because metricbeat is running inside a container on a hybrid system.\nTo monitor cgroups V2 processess in this way, mount the unified (V2) hierarchy inside\nthe container as /sys/fs/cgroup/unified and start the system module with the hostfs setting.`
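For readers unfamiliar with where the `0::/` entry comes from: each line of `/proc/<pid>/cgroup` has the form `hierarchy-ID:controller-list:path`, and the unified (V2) hierarchy is written with ID `0` and an empty controller list. The sketch below shows how such a file can be split into V1 and V2 entries. It is only an illustration; the type `cgroupEntry` and the function `parseCgroupFile` are hypothetical names and are not taken from this repository.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cgroupEntry is a hypothetical type holding one line of /proc/<pid>/cgroup.
type cgroupEntry struct {
	HierarchyID string   // "0" for the cgroup V2 (unified) hierarchy
	Controllers []string // empty for a V2 entry
	Path        string   // cgroup path relative to the hierarchy root
}

// IsV2 reports whether the entry belongs to the unified hierarchy,
// which appears as "0::<path>".
func (e cgroupEntry) IsV2() bool {
	return e.HierarchyID == "0" && len(e.Controllers) == 0
}

// parseCgroupFile parses the content of a /proc/<pid>/cgroup file.
func parseCgroupFile(content string) ([]cgroupEntry, error) {
	var entries []cgroupEntry
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		// The path may itself contain ':', so split into at most 3 fields.
		fields := strings.SplitN(line, ":", 3)
		if len(fields) != 3 {
			return nil, fmt.Errorf("malformed cgroup line %q", line)
		}
		var controllers []string
		if fields[1] != "" {
			controllers = strings.Split(fields[1], ",")
		}
		entries = append(entries, cgroupEntry{
			HierarchyID: fields[0],
			Controllers: controllers,
			Path:        fields[2],
		})
	}
	return entries, sc.Err()
}
```

On a node in the state described above, the file would contain the usual V1 lines plus a single `0::/` line, and only that last entry would report `IsV2() == true`.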
Why is it important?
This fixes gathering process metrics on Amazon Kubernetes nodes running Cilium.
Checklist
[x] My code follows the style guidelines of this project
[x] I have commented my code, particularly in hard-to-understand areas
[x] I have added tests that prove my fix is effective or that my feature works
[x] I have added an entry in CHANGELOG.md
Author's Checklist
This PR is split into 2 commits:
- a test case that reproduces the issue with the existing code
- a logic change to fix the issue
Running tests with only the first commit leads to this error:
? github.com/elastic/elastic-agent-system-metrics/metric/system/cpu [no test files]
--- FAIL: TestReaderGetStatsV1MalformedHybrid (0.00s)
reader_test.go:92:
Error Trace: reader_test.go:92
Error: Received unexpected error:
error fetching cgroupV2 controllers for cgroup location '' and path line '0::/': open testdata/amzn2/sys/fs/cgroup/unified: no such file or directory
Test: TestReaderGetStatsV1MalformedHybrid
Messages: error in GetV1StatsForProcess
--- FAIL: TestProcessCgroupHybridPaths (0.00s)
util_test.go:212: error in ProcessCgroupPaths: error fetching cgroupV2 controllers for cgroup location '' and path line '0::/': open testdata/amzn2/sys/fs/cgroup/unified: no such file or directory
FAIL
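The failures above come from opening a `unified` directory that does not exist. One plausible way to tolerate that situation is to treat a missing directory as "V2 hierarchy not mounted" instead of returning an error. The sketch below illustrates that idea under stated assumptions: `unifiedMounted` is a hypothetical helper, and this is not necessarily how the second commit implements the fix.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// unifiedMounted is a hypothetical helper that reports whether dir
// (for example <hostfs>/sys/fs/cgroup/unified) exists and is a directory.
// A missing path is reported as (false, nil) so that callers can skip
// cgroup V2 processing instead of failing.
func unifiedMounted(dir string) (bool, error) {
	info, err := os.Stat(dir)
	if err == nil {
		return info.IsDir(), nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		// Not mounted: the "0::/" entry cannot be resolved, which is
		// expected on the hybrid setups described in this PR.
		return false, nil
	}
	// Any other failure (permissions, I/O) is still surfaced to the caller.
	return false, fmt.Errorf("checking cgroup V2 mountpoint %s: %w", dir, err)
}
```

A caller could check this result before resolving a `0::/` entry, and fall back to V1-only statistics when it returns `false`.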
Related issues
Relates to https://github.com/elastic/beats/issues/37211