elastic / elastic-agent-system-metrics

Apache License 2.0

Fix process metrics collection when both cgroup V1 and V2 controllers exist #120

Closed ydubreuil closed 7 months ago

ydubreuil commented 7 months ago

What does this PR do?

This PR fixes a bug in the cgroup module of process metrics collection that occurs when both cgroup V1 and V2 controllers exist but /sys/fs/cgroup/unified is not mounted.

This can happen on Kubernetes nodes where systemd is configured to use the cgroup V1 hierarchy but a service mounts the cgroup V2 hierarchy for its own use (Cilium does this, for example). In that case, the proc tree contains a reference to an empty cgroup V2 controller, which confuses metricbeat and makes it report the following error: "PID XXX contains a cgroups V2 path (0::/) but no V2 mountpoint was found.\nThis may be because metricbeat is running inside a container on a hybrid system.\nTo monitor cgroups V2 processess in this way, mount the unified (V2) hierarchy inside\nthe container as /sys/fs/cgroup/unified and start the system module with the hostfs setting."
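To make the failure mode concrete, here is a minimal Go sketch of how a /proc/<pid>/cgroup parser could tolerate the empty V2 entry. It is not the code from this PR: the names parseProcCgroup, cgroupPaths and dirExists are hypothetical, and the rule it applies (skip a "0::/" line when no V2 mountpoint is available, but still fail for a non-root V2 path) is an assumption about one reasonable behavior.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

// cgroupPaths is a hypothetical result type for one process.
type cgroupPaths struct {
	V1 map[string]string // V1 controller name -> cgroup path
	V2 string            // unified (V2) path, empty if absent or skipped
}

// parseProcCgroup parses the content of /proc/<pid>/cgroup.
// hasV2Mount tells the parser whether a cgroup V2 mountpoint was found.
func parseProcCgroup(content string, hasV2Mount bool) (cgroupPaths, error) {
	out := cgroupPaths{V1: map[string]string{}}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		// Each line has the form "hierarchy-ID:controller-list:path".
		fields := strings.SplitN(line, ":", 3)
		if len(fields) != 3 {
			return out, fmt.Errorf("malformed cgroup line %q", line)
		}
		// A V2 entry has hierarchy ID 0 and an empty controller list.
		if fields[0] == "0" && fields[1] == "" {
			switch {
			case hasV2Mount:
				out.V2 = fields[2]
			case fields[2] == "/":
				// Assumption: a root V2 path with no mountpoint carries
				// no usable information, so it is skipped, not an error.
			default:
				return out, fmt.Errorf("V2 path %q but no V2 mountpoint", fields[2])
			}
			continue
		}
		// A V1 line may list several comma-separated controllers.
		for _, controller := range strings.Split(fields[1], ",") {
			out.V1[controller] = fields[2]
		}
	}
	return out, sc.Err()
}

// dirExists reports whether path exists and is a directory.
func dirExists(path string) (bool, error) {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return info.IsDir(), nil
}

func main() {
	// Sample content shaped like the hybrid case described above.
	sample := "11:memory:/kubepods/pod1\n4:cpu,cpuacct:/kubepods/pod1\n0::/\n"
	hasV2, err := dirExists("/sys/fs/cgroup/unified")
	if err != nil {
		fmt.Println("stat error:", err)
		return
	}
	paths, err := parseProcCgroup(sample, hasV2)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("V1=%v V2=%q\n", paths.V1, paths.V2)
}
```

In this sketch the V1 controllers are still collected when the unified hierarchy is missing, and a non-root V2 path without a mountpoint still surfaces as an error, so the hint about mounting /sys/fs/cgroup/unified remains relevant for real V2 processes.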

Why is it important?

This fixes gathering process metrics on Amazon Kubernetes nodes running Cilium.

Checklist

Author's Checklist

This PR is split into 2 commits:

Running tests with only the first commit leads to this error:

?       github.com/elastic/elastic-agent-system-metrics/metric/system/cpu   [no test files]
--- FAIL: TestReaderGetStatsV1MalformedHybrid (0.00s)
    reader_test.go:92: 
            Error Trace:    reader_test.go:92
            Error:          Received unexpected error:
                            error fetching cgroupV2 controllers for cgroup location '' and path line '0::/': open testdata/amzn2/sys/fs/cgroup/unified: no such file or directory
            Test:           TestReaderGetStatsV1MalformedHybrid
            Messages:       error in GetV1StatsForProcess
--- FAIL: TestProcessCgroupHybridPaths (0.00s)
    util_test.go:212: error in ProcessCgroupPaths: error fetching cgroupV2 controllers for cgroup location '' and path line '0::/': open testdata/amzn2/sys/fs/cgroup/unified: no such file or directory
FAIL

Related issues

Relates to https://github.com/elastic/beats/issues/37211