elastic / elastic-agent-system-metrics

Apache License 2.0
0 stars 22 forks source link

Fix behavior of cgroups path discovery #136

Closed fearful-symmetry closed 2 months ago

fearful-symmetry commented 3 months ago

What does this PR do?

Fixes https://github.com/elastic/elastic-agent-system-metrics/issues/132 and fixes https://github.com/elastic/elastic-agent-system-metrics/issues/139

This sanitizes the "relative" namespace mounts that we get when we monitor the host system from within newer versions of docker.

This also adds much more complex logic for setting the v2 rootpath we use for fetching metrics from v2 cgroups paths.

This also aggressively cleans up the unit tests in cgroup because they were a mess

With this change, cgroups metric collection works properly on docker configs where where the container is using a private namespace. This value is configurable from docker 1.41+

Checklist

How to test this PR

fearful-symmetry commented 2 months ago

Decided to just fix https://github.com/elastic/elastic-agent-system-metrics/issues/139 while I'm at it, so I can at least run the tests.

fearful-symmetry commented 2 months ago

Okay cleaning up the unit tests is harder than I thought

fearful-symmetry commented 2 months ago

Alright, the fix is still kind of broken for cases where we're trying to monitor a process on the host system that doesn't belong to a root cgroup. I think we need to use the host paths to find the actual cgroup on the host system, and then construct the bath inside hostfs from within the container.

fearful-symmetry commented 2 months ago

So, tests are finally passing. One thing I'd like to add is some kind of manual bypass/setting for ContainerizedRootMount, as a sort of failsafe, in case there's some weird edge case I haven't been able to tease out.