coroot / coroot-node-agent

A Prometheus exporter based on eBPF that gathers comprehensive container metrics
https://coroot.com/docs/metrics/node-agent
Apache License 2.0

add support for containers with multiple network namespaces #20

Closed def closed 1 year ago

def commented 1 year ago

This PR fixes network metrics for containers whose processes run in different network namespaces. The cilium-agent container is one example. Both cilium-agent and cilium-health-responder run in the same container:

root       14629  1.7  2.4 873516 96280 ?        Ssl  May31  19:11 cilium-agent --config-dir=/tmp/cilium/config-map
root       15134  0.0  0.0 715232  2492 ?        Sl   May31   0:03 cilium-health-responder --listen 4240 --pidfile /var/run/cilium/state/health-endpoint.pid

The cgroup entries for the two processes are identical:

$ cat /proc/14629/cgroup
0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod54248030_f758_4562_927b_945cbcf29fb4.slice/cri-containerd-597e1d658d05ab68f4e43788271ccc1c87d306e1cb671f1bf691a60148834461.scope

$ cat /proc/15134/cgroup
0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod54248030_f758_4562_927b_945cbcf29fb4.slice/cri-containerd-597e1d658d05ab68f4e43788271ccc1c87d306e1cb671f1bf691a60148834461.scope

However, these processes run in different network namespaces:

$ ls -la /proc/14629/ns/net
lrwxrwxrwx 1 root root 0 Jun  1 08:33 /proc/14629/ns/net -> 'net:[4026531840]'

$ ls -la /proc/15134/ns/net
lrwxrwxrwx 1 root root 0 May 31 15:37 /proc/15134/ns/net -> 'net:[4026532538]'
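
The check above can be automated. The following is a minimal Go sketch, not the code from this PR: it reads the `/proc/<pid>/ns/net` symlink for each PID and groups the PIDs by network namespace inode. The names `parseNsLink`, `netNsInode`, and `groupByNetNs` are hypothetical and chosen only for illustration. It assumes a Linux host with `/proc` mounted and enough privileges to read the links.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNsLink extracts the inode number from a namespace symlink target
// such as "net:[4026531840]". (Hypothetical helper, not part of the agent.)
func parseNsLink(target string) (uint64, error) {
	open := strings.IndexByte(target, '[')
	if open < 0 || !strings.HasSuffix(target, "]") {
		return 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	return strconv.ParseUint(target[open+1:len(target)-1], 10, 64)
}

// netNsInode returns the network namespace inode of a process by
// resolving /proc/<pid>/ns/net.
func netNsInode(pid int) (uint64, error) {
	target, err := os.Readlink(fmt.Sprintf("/proc/%d/ns/net", pid))
	if err != nil {
		return 0, err
	}
	return parseNsLink(target)
}

// groupByNetNs groups PIDs by network namespace inode. PIDs whose lookup
// fails (for example, because the process has exited) are skipped.
func groupByNetNs(pids []int, lookup func(int) (uint64, error)) map[uint64][]int {
	groups := make(map[uint64][]int)
	for _, pid := range pids {
		ns, err := lookup(pid)
		if err != nil {
			continue
		}
		groups[ns] = append(groups[ns], pid)
	}
	return groups
}

func main() {
	// PIDs are taken from the command line.
	var pids []int
	for _, arg := range os.Args[1:] {
		pid, err := strconv.Atoi(arg)
		if err != nil {
			fmt.Fprintf(os.Stderr, "skipping %q: %v\n", arg, err)
			continue
		}
		pids = append(pids, pid)
	}
	for ns, members := range groupByNetNs(pids, netNsInode) {
		fmt.Printf("net:[%d] -> pids %v\n", ns, members)
	}
}
```

Run with the two PIDs from the example (`go run main.go 14629 15134`), a container like this one would yield two groups, one per namespace, even though both processes share a single cgroup.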