google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

container_network_* metrics reported for pod-networking pods are confusing #3393

Open nrobert13 opened 1 year ago

nrobert13 commented 1 year ago

I'm running Kubernetes v1.26.6 with Calico in VXLAN mode. Looking at the cAdvisor metrics, I noticed that for the container_network_* metrics of pods running with pod networking (not host networking), additional label combinations are being reported. For instance:

Filtering the metrics by a single cali* interface (every pod with pod networking has a veth interface named cali*), I get the following:

PromQL:

container_network_receive_packets_total{node="worker-644d7bdb55-xcrhj", interface=~"cali248950e0d8c"}

reports:

container_network_receive_packets_total{endpoint="cadvisor", id="/", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", node="worker-644d7bdb55-xcrhj", service="prometheus-kubelet"} | 239977
container_network_receive_packets_total{endpoint="cadvisor", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod67c629a2_d18f_451a_9d89_f65758e13e12.slice/cri-containerd-9e6914890be99324e095c414b7400d6462b89be1b5636e56e228e8f4747f8fb5.scope", image="registry.k8s.io/pause:3.6", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", name="9e6914890be99324e095c414b7400d6462b89be1b5636e56e228e8f4747f8fb5", namespace="monitoring", node="worker-644d7bdb55-xcrhj", pod="node-exporter-nnbtg", service="prometheus-kubelet"} | 240008
container_network_receive_packets_total{endpoint="cadvisor", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod84da0379_d8c6_4f55_91c6_19c93cccf317.slice/cri-containerd-80c8f738f3172634e5bf1eca12f82263f4a29d60ca6f37170837f65e64aacfab.scope", image="registry.k8s.io/pause:3.6", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", name="80c8f738f3172634e5bf1eca12f82263f4a29d60ca6f37170837f65e64aacfab", namespace="monitoring", node="worker-644d7bdb55-xcrhj", pod="conntrack-exporter-7b2kg", service="prometheus-kubelet"} | 239977
container_network_receive_packets_total{endpoint="cadvisor", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod85e8fcf0_1851_4354_9e9c_162eb3035f1b.slice/cri-containerd-3895537a5be0a99bb30e5cea0d40a84a36bccecf0d0c53bdb0e0c82f491ffec6.scope", image="registry.k8s.io/pause:3.6", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", name="3895537a5be0a99bb30e5cea0d40a84a36bccecf0d0c53bdb0e0c82f491ffec6", namespace="kube-system", node="worker-644d7bdb55-xcrhj", pod="kube-proxy-mvxt2", service="prometheus-kubelet"} | 240002
container_network_receive_packets_total{endpoint="cadvisor", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod9e7ed417_4350_46d6_83c2_7f90ea235322.slice/cri-containerd-16b4ac53937bbc8e55c60cc558163829437b2c121ef65404ea63dcfec2dc693a.scope", image="registry.k8s.io/pause:3.6", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", name="16b4ac53937bbc8e55c60cc558163829437b2c121ef65404ea63dcfec2dc693a", namespace="kube-system", node="worker-644d7bdb55-xcrhj", pod="csi-cinder-nodeplugin-fv9dm", service="prometheus-kubelet"} | 239955
container_network_receive_packets_total{endpoint="cadvisor", id="/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod426cbd97_c85d_43e4_8cc4_ca66706bd740.slice/cri-containerd-cd828946df09db9c5603dce122d0820e6e532053befb1a702244782dedc009a7.scope", image="registry.k8s.io/pause:3.6", instance="192.168.0.100:4194", interface="cali248950e0d8c", job="cadvisor", name="cd828946df09db9c5603dce122d0820e6e532053befb1a702244782dedc009a7", namespace="kube-system", node="worker-644d7bdb55-xcrhj", pod="calico-node-fqkwq", service="prometheus-kubelet"} | 239949

Notice that the metric is for interface cali248950e0d8c, yet it contains labels for the cgroups of several different pods: calico-node-fqkwq, csi-cinder-nodeplugin-fv9dm, kube-proxy-mvxt2, conntrack-exporter-7b2kg, and node-exporter-nnbtg. All of these pods run with host networking, and they are reported for every cali* interface.

I'm confused by this behavior and not sure whether my understanding is wrong or this is actually a bug.

I would have expected one counter per pod with pod networking, without these additional series for the host-networking pods mentioned above. Thanks in advance for any input.
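
For reference, a quick way to see how many series a single cali* interface currently produces on this node (a PromQL sketch, reusing the node label from the query above):

PromQL:

count by (interface) (container_network_receive_packets_total{node="worker-644d7bdb55-xcrhj", interface=~"cali.*"})

For the interface above this returns 6 series (one with id="/" plus one per host-networking pod), and the count grows with every additional host-networking pod on the node.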

sidewinder12s commented 8 months ago

I think we just discovered this for the AWS VPC CNI as well, and it is exploding the cardinality of the network metrics. The duplicated interface labels do appear only on pods with host networking enabled. But it's unclear to me which interface is the correct one, if not the first interface, which at least on Amazon Linux 2 is eth0.
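
For anyone else hitting the cardinality blowup, one stopgap we have been considering is dropping the duplicated per-pod series at scrape time rather than in queries. A rough sketch against a plain Prometheus scrape config (metricRelabelings if you scrape via a ServiceMonitor), assuming Calico's cali* naming; the interface regex would need adjusting for other CNIs' host-side veth names:

metric_relabel_configs:
  # drop container_network_* series that sit under a /kubepods... cgroup but
  # report a host-side cali* veth, i.e. the host-networking pods' duplicates
  - source_labels: [__name__, id, interface]
    regex: 'container_network_.*;/kubepods.*;cali.*'
    action: drop

This keeps the id="/" root-cgroup series and the pod-networking pods' own eth0 series, but it only trims the cali* duplicates; host-networking pods still report every other host interface, so it reduces the cardinality rather than eliminating it.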