cAdvisor giving the metrics twice

Swetad90 commented 5 years ago

We are using the version below, with the cAdvisor that is built into the kubelet. Our kubelet is running as a systemd process.

Kubernetes: 1.13

When we run any query against cAdvisor, we get metrics in this form:

container_last_seen{cluster="staging-1",container_name="etcd",image="sha256:28c771d7cfbf436cc2471523350d58a75a4c28a7e8684b1dd54b7e8ba321f84b",instance="staging-1",job="kubernetes-cadvisor",namespace="kube-system"}

container_last_seen{cluster="staging-1",image="quay.io/coreos/etcd:v3.2.24",instance="staging-1",job="kubernetes-cadvisor"}

container_last_seen{cluster="staging-1",image="quay.io/coreos/etcd:v3.2.24",instance="staging-2",job="kubernetes-cadvisor"}

container_last_seen{cluster="staging-1",image="quay.io/coreos/etcd:v3.2.24",instance="staging-3",job="kubernetes-cadvisor"}

Could anyone tell me why cAdvisor is giving the metrics in this format?

dashpole commented 5 years ago

Can you include the complete set of labels? They should have pod labels as well.

Swetad90 commented 5 years ago

These are all the labels the metric shows. Nothing else.

Is there any other metric you would like to see?

dashpole commented 5 years ago

cAdvisor also doesn't add a "job" label. This looks like metrics after the Prometheus server has done some relabeling. Can you directly query the cAdvisor endpoint (localhost:10255/metrics/cadvisor) on the node, and paste the output?
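For reference, a minimal sketch of pulling the raw series from that endpoint so they can be compared with what Prometheus shows after relabeling. It assumes the kubelet read-only port mentioned above is reachable from the node; `parse_sample` and `fetch_samples` are hypothetical helper names, not part of cAdvisor or Prometheus.

```python
import re
import urllib.request

# Assumption: the kubelet read-only endpoint named in the comment above.
CADVISOR_URL = "http://localhost:10255/metrics/cadvisor"

# Matches key="value" pairs, allowing backslash-escaped characters in the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one exposition-format line into (metric_name, labels).

    Returns None for blank lines and for '#' comment lines (HELP/TYPE).
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    name, _, rest = line.partition("{")
    if not rest:
        # No label set at all, e.g. "some_metric 1".
        return name.split()[0], {}
    labels_text = rest.rsplit("}", 1)[0]
    return name, dict(LABEL_RE.findall(labels_text))


def fetch_samples(url=CADVISOR_URL, metric="container_last_seen"):
    """Fetch the endpoint and return the parsed samples for one metric name."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        text = resp.read().decode("utf-8")
    parsed = (parse_sample(line) for line in text.splitlines())
    return [sample for sample in parsed if sample and sample[0] == metric]
```

Calling `fetch_samples()` on the node and printing each label set shows the labels exactly as cAdvisor emits them, before any `job`, `cluster`, or `instance` labels are attached by the Prometheus configuration.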

Swetad90 commented 5 years ago

I added the "id" label back to the metrics. Now I think I know what those additional metrics were for:

container_cpu_usage_seconds_total{cluster="staging-1",cpu="total",id="/docker/d5e5c6db5cfef95eb16d731a23451b8210e80d04e7624914ae15e491f0aad2a8",image="quay.io/coreos/etcd:v3.2.24",instance="staging-1",job="kubernetes-cadvisor",name="etcd2"}

container_cpu_usage_seconds_total{cluster="staging-1",cpu="total",id="/docker/f38afc889f4785a392d4d0a5511965cee23eee1aa7ddc75a2aae0686719694f2",image="quay.io/coreos/etcd:v3.2.24",instance="staging-1",job="kubernetes-cadvisor",name="etcd3"}

container_cpu_usage_seconds_total{cluster="staging-1",cpu="total",id="/kubepods",instance="staging-1",job="kubernetes-cadvisor"}

It is also giving metrics with ids from system services, e.g. /system.slice/rng-tools.service. Is there any way to filter these out in cAdvisor, or do we have to handle this on the Prometheus end?
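A rough sketch of how such series could be separated on the consumer side, going only by the `id` prefixes visible in this thread (`/kubepods`, `/docker/`, `/system.slice`). The prefix-to-category mapping is an assumption that depends on the cgroup driver in use, and `classify` and `keep_containers` are hypothetical helpers. On the Prometheus side, a selector such as `{image!=""}` is a common way to keep only series that belong to an actual container.

```python
def classify(labels):
    """Guess where a series comes from based on its cgroup `id` label.

    Assumption: prefixes as seen in this thread; other cgroup layouts
    (for example the systemd cgroup driver) may use different paths.
    """
    cgroup_id = labels.get("id", "")
    if cgroup_id.startswith("/kubepods"):
        return "kubernetes"
    if cgroup_id.startswith("/docker/"):
        return "docker"
    if cgroup_id.startswith("/system.slice"):
        return "system"
    return "other"


def keep_containers(series):
    """Keep only label sets that describe an actual container.

    Aggregate cgroups such as "/kubepods" and system services carry no
    image label, so requiring a non-empty image drops them.
    """
    return [
        labels
        for labels in series
        if classify(labels) in ("kubernetes", "docker") and labels.get("image")
    ]


if __name__ == "__main__":
    sample = [
        {"id": "/docker/d5e5c6db5cfe", "image": "quay.io/coreos/etcd:v3.2.24", "name": "etcd2"},
        {"id": "/kubepods"},
        {"id": "/system.slice/rng-tools.service"},
    ]
    for labels in keep_containers(sample):
        print(classify(labels), labels["id"])
```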

dashpole commented 5 years ago

The system services shouldn't end up looking like what you were seeing in #1, since they shouldn't have an image field. Are you sure that is where they were coming from?

Swetad90 commented 5 years ago

Apologies. What I meant was that, along with the container metrics, I am getting the system metrics as well.

Here is an example for etcd:

container_last_seen{cluster="staging-1",container_name="etcd",id="/kubepods/besteffort/poddc2258bd-7ae6-11e9-b039-005056bcc0bd/567873cfb06a48a87ade1db75bac7d492d5639860affa7c17c38f37497e470b8",image="sha256:28c771d7cfbf436cc2471523350d58a75a4c28a7e8684b1dd54b7e8ba321f84b",instance="staging-1-node1",job="kubernetes-cadvisor",name="k8s_etcd_kube-system-79447d6b94-v7tzg_kube-system_dc2258bd-7ae6-11e9-b039-005056bcc0bd_1",namespace="kube-system",pod_name="trident-79447d6b94-v7tzg"}

container_last_seen{cluster="staging-1",id="/docker/1e2ab4c6c9a5c59b3f94b178608bb4cd6d3a88a839d3e949d8ece511578544d8",image="quay.io/coreos/etcd:v3.2.24",instance="staging-1",job="kubernetes-cadvisor",name="etcd1"}

container_last_seen{cluster="staging-1",id="/docker/d5e5c6db5cfef95eb16d731a23451b8210e80d04e7624914ae15e491f0aad2a8",image="quay.io/coreos/etcd:v3.2.24",instance="staging-2",job="kubernetes-cadvisor",name="etcd2"}

container_last_seen{cluster="staging-1",id="/docker/f38afc889f4785a392d4d0a5511965cee23eee1aa7ddc75a2aae0686719694f2",image="quay.io/coreos/etcd:v3.2.24",instance="staging-3",job="kubernetes-cadvisor",name="etcd3"}

This is how the metrics for etcd come back. I get metrics for the 3 containers, one on each node, and then I get one more for the pod.

dashpole commented 5 years ago

OK, that was my suspicion. Feel free to reopen if you have further questions.

Swetad90 commented 5 years ago

But why do I get one for an etcd pod when there is none? I just have 3 etcd containers running outside the cluster.

dashpole commented 5 years ago

How are you running those containers? Just with docker?

google / cadvisor

cAdvisor giving the metrics twice #2272