kubernetes / kube-state-metrics

Add-on agent to generate and expose cluster-level metrics.
https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/
Apache License 2.0

Some metrics are not available for all namespaces. #2558

Open lazyboson opened 3 days ago

lazyboson commented 3 days ago

What happened:

I have deployed kube-state-metrics on an EKS cluster that runs multiple pods across multiple namespaces. We are seeing that a few metrics are not scraped for all namespaces.

Example metric: kube_pod_container_resource_requests

kube_pod_container_resource_requests{namespace="kube-system",pod="aws-node-vvhxn",uid="0c8bfb9b-6c1e-40de-bd9c-bde0c17631f8",container="aws-node",node="ip-20-2-69-154.ec2.internal",resource="cpu",unit="core"} 0.025
kube_pod_container_resource_requests{namespace="otel",pod="otelcollector-opentelemetry-collector-agent-fzlz7",uid="af0a329a-d080-4ab5-8bab-88b292afafa9",container="opentelemetry-collector",node="ip-20-1-55-132.ec2.internal",resource="cpu",unit="core"} 0.25

kube_pod_container_resource_requests{namespace="ivr",pod="context-6cf8bf9b7c-mhn8f",uid="623148e3-8516-40f0-8050-4ee8c2050c3d",container="context",node="ip-20-2-69-154.ec2.internal",resource="cpu",unit="core"} 0.2

kube_pod_container_resource_requests{namespace="ivr",pod="converter-6b555f6f78-ssbvk",uid="3dc683e4-1da8-45e7-850a-7d8bc009371e",container="converter",node="ip-20-2-69-154.ec2.internal",resource="cpu",unit="core"} 0.25

kube_pod_container_resource_requests{namespace="ivr",pod="converter-6b555f6f78-ssbvk",uid="3dc683e4-1da8-45e7-850a-7d8bc009371e",container="converter",node="ip-20-2-69-154.ec2.internal",resource="memory",unit="byte"} 2.68435456e+08

kube_pod_container_resource_requests{namespace="ivr",pod="crm-5599bc6f7d-b9wc4",uid="fb443c98-1e0a-4bc8-9bd3-f02e667afc1b",container="crm",node="ip-20-3-156-43.ec2.internal",resource="cpu",unit="core"} 0.1
kube_pod_container_resource_requests{namespace="ivr",pod="crm-5599bc6f7d-b9wc4",uid="fb443c98-1e0a-4bc8-9bd3-f02e667afc1b",container="crm",node="ip-20-3-156-43.ec2.internal",resource="memory",unit="byte"} 2.68435456e+08
kube_pod_container_resource_requests{namespace="ivr",pod="event-7c4bbcb478-hpkdd",uid="b8605050-640c-482b-a040-b08f0665f713",container="event",node="ip-20-3-156-43.ec2.internal",resource="cpu",unit="core"} 0.2
kube_pod_container_resource_requests{namespace="ivr",pod="event-7c4bbcb478-hpkdd",uid="b8605050-640c-482b-a040-b08f0665f713",container="event",node="ip-20-3-156-43.ec2.internal",resource="memory",unit="byte"} 5.36870912e+08

This metric is available for only 5 namespaces, whereas the cluster has 19 namespaces and all of them have pods running.

To show that pods are running in the other namespaces, here is the kube_pod_status_phase output:

kube_pod_status_phase{namespace="kube-system",pod="ebs-csi-node-dfdhc",uid="f7956ef1-9d49-4679-b5c6-5e634537e972",phase="Running"} 1
kube_pod_status_phase{namespace="otel",pod="otelcollector-opentelemetry-collector-agent-l7nt9",uid="c90476e3-e677-44d9-a086-3a5d5781b11d",phase="Running"} 1
kube_pod_status_phase{namespace="cert-manager",pod="cert-manager-cainjector-54f964d4b7-nf2jv",uid="dcc5c4ff-ee61-43ca-a00d-3d32a44e2d0c",phase="Running"} 1
kube_pod_status_phase{namespace="bot-va",pod="va-app-6d9654584b-s6g7v",uid="ec10e979-5394-43d2-9ecd-dc29377f6dfa",phase="Running"} 1
kube_pod_status_phase{namespace="eqa",pod="ams-deployment-676464f985-lrzx7",uid="fc3e1422-5a91-4615-a2c9-f379b96b9e1f",phase="Running"} 1
kube_pod_status_phase{namespace="logging",pod="fluent-bit-rbsdt",uid="d7dcc888-46fa-4f04-86a9-bd4639deeda5",phase="Running"} 1
kube_pod_status_phase{namespace="queue",pod="queueserver-7457848c88-2cwwf",uid="0e9ddaa3-ab59-42c5-ab4b-869049d1e1c9",phase="Running"} 1
kube_pod_status_phase{namespace="voice",pod="appserver-5b6cd65969-rzfcj",uid="f778f355-4616-4ce5-9a6e-bd7da6072c02",phase="Running"} 1

kube_pod_status_phase{namespace="eqa",pod="analytics-deployment-58df6b6847-rffzm",uid="3510d60d-0168-49c9-b66c-033fdb3151d9",phase="Running"} 1

kube_pod_status_phase{namespace="eqa",pod="user-session-event-processor-deployment-8c5b7ccc6-54p4f",uid="2f87c25e-d6b2-449c-b97f-17a02a0b7c29",phase="Succeeded"} 0
kube_pod_status_phase{namespace="eqa",pod="user-session-event-processor-deployment-8c5b7ccc6-54p4f",uid="2f87c25e-d6b2-449c-b97f-17a02a0b7c29",phase="Failed"} 0
kube_pod_status_phase{namespace="eqa",pod="user-session-event-processor-deployment-8c5b7ccc6-54p4f",uid="2f87c25e-d6b2-449c-b97f-17a02a0b7c29",phase="Unknown"} 0
kube_pod_status_phase{namespace="eqa",pod="user-session-event-processor-deployment-8c5b7ccc6-54p4f",uid="2f87c25e-d6b2-449c-b97f-17a02a0b7c29",phase="Running"} 1

kube_pod_status_phase{namespace="kube-system",pod="aws-load-balancer-controller-676b77c78f-cqql2",uid="535a1aab-d385-4b16-bb34-aff808a64528",phase="Succeeded"} 0
kube_pod_status_phase{namespace="kube-system",pod="aws-load-balancer-controller-676b77c78f-cqql2",uid="535a1aab-d385-4b16-bb34-aff808a64528",phase="Failed"} 0
kube_pod_status_phase{namespace="kube-system",pod="aws-load-balancer-controller-676b77c78f-cqql2",uid="535a1aab-d385-4b16-bb34-aff808a64528",phase="Unknown"} 0
kube_pod_status_phase{namespace="kube-system",pod="aws-load-balancer-controller-676b77c78f-cqql2",uid="535a1aab-d385-4b16-bb34-aff808a64528",phase="Running"} 1

kube_pod_status_phase{namespace="kube-system",pod="ebs-csi-node-8z4c8",uid="8b280198-f39a-4979-a885-0ca974e67d84",phase="Running"} 1
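The namespace comparison between the two metrics can be done mechanically rather than by eye. The following is a minimal sketch, assuming the /metrics output has been saved as text. The helper names `namespaces_by_metric` and `missing_namespaces` are hypothetical and not part of kube-state-metrics, and the regular expressions only handle the simple label format shown in this issue (no escaped quotes or braces inside label values). The sample lines are shortened versions of the output above.

```python
import re
from collections import defaultdict

# Matches sample lines such as: metric_name{namespace="ivr",...} 0.2
# Assumption: label values contain no escaped quotes and no "}" characters.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{([^}]*)\}\s+\S+')
NAMESPACE_RE = re.compile(r'(?:^|,)namespace="([^"]*)"')


def namespaces_by_metric(text):
    """Map each metric name to the set of namespaces it has samples for."""
    result = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        sample = SAMPLE_RE.match(line)
        if not sample:
            continue
        namespace = NAMESPACE_RE.search(sample.group(2))
        if namespace:
            result[sample.group(1)].add(namespace.group(1))
    return result


def missing_namespaces(text, reference_metric, target_metric):
    """Namespaces that appear in reference_metric but not in target_metric."""
    by_metric = namespaces_by_metric(text)
    return sorted(by_metric[reference_metric] - by_metric[target_metric])


# Shortened sample lines (labels trimmed for readability).
sample_text = """
kube_pod_container_resource_requests{namespace="ivr",pod="crm-5599bc6f7d-b9wc4",resource="cpu",unit="core"} 0.1
kube_pod_status_phase{namespace="ivr",pod="crm-5599bc6f7d-b9wc4",phase="Running"} 1
kube_pod_status_phase{namespace="voice",pod="appserver-5b6cd65969-rzfcj",phase="Running"} 1
"""

print(missing_namespaces(
    sample_text,
    "kube_pod_status_phase",
    "kube_pod_container_resource_requests",
))  # prints ['voice']
```

Running this against the full scrape output would list exactly which namespaces have running pods but no resource request series.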

What you expected to happen: pod metrics should be reported for all namespaces.
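One possible explanation worth ruling out, which is not confirmed anywhere in this thread: as far as I understand, kube-state-metrics emits a kube_pod_container_resource_requests series only for containers that declare resources.requests in their spec, so pods without requests would show up in kube_pod_status_phase but not here. The following is a minimal sketch for listing such containers, assuming the pod list was exported with `kubectl get pods -A -o json`. The helper name `containers_without_requests` is hypothetical, and the pod data below is made up for illustration.

```python
def containers_without_requests(pod_list):
    """Return (namespace, pod, container) tuples for containers with no requests."""
    found = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        for container in pod.get("spec", {}).get("containers", []):
            # "resources" may be absent, empty, or lack a "requests" key.
            requests = (container.get("resources") or {}).get("requests") or {}
            if not requests:
                found.append(
                    (meta.get("namespace"), meta.get("name"), container.get("name"))
                )
    return found


# Made-up example data in the shape of `kubectl get pods -A -o json` output.
example = {
    "items": [
        {
            "metadata": {"namespace": "example-ns", "name": "example-pod"},
            "spec": {
                "containers": [
                    {"name": "with-requests",
                     "resources": {"requests": {"cpu": "100m"}}},
                    {"name": "no-requests", "resources": {}},
                ]
            },
        }
    ]
}

print(containers_without_requests(example))
# prints [('example-ns', 'example-pod', 'no-requests')]
```

If every pod in the 14 missing namespaces appears in this list, the metric is absent because there is nothing to report rather than because of a scrape problem.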

Anything else we need to know?:

Environment:

k8s-ci-robot commented 3 days ago

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.