kubernetes / kube-state-metrics

Add-on agent to generate and expose cluster-level metrics.
https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/
Apache License 2.0

[WIP] fix: Add allowed labels to all metrics instead of just `_labels` metric #2428

Open ronaknnathani opened 1 week ago

ronaknnathani commented 1 week ago

What this PR does / why we need it:

KSM accepts a list of allowed labels that should be added to the Prometheus metric for a resource. Currently these labels are added only to the kube_<resource>_labels metric. However, there are several use cases where having them on all of the resource's metrics would be helpful. This change adds the allowed labels to every metric of the resource instead of just the _labels metric.

Note: Currently, I have only made the change for Pod metrics. If this direction looks good, I can update the PR for the rest of the resources.

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)

Doesn't change the number of metrics. It increases the number of labels on each Prometheus metric, depending on the allowed labels.
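As a rough illustration of the behavior described above (not the PR's actual diff), the following Go sketch appends allow-listed Kubernetes labels to every series of a resource. The `Series` type and the `allowedLabels` and `withExtraLabels` helpers are hypothetical names invented for this example, and the label-name sanitization is a simplification rather than KSM's real logic. The `label_` prefix matches the `label_app` label visible in the test output in this description.

```go
package main

import (
	"fmt"
	"regexp"
)

// Series is a hypothetical stand-in for one exposed time series.
type Series struct {
	Name   string
	Labels map[string]string
}

// invalidChars matches characters that are not valid in a Prometheus label name.
// Assumption: a simplified sanitizer, not KSM's actual implementation.
var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// allowedLabels picks the allow-listed keys that are present on the object and
// converts them to Prometheus label pairs with a "label_" prefix.
func allowedLabels(objLabels map[string]string, allowList []string) map[string]string {
	out := make(map[string]string, len(allowList))
	for _, key := range allowList {
		if value, ok := objLabels[key]; ok {
			out["label_"+invalidChars.ReplaceAllString(key, "_")] = value
		}
	}
	return out
}

// withExtraLabels returns a copy of each series with the extra labels added.
// The number of series is unchanged; only the label set of each series grows.
func withExtraLabels(in []Series, extra map[string]string) []Series {
	out := make([]Series, 0, len(in))
	for _, s := range in {
		merged := make(map[string]string, len(s.Labels)+len(extra))
		for k, v := range s.Labels {
			merged[k] = v
		}
		for k, v := range extra {
			merged[k] = v
		}
		out = append(out, Series{Name: s.Name, Labels: merged})
	}
	return out
}

func main() {
	// Example data only; "tier" is not allow-listed and is therefore ignored.
	podLabels := map[string]string{"app": "kindnet", "tier": "node"}
	extra := allowedLabels(podLabels, []string{"app"})

	series := []Series{
		{Name: "kube_pod_container_status_ready", Labels: map[string]string{"pod": "kindnet-l4529", "container": "kindnet-cni"}},
		{Name: "kube_pod_container_state_started", Labels: map[string]string{"pod": "kindnet-l4529", "container": "kindnet-cni"}},
	}
	out := withExtraLabels(series, extra)

	fmt.Println(len(series), len(out))      // prints: 2 2
	fmt.Println(out[0].Labels["label_app"]) // prints: kindnet
}
```

The sketch shows why the series count at any one moment stays the same while each series carries more labels.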

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Fixes https://github.com/kubernetes/kube-state-metrics/issues/2311

Tests

Tested locally against a kind cluster. Only a handful of metrics are shown here.

kube_pod_container_info{namespace="kube-system",pod="kindnet-l4529",uid="5e107535-3135-48fb-8f42-043c50866177",container="kindnet-cni",image_spec="docker.io/kindest/kindnetd:v20230511-dc714da8",image="docker.io/kindest/kindnetd:v20230511-dc714da8",image_id="sha256:b18bf71b941bae2e12db1c07e567ad14e4febbc778310a0fc64487f1ac877d79",container_id="containerd://d37637e9cfc42cb4f9f606de57d5ced7fb242827e2b4d8860e6c3a28ca877136",label_app="kindnet"} 1
kube_pod_container_info{namespace="monitoring",pod="event-exporter-7ddc6ff9b-wd8f5",uid="a021852f-75f6-4e1c-975a-103396730b68",container="event-exporter",image_spec="kubernetes-event-exporter:local",image="docker.io/library/kubernetes-event-exporter:local",image_id="docker.io/library/import-2024-01-17@sha256:73a75a4829b55788efebd1ac7679256c36a09216c0384363716c3ae22e5fba5b",container_id="containerd://a8643a9cf05a6e7d708cd685822fc3799ba45655f58c5d30930cd21e545f8a22",label_app="event-exporter"} 1
...
kube_pod_container_resource_limits{namespace="rnathani",pod="web-server-869d57bbd4-vljdh",uid="085c44c5-b992-4525-a92c-266d32bfeaa6",container="web-server",node="kind-control-plane",resource="memory",unit="byte",label_app="web-server"} 1.073741824e+09
kube_pod_container_resource_limits{namespace="kube-system",pod="kindnet-l4529",uid="5e107535-3135-48fb-8f42-043c50866177",container="kindnet-cni",node="kind-control-plane",resource="cpu",unit="core",label_app="kindnet"} 0.1
...
kube_pod_container_state_started{namespace="monitoring",pod="event-exporter-7ddc6ff9b-s6zfg",uid="fe1e9f75-3118-43cf-b48c-1ac347152a6b",container="event-exporter",label_app="event-exporter"} 1.716671651e+09
kube_pod_container_state_started{namespace="local-path-storage",pod="local-path-provisioner-6f8956fb48-sv66c",uid="33ebe75c-56be-4208-bf3b-7e1d9546738b",container="local-path-provisioner",label_app="local-path-provisioner"} 1.716671697e+09
...
kube_pod_container_status_ready{namespace="rnathani",pod="web-server-869d57bbd4-vljdh",uid="085c44c5-b992-4525-a92c-266d32bfeaa6",container="web-server",label_app="web-server"} 1
kube_pod_container_status_ready{namespace="kube-system",pod="kindnet-l4529",uid="5e107535-3135-48fb-8f42-043c50866177",container="kindnet-cni",label_app="kindnet"} 1
...
k8s-ci-robot commented 1 week ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ronaknnathani

Once this PR has been reviewed and has the lgtm label, please assign rexagod for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/kubernetes/kube-state-metrics/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.
k8s-ci-robot commented 1 week ago

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
k8s-ci-robot commented 1 week ago

Welcome @ronaknnathani!

It looks like this is your first PR to kubernetes/kube-state-metrics 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kube-state-metrics has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😄

ronaknnathani commented 1 week ago

@mrueg @CatherineF-dev - just checking whether you have had a chance to take a look. If this direction is okay, I can update this PR for the rest of the resources, or send separate PRs for different resources.

mrueg commented 3 days ago

I don't think it is a good idea to add all labels to all metrics, as the cardinality will go up by a lot. Instead, you should join the _labels metric onto the metric you want, matching on the keys that identify the resource. Usually you can do this either per query or, if you have a specific need, with a recording rule.

/hold
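
For readers unfamiliar with the join suggested above: in PromQL it is typically written along the lines of `kube_pod_container_status_ready * on (namespace, pod) group_left (label_app) kube_pod_labels`. The Go sketch below emulates that matching in memory to show what the join does. The `Sample` type and the `matchKey` and `joinLabels` helpers are hypothetical names for this illustration, and the logic is a simplified model of `on (...) group_left (...)`, not Prometheus's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Sample is a hypothetical stand-in for one instant-vector element.
type Sample struct {
	Labels map[string]string
	Value  float64
}

// matchKey builds the join key from the values of the "on" labels.
func matchKey(labels map[string]string, on []string) string {
	parts := make([]string, len(on))
	for i, name := range on {
		parts[i] = labels[name]
	}
	return strings.Join(parts, "\x00")
}

// joinLabels is a simplified model of
//   left * on(on...) group_left(include...) right
// Each left sample is matched to at most one right sample by the "on" labels,
// and the "include" labels are copied from the right side onto the result.
func joinLabels(left, right []Sample, on, include []string) []Sample {
	index := make(map[string]Sample, len(right))
	for _, r := range right {
		index[matchKey(r.Labels, on)] = r
	}

	out := make([]Sample, 0, len(left))
	for _, l := range left {
		r, ok := index[matchKey(l.Labels, on)]
		if !ok {
			continue // left samples without a match are dropped
		}
		merged := make(map[string]string, len(l.Labels)+len(include))
		for k, v := range l.Labels {
			merged[k] = v
		}
		for _, name := range include {
			if v, ok := r.Labels[name]; ok {
				merged[name] = v
			}
		}
		out = append(out, Sample{Labels: merged, Value: l.Value * r.Value})
	}
	return out
}

func main() {
	ready := []Sample{{
		Labels: map[string]string{"namespace": "kube-system", "pod": "kindnet-l4529", "container": "kindnet-cni"},
		Value:  1,
	}}
	podLabels := []Sample{{
		Labels: map[string]string{"namespace": "kube-system", "pod": "kindnet-l4529", "label_app": "kindnet"},
		Value:  1,
	}}

	joined := joinLabels(ready, podLabels, []string{"namespace", "pod"}, []string{"label_app"})
	fmt.Println(joined[0].Labels["label_app"], joined[0].Value) // prints: kindnet 1
}
```

Because the `_labels` series always has the value 1, multiplying by it leaves the left-hand value unchanged, so the join only attaches labels. A recording rule would store the result of such a query under a new metric name so that it does not have to be recomputed on every query.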