Open jacobstr opened 2 weeks ago
What would you like to be added:
In large clusters, pod replicas often represent a "class" or "category" of worker. It's really useful to get metrics broken down by, e.g., the pod's controller or some common label that categorizes pods, without the cardinality of tens of thousands of pod IDs.
The kube_pod_labels time series covers some ground, but requires joins. Joins, for usability, often engender recording rules. Recording rules, in turn, demand stateful Prometheus servers, which can e.g. 4x the memory footprint compared with stateless setups like Prometheus running in agent mode or Grafana Alloy. Particularly if one is forwarding the metrics to a 3rd party, it's nice to minimize the footprint of in-house infra by keeping things stateless as long as possible; certainly at some point you hit a big beefy backend that you can query.
Why not skip the label join requirement entirely, and instead allow for configurable appending of specific kube pod labels to the underlying time series? Again, this works incredibly well with "stateless" Prometheus remote-writers and systems like Grafana's Adaptive Metrics that can do some limited metric aggregation etc. without a full recording rule engine.
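To make the trade-off concrete, here is a minimal sketch in Python that emulates the usual join against kube_pod_labels and compares it with aggregating series that already carry the label. The pod names, the label_app label, and the helper functions are all hypothetical and only illustrate the idea; this is not how kube-state-metrics or Prometheus implement anything.

```python
from collections import defaultdict

# Hypothetical sample data: (pod name, app label, restart count).
pods = [("worker-a-1", "worker-a", 3.0),
        ("worker-a-2", "worker-a", 1.0),
        ("worker-b-1", "worker-b", 5.0)]

# Today: the metric and the labels live in two separate series.
restarts = [({"namespace": "prod", "pod": p}, v) for p, _, v in pods]
pod_labels = [({"namespace": "prod", "pod": p, "label_app": a}, 1.0)
              for p, a, _ in pods]

# Proposed: the label is appended to the metric series itself.
propagated = [({"namespace": "prod", "pod": p, "label_app": a}, v)
              for p, a, v in pods]


def join_then_sum(metric, info, by):
    """Roughly emulates the PromQL pattern:
    sum by (<by>) (metric * on (namespace, pod) group_left(<by>) info)
    """
    index = {(l["namespace"], l["pod"]): (l, v) for l, v in info}
    totals = defaultdict(float)
    for labels, value in metric:
        hit = index.get((labels["namespace"], labels["pod"]))
        if hit is None:
            continue  # series without a match drop out of the join
        info_labels, info_value = hit
        totals[info_labels.get(by, "")] += value * info_value
    return dict(totals)


def sum_by(metric, by):
    """Roughly emulates: sum by (<by>) (metric). No join needed."""
    totals = defaultdict(float)
    for labels, value in metric:
        totals[labels.get(by, "")] += value
    return dict(totals)


print(join_then_sum(restarts, pod_labels, "label_app"))
print(sum_by(propagated, "label_app"))
```

Both calls produce the same per-workload totals, but only the first needs a second series and a matching step, which is the part that tends to end up in a recording rule.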
Why is this needed:
kube_pod_container_status_restarts_total categorized by a workload without having to jump into recording rules.
Describe the solution you'd like
kube_pod_* time series without requiring a label join.
Additional context
kube_pod_ time series; that does stretch the command-line arguments approach, though, and it might just make sense to leverage the existing allow list and simply have a secondary boolean to enable/disable label (and, I suppose, annotation, for symmetry...) propagation.
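The allow-list-plus-boolean idea could look roughly like the sketch below. Everything here is an assumption for illustration: the propagate flag is the hypothetical "secondary boolean", the function names are made up, and the label_ prefix with underscore substitution is only meant to mirror how names in kube_pod_labels appear to be formed.

```python
import re


def sanitize(name):
    # Assumed naming scheme: prefix with "label_" and replace any
    # character that is not valid in a Prometheus label name.
    return "label_" + re.sub(r"[^a-zA-Z0-9_]", "_", name)


def series_labels(base, pod_labels, allowlist, propagate):
    """Return the label set for one kube_pod_* series.

    base       -- labels the series already has (namespace, pod, ...)
    pod_labels -- the pod's Kubernetes labels
    allowlist  -- label keys taken from the existing allow list
    propagate  -- hypothetical boolean that switches propagation on
    """
    labels = dict(base)
    if propagate:
        for key in allowlist:
            if key in pod_labels:
                labels[sanitize(key)] = pod_labels[key]
    return labels


base = {"namespace": "prod", "pod": "worker-a-1"}
pod_labels = {"app.kubernetes.io/name": "worker-a",
              "pod-template-hash": "5d4f8c"}
allow = ["app.kubernetes.io/name"]

print(series_labels(base, pod_labels, allow, propagate=True))
print(series_labels(base, pod_labels, allow, propagate=False))
```

With the flag off, the series keeps only its existing labels, so the current behavior stays the default; with it on, only allow-listed keys are appended, which keeps high-cardinality labels such as pod-template-hash out of the series.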