Propagate select pod labels to all metrics without requiring promql metric joins.

What would you like to be added:

In large clusters, pod replicas often represent a "class" or "category" of worker. It's really useful to get metrics broken down by, e.g., the pod's controller, or by some common label that categorizes pods without the cardinality of tens of thousands of pod IDs.

The kube_pod_labels time series covers some ground, but it requires joins. For usability, joins often lead to recording rules. Recording rules, in turn, demand stateful Prometheus servers, which can mean e.g. 4x the memory footprint of stateless collectors such as Prometheus running in agent mode or Grafana Alloy. In particular, if you are forwarding metrics to a third party, it's nice to minimize the footprint of in-house infra by keeping things stateless for as long as possible; at some point you will certainly hit a big, beefy backend that you can query.

Why not skip the label join requirement entirely and instead allow configurable appending of specific pod labels to the underlying time series? Again, this works incredibly well with "stateless" Prometheus remote-writers and with systems like Grafana's Adaptive Metrics, which can do some limited metric aggregation without a full recording rule engine.

Why is this needed:

Improved UX. You can immediately get things like kube_pod_container_status_restarts_total categorized by a workload without having to jump into recording rules.
Improved performance in 3rd party systems with adaptive metrics or similar solutions.

Describe the solution you'd like

A flag that basically says: take this pod label and propagate it to all kube_pod_* time series without requiring a label join.
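
To make the request concrete, here is a minimal Python sketch of the behavior I have in mind. It is not kube-state-metrics code: the function name `propagate_pod_labels`, the dictionary shape of a series, and the `label_` prefix with sanitized keys (chosen to resemble how kube_pod_labels names its labels) are all assumptions for illustration. It copies allowlisted pod labels onto every series whose name starts with kube_pod_ and leaves everything else untouched.

```python
import re


def sanitize(label_name):
    # Prometheus label names only allow [a-zA-Z0-9_]; replace anything else.
    return re.sub(r"[^a-zA-Z0-9_]", "_", label_name)


def propagate_pod_labels(series, pod_labels, allowlist):
    """Return a copy of `series` with allowlisted pod labels appended.

    series:     {"name": str, "labels": dict, "value": float}  (hypothetical shape)
    pod_labels: {(namespace, pod): {label_name: label_value}}
    allowlist:  set of pod label names to propagate; empty means opt-out
    """
    # Only kube_pod_* series are touched, as described in the request.
    if not series["name"].startswith("kube_pod_"):
        return series
    key = (series["labels"].get("namespace"), series["labels"].get("pod"))
    extra = {
        "label_" + sanitize(name): value
        for name, value in pod_labels.get(key, {}).items()
        if name in allowlist
    }
    # Build new dicts so the input series is not mutated.
    return {**series, "labels": {**series["labels"], **extra}}


# Illustrative data only.
pod_labels = {
    ("default", "worker-abc12"): {
        "app.kubernetes.io/name": "temporal",
        "pod-template-hash": "abc12",
    }
}
series = {
    "name": "kube_pod_container_status_restarts_total",
    "labels": {"namespace": "default", "pod": "worker-abc12", "container": "worker"},
    "value": 3.0,
}

enriched = propagate_pod_labels(series, pod_labels, {"app.kubernetes.io/name"})
print(enriched["labels"])
```

With an empty allowlist the series comes back unchanged, which is the "benign unless you opt in" property I would expect from the real flag.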

Additional context

It might be desirable to do this per individual kube_pod_* time series, but that stretches the command-line argument approach. It may make more sense to leverage the existing allow list and simply add a secondary boolean that enables or disables label (and, I suppose, annotation, for symmetry) propagation.
Ideally this is benign if you don't opt in. If you do, I'd expect additional memory usage, since each time series carries the extra bytes for the propagated label. The whole theory here is that when you have 10k pods that belong to "temporal" or "flyte", the ease with which you can then do aggregations using something like https://grafana.com/blog/2023/05/09/adaptive-metrics-grafana-cloud-announcement/ makes it worthwhile.
I am indeed largely advocating for this from the POV of adaptive metrics. If that technology were more robust, maybe the problem could be pushed downstream by having Grafana allow custom rule expressions that support label joins. But that's kind of the heart of the UX issue here: having to do label joins affects everything, and it is so fundamental that doing it closer to the source would obviate a lot of downstream work.
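
The 10k-pod scenario above can be sketched with synthetic data. This is a toy Python illustration, not a benchmark: the `aggregate_by` helper, the `label_app` key, and the even split between "temporal" and "flyte" are made up. It approximates what a downstream aggregation along the lines of `sum by (label_app)` could compute once the label is already present on the series, with no join involved.

```python
from collections import defaultdict


def aggregate_by(samples, label):
    """Sum sample values grouped by one label (hypothetical helper)."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


# 10k synthetic pods, each already carrying the propagated workload label.
samples = [
    ({"pod": f"worker-{i}", "label_app": "temporal" if i % 2 == 0 else "flyte"}, 1.0)
    for i in range(10_000)
]

by_pod = aggregate_by(samples, "pod")        # one output series per pod
by_app = aggregate_by(samples, "label_app")  # one output series per workload

print(len(by_pod), len(by_app))
```

Grouping by pod keeps all 10,000 series, while grouping by the propagated label collapses them to two, which is the cardinality reduction a stateless aggregator could then apply without a recording rule.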

kubernetes / kube-state-metrics

Propagate select pod labels to all metrics without requiring promql metric joins. #2551