kubernetes / kube-state-metrics

Add-on agent to generate and expose cluster-level metrics.
https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/
Apache License 2.0

Purpose of `kube_pod_status_ready{condition="unknown"}` #2380

Open sidewinder12s opened 2 months ago

sidewinder12s commented 2 months ago

What happened:

What is the purpose of the unknown condition on kube_pod_status_ready metric?

```promql
kube_pod_status_ready{condition="unknown"} == 1
```

Searching through my metrics across a fairly high number of pods and nodes, I never see that series set to 1.

So my initial impression is that this label value needlessly wastes cardinality in a way that is very hard to avoid.
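
For reference, a rough way to check (just a sketch; the 30d range is an assumption, adjust it to whatever retention you have):

```promql
# Series this one metric produces per condition value (three per pod)
count(kube_pod_status_ready) by (condition)

# Has the "unknown" series ever been 1 over the lookback window?
max_over_time(kube_pod_status_ready{condition="unknown"}[30d]) == 1
```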

Anything else we need to know?:

Environment:

ricardoapl commented 2 months ago

As far as I can tell that seems to (still) be a valid value for a PodCondition. You can find (a little bit) more about it in Pod conditions.

sidewinder12s commented 2 months ago

I suspected as much; I was just kind of surprised I hadn't actually seen that status in any of our metrics, since we get containers in that state fairly regularly.

It's unfortunate that KSM exposes these like this; the cardinality it causes is extreme.

ricardoapl commented 2 months ago

> It's unfortunate that KSM exposes these like this; the cardinality it causes is extreme.

On the one hand, I don't think it would be wise to omit such a label value, given that it still seems to be a valid status. On the other hand, I can see how it might not be of interest and could impact cardinality. I wonder if we could have some kind of metric cardinality enforcement mechanism like the one kube-apiserver has, though I'm not sure that would be justified for just this one metric/label. There are probably better ways to solve this, though I can't think of anything right now.

> I suspected as much; I was just kind of surprised I hadn't actually seen that status in any of our metrics, since we get containers in that state fairly regularly.

You mean you often have containers with unknown Ready condition status, yet they don't show up as such in your metrics?

sidewinder12s commented 2 months ago

KSM has a lot of status metrics similar to this one, where many label + value combinations are emitted for each pod, which can explode cardinality and cost, especially if you are using a metrics vendor that bills by cardinality/active series. So I think some broader cardinality enforcement mechanism would be widely useful.
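
A quick way to see the scale of it (a sketch; the name regex is just an illustration, not an exhaustive list of the affected metrics):

```promql
# Total active series exposed for the pod/node status metrics alone
count({__name__=~"kube_(pod|node)_status_.+"})
```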

As for the Ready status, I might be mixing up the ContainerStatusUnknown state I see on pods when I run kubectl get pods with the Ready condition, which indeed might be different.

jwilder commented 2 months ago

The metrics that have conditions are some of our most problematic ones as well. I've been thinking of two ways to address them:

  1. For the True/False/Unknown cases like kube_pod_status_ready, return a value of -1=Unknown, 0=False, 1=True and remove the "condition" label (see the sketch after this list).
  2. Only emit the labels/series for the True combination, i.e. only when the condition is actually True. This would help us for metrics like kube_node_status_condition and kube_pod_status_phase, where we have many node conditions/pod phases but the vast majority are always 0, and we only really care about conditions that are True.
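
A sketch of how option 1 might be queried (hypothetical value encoding, nothing KSM emits today):

```promql
# Hypothetical shape for option 1: a single series per pod, no "condition" label,
# with the value encoding the state (-1=Unknown, 0=False, 1=True)
kube_pod_status_ready == 1    # pods whose Ready condition is True
kube_pod_status_ready == -1   # pods whose Ready condition is Unknown
```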

ricardoapl commented 1 month ago

> For the True/False/Unknown cases like kube_pod_status_ready, return a value of -1=Unknown, 0=False, 1=True and remove the "condition" label.

Some of these metrics are marked as stable, so removing a label might not be something we would want to do. However, I can't tell if kube-state-metrics really follows the Kubernetes metrics stability framework.

jwilder commented 1 month ago

Ah. Another option might be:

  1. Leave the existing metrics as-is for stability, create new ones in the right shape, and provide an option to disable the old ones if a user chooses. There is already a metric allow/deny list that could be used to not expose the stable metrics. It would be nice to not even create them in memory just to drop them, though.

sidewinder12s commented 1 month ago

For 2, one of our metrics vendors supports dropping data based on metric value, which we've implemented for many of our high cardinality metrics that are mostly zeros.

So far, many of them tend to fall into 2 camps:

  1. An error/count metric that is zero most of the time, where any increase indicates a problem
     i. network errors/drops being a good example
  2. A state metric that indicates a binary state through the metric value (and potentially in combination with label value changes), KSM being a huge user of these
     i. all the status metrics

When we were adding these drops, the main drawback was that any gauges/math might not work without the zero value, but that can be worked around.
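
The usual workaround pattern being something like this (a sketch, not tied to any particular vendor's drop rule):

```promql
# If zero-valued samples are dropped upstream, an aggregation over an all-zero
# selection returns no data at all; "or vector(0)" restores an explicit zero.
sum(kube_node_status_condition{condition="MemoryPressure", status="true"}) or vector(0)
```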

For 3, I had wondered whether anyone had proposed/questioned how overloaded some of these status metrics become with the number of labels/label combinations that end up being created. I don't really think there'd be a huge downside to breaking these up into individual metrics vs. the current solution.

Would another option be to have a runtime flag to just not emit the metrics if they are zero?

jwilder commented 1 month ago

Dropping zeros would be very useful. If there were an option to enable that in KSM, it would cut out roughly 55% of our series, which we don't use anyway.
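
For anyone wanting to gauge that on their own cluster, something like this gives a rough zero-valued share for one of these metrics (a sketch; substitute whichever metric is most expensive for you):

```promql
# Fraction of kube_pod_status_phase series that are currently zero
count(kube_pod_status_phase == 0) / count(kube_pod_status_phase)
```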

> I don't really think there'd be a huge downside to breaking these up into individual metrics vs. the current solution.

Agree. A metric like kube_pod_status_ready encodes the same information twice: a series with condition="true" and value 1 says the same thing as one with condition="false" and value 0.

kube_pod_status_phase could be broken out into kube_pod_status_phase_pending, kube_pod_status_phase_succeeded, kube_pod_status_phase_failed, and kube_pod_status_phase_running. With the ability to drop 0 values, that would also cut out a lot of excess data. This structure would also work better with the denylist, allowing more fine-grained control over which metrics are kept.
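
As a sketch of how the two shapes compare (the per-phase metric names are the proposed ones above, they don't exist today):

```promql
# Today: one metric, one series per pod per phase, mostly zeros
sum(kube_pod_status_phase{phase="Pending"})

# Proposed: a dedicated metric per phase, only emitted when non-zero,
# with "or vector(0)" covering the case where no pod is pending
sum(kube_pod_status_phase_pending) or vector(0)
```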

sidewinder12s commented 1 month ago

I do wonder if there is some use case for this structure that we might be missing. But at scale, the way these metrics are laid out is egregious.

logicalhan commented 1 month ago

/triage accepted
/kind support

logicalhan commented 1 month ago

/remove-kind bug

logicalhan commented 1 month ago

/assign @dgrisonnet