Could you curl the kube-state-metrics endpoint to get the full metrics payload?
For example,
# change kube-system to correct namespace where kube-state-metrics exists
kubectl -n kube-system port-forward pod/kube-state-metrics-xxx 9443:9443
curl http://localhost:9443/metrics | grep kube_pod_status_reason
Just discovered that the kube-state-metrics pod is running in the lens-metrics namespace. Not sure what that means; I was expecting to see it in the namespace you mentioned.
In any case, I tried to do as you suggested but ran into a problem. I ran the port-forward command and then entered the curl command in a separate terminal. The curl command returned an empty reply, and when I switched back to the other terminal, the port-forward had exited with an error.
$ k -n lens-metrics port-forward kube-state-metrics-775bd8fd9f-r5qgv 9443:9443
Forwarding from 127.0.0.1:9443 -> 9443
Forwarding from [::1]:9443 -> 9443
Handling connection for 9443
E0712 06:53:14.074707 38060 portforward.go:406] an error occurred forwarding 9443 -> 9443: error forwarding port 9443 to pod 45c511ca7caaa67f29593a81f9f974e8f5bcbc078d66216fd14ff9df01de5bf2, uid : port forward into network namespace "/var/run/netns/3ba1b4c7-1e76-4115-8d89-e98f8c3e3ec3": failed to connect to localhost:9443 inside namespace 45c511ca7caaa67f29593a81f9f974e8f5bcbc078d66216fd14ff9df01de5bf2: dial tcp 127.0.0.1:9443: connect: connection refused
E0712 06:53:14.076274 38060 portforward.go:234] lost connection to pod
I tried something else. In Lens, I accessed the pod and forwarded the port from there. That let me open it in a browser, where I was able to access a metrics link. Clicking on this presented me with a page with tons of information. I did a search for the metric and found over 250 entries, each with a value of zero.
Here's a sample:
kube_pod_status_reason{namespace="prometheus",pod="prometheus-prometheus-kube-prometheus-prometheus-0",reason="NodeLost"} 0
kube_pod_status_reason{namespace="prometheus",pod="prometheus-prometheus-kube-prometheus-prometheus-0",reason="Evicted"} 0
kube_pod_status_reason{namespace="prometheus",pod="prometheus-prometheus-kube-prometheus-prometheus-0",reason="UnexpectedAdmissionError"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28146870-w2nbx",reason="NodeLost"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28146870-w2nbx",reason="Evicted"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28146870-w2nbx",reason="UnexpectedAdmissionError"} 0
kube_pod_status_reason{namespace="kubeapps",pod="kubeapps-7db5f76cb5-qt2jq",reason="NodeLost"} 0
kube_pod_status_reason{namespace="kubeapps",pod="kubeapps-7db5f76cb5-qt2jq",reason="Evicted"} 0
kube_pod_status_reason{namespace="kubeapps",pod="kubeapps-7db5f76cb5-qt2jq",reason="UnexpectedAdmissionError"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28148310-sk282",reason="NodeLost"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28148310-sk282",reason="Evicted"} 0
kube_pod_status_reason{namespace="deploy",pod="clean-deploy-cronjob-28148310-sk282",reason="UnexpectedAdmissionError"} 0
Could you list the pods which are in failed or pending status? Also list the reasons.
It might be because the current implementation only includes these three reasons: {NodeLost, Evicted, UnexpectedAdmissionError}.
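For example, a query along these lines (just as an illustration) should come back empty if none of those three reasons currently applies to any pod:
# returns one series per pod whose current status reason is NodeLost,
# Evicted, or UnexpectedAdmissionError; empty if no pod matches
kube_pod_status_reason == 1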
This is interesting. If I query this metric in Prometheus, I get 5 result entries per pod. If I port-forward and access the kube-state-metrics /metrics endpoint, I only get 3 result entries per pod. Here's an example:
From kube-state-metrics:
kube_pod_status_reason{namespace="ingress-nginx",pod="ingress-nginx-controller-f5rmk",reason="NodeLost"} 0
kube_pod_status_reason{namespace="ingress-nginx",pod="ingress-nginx-controller-f5rmk",reason="Evicted"} 0
kube_pod_status_reason{namespace="ingress-nginx",pod="ingress-nginx-controller-f5rmk",reason="UnexpectedAdmissionError"} 0
From Prometheus:
kube_pod_status_reason{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="ingress-nginx", pod="ingress-nginx-controller-f5rmk", reason="Evicted", service="prometheus-kube-state-metrics", uid="29cd7cc2-86d4-4a9d-854e-1e122af3403b"} | 0
kube_pod_status_reason{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="ingress-nginx", pod="ingress-nginx-controller-f5rmk", reason="NodeAffinity", service="prometheus-kube-state-metrics", uid="29cd7cc2-86d4-4a9d-854e-1e122af3403b"} | 0
kube_pod_status_reason{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="ingress-nginx", pod="ingress-nginx-controller-f5rmk", reason="NodeLost", service="prometheus-kube-state-metrics", uid="29cd7cc2-86d4-4a9d-854e-1e122af3403b"} | 0
kube_pod_status_reason{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="ingress-nginx", pod="ingress-nginx-controller-f5rmk", reason="Shutdown", service="prometheus-kube-state-metrics", uid="29cd7cc2-86d4-4a9d-854e-1e122af3403b"} | 0
kube_pod_status_reason{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="ingress-nginx", pod="ingress-nginx-controller-f5rmk", reason="UnexpectedAdmissionError", service="prometheus-kube-state-metrics", uid="29cd7cc2-86d4-4a9d-854e-1e122af3403b"} | 0
Why would I get a different set of results?
FYI: pod metrics are implemented in this file: https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/pod.go
You can have a look at it to figure out what happened.
I am curious whether kube_pod_status_reason doesn't report failed pods.
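To see which reason label values the instance scraped by Prometheus actually exports, a rough query like this can help:
# distinct reason values, grouped per scrape job
count by (job, reason) (kube_pod_status_reason)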
Here are the current failed or pending pods on the cluster:
Pod | Namespace | Status | Reason |
---|---|---|---|
runner-goew1uzh-project-8-concurrent-0zk79h | runner-workspace-writer | Failed | ContainerStatusUnknown - exit code: 137 |
runner-goew1uzh-project-8-concurrent-1fhtb4 | runner-workspace-writer | Failed | ContainerStatusUnknown - exit code: 137 |
clean-deploy-cronjob-28151190-fqqgs | deploy | Pending | Back-off pulling image "harbor.hulk.beast-code.com/phactory-images/cleanup-deploy:master" |
clean-deploy-cronjob-28152630-jzrsj | deploy | Pending | Back-off pulling image "harbor.hulk.beast-code.com/phactory-images/cleanup-deploy:master" |
I found the reason. It's because a bounded list of reasons is used:
podStatusReasons = []string{"Evicted", "NodeAffinity", "NodeLost", "Shutdown", "UnexpectedAdmissionError"}
https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/pod.go#L40
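This also explains the all-zero output: for a pod stuck on something like ImagePullBackOff, a query such as the following (using one of the Pending pods from the table above) only ever returns the bounded reasons, all with value 0:
# only the bounded reason series exist for this pod, and they all stay at 0
# because ImagePullBackOff is not in podStatusReasons
kube_pod_status_reason{pod="clean-deploy-cronjob-28152630-jzrsj"}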
Is it possible to reconfigure this to be unconstrained?
Could you try updating https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/pod.go and rebuilding a new kube-state-metrics?
Wondering what the reason is for Back-off pulling image "harbor.hulk.beast-code.com/phactory-images/cleanup-deploy:master".
Or could you run kubectl -n deploy get pod clean-deploy-cronjob-28152630-jzrsj -o yaml?
I'm guessing this is unrelated. In any case:
containerStatuses:
- image: harbor.hulk.beast-code.com/phactory-images/cleanup-deploy:master
  imageID: ""
  lastState: {}
  name: clean-deploy
  ready: false
  restartCount: 0
  started: false
  state:
    waiting:
      message: Back-off pulling image "harbor.hulk.beast-code.com/phactory-images/cleanup-deploy:master"
      reason: ImagePullBackOff
Not sure how to attempt what you suggested regarding updating the .go file and rebuilding. Also, I don't know Go, so I'm not sure what could be done to make it loop over a wildcard where that variable is used.
I guess a more important question is: why is it being constrained in the first place?
> I guess a more important question is: why is it being constrained in the first place?
I'm not sure of all the reasons. One reason is to constrain metric cardinality.
I think we can add ImagePullBackOff into the list in the master branch. Could you list the reason for runner-goew1uzh-project-8-concurrent-0zk79h as well?
kubectl -n runner-workspace-writer get pod runner-goew1uzh-project-8-concurrent-0zk79h -o yaml
Both containers in that pod show:
state:
  terminated:
    exitCode: 137
    finishedAt: null
    message: The container could not be located when the pod was terminated
    reason: ContainerStatusUnknown
    startedAt: null
The same for the other failed pod as well.
I'm curious about the fact that the kube-state-metrics pod is deployed in the lens-metrics namespace instead of kube-system. This is the only instance of a pod with this name that I can see. Is kube-state-metrics normally deployed when a K8s cluster is set up? I wonder how this cluster got into this configuration.
> I'm curious about the fact that the kube-state-metrics pod is deployed in the lens-metrics namespace instead of kube-system.

It's fine for the kube-state-metrics pod to be deployed in the lens-metrics namespace.
> Is kube-state-metrics normally deployed when a K8s cluster is set up?

A k8s cluster can run without kube-state-metrics; kube-state-metrics is a monitoring add-on. It isn't bundled with the k8s cluster itself (https://github.com/kubernetes/kubernetes) and needs to be installed after the cluster is created.
Could you see these pods in metric kube_pod_status_phase?
kube_pod_status_reason only covers limited reasons.
kube_pod_status_phase should cover all pods.
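For example, a query along these lines should list every pod currently stuck in one of those phases, regardless of the reason:
# one series per pod whose current phase is Failed or Pending
kube_pod_status_phase{phase=~"Failed|Pending"} == 1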
> Could you see these pods in metric kube_pod_status_phase?
Yes. runner-goew1uzh-project-8-concurrent-0zk79h and runner-goew1uzh-project-8-concurrent-1fhtb4 show phase=Failed; clean-deploy-cronjob-28151190-fqqgs and clean-deploy-cronjob-28152630-jzrsj show phase=Pending.
kube_pod_status_phase is only showing five statuses: Failed, Pending, Running, Succeeded and Unknown, with a value of 1 for each pod that is in a particular status.
E.g.:
Query: kube_pod_status_phase{pod="runner-goew1uzh-project-8-concurrent-1fhtb4"}
Result:
Metric | Value |
---|---|
kube_pod_status_phase{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="runner-workspace-writer", phase="Failed", pod="runner-goew1uzh-project-8-concurrent-1fhtb4", service="prometheus-kube-state-metrics", uid="0dd41342-cdb8-4d33-b1ff-014cb1bab97b"} | 1 |
kube_pod_status_phase{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="runner-workspace-writer", phase="Pending", pod="runner-goew1uzh-project-8-concurrent-1fhtb4", service="prometheus-kube-state-metrics", uid="0dd41342-cdb8-4d33-b1ff-014cb1bab97b"} | 0 |
kube_pod_status_phase{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="runner-workspace-writer", phase="Running", pod="runner-goew1uzh-project-8-concurrent-1fhtb4", service="prometheus-kube-state-metrics", uid="0dd41342-cdb8-4d33-b1ff-014cb1bab97b"} | 0 |
kube_pod_status_phase{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="runner-workspace-writer", phase="Succeeded", pod="runner-goew1uzh-project-8-concurrent-1fhtb4", service="prometheus-kube-state-metrics", uid="0dd41342-cdb8-4d33-b1ff-014cb1bab97b"} | 0 |
kube_pod_status_phase{container="kube-state-metrics", endpoint="http", instance="10.244.211.138:8080", job="kube-state-metrics", namespace="runner-workspace-writer", phase="Unknown", pod="runner-goew1uzh-project-8-concurrent-1fhtb4", service="prometheus-kube-state-metrics", uid="0dd41342-cdb8-4d33-b1ff-014cb1bab97b"} | 0 |
> kube_pod_status_reason only covers limited reasons.
Seems to be very limited.
I have found that there is an instance of Prometheus running in the lens-metrics namespace (along with kube-state-metrics and several node-exporter pods). Separately, I have the kube-prometheus-stack (which includes Prometheus, Grafana and other apps) deployed in the prometheus namespace - this is the instance I am using for my work. I wonder what implications (if any) there might be with two Prometheus instances running. My Prometheus deployment has its own set of kube-state-metrics and node-exporter pods. Perhaps the pods in the lens-metrics namespace are only being used by Lens, while everything in the prometheus namespace is what I'm seeing?
Hi @dgrisonnet, do you remember the reasons why we expose metrics with value = 0 in https://github.com/kubernetes/kube-state-metrics/issues/2116#issuecomment-1632678128
/triage accepted
/assign @CatherineF-dev
I would make this metric opt-in only, and use kube_pod_aggregated_status_reason for lower cardinality.
> I would make this metric opt-in only, and use kube_pod_aggregated_status_reason for lower cardinality.
I get an empty result when I attempt to query this. There appears to be no metric by that name.
> Hi @dgrisonnet, do you remember the reasons why we expose metrics with value = 0 in https://github.com/kubernetes/kube-state-metrics/issues/2116#issuecomment-1632678128
From the code, a status with value 0 means that the pod isn't in this state whilst 1 means it is: https://github.com/kubernetes/kube-state-metrics/blob/main/internal/store/pod.go#L1475-L1485.
This allows two types of queries, for example something along these lines:
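# 1. which pods are currently in this state?
kube_pod_status_reason{reason="Evicted"} == 1
# 2. which pods are confirmed not to be in this state (the series exists, with value 0)?
kube_pod_status_reason{reason="Evicted"} == 0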
> kube_pod_status_reason only covers limited reasons.
> kube_pod_status_phase should cover all pods.
Hey, the reasons indeed do seem very limited. Is there a reason it's so limited?
It should include most reasons; if not, we should add some more.
The reason why we keep a finite list of reasons is to avoid having an unbounded label in the metrics which could cause cardinality explosion issues in your monitoring backend.
Is it possible to add more reasons such as ContainerStatusUnknown, CreateContainerConfigError, and Error?
I believe these are container statuses so they are not fit to be part of kube_pod_status_reason.
That said, I would be fine with creating a new metric called kube_pod_container_status and adding the reasons there.
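In the meantime, there are already per-container metrics for waiting and terminated reasons, which cover cases like ImagePullBackOff; for example:
# waiting reasons per container (e.g. ImagePullBackOff, CrashLoopBackOff)
kube_pod_container_status_waiting_reason == 1
# terminated reasons per container (e.g. OOMKilled, Error)
kube_pod_container_status_terminated_reason == 1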
Could there maybe be an "Other" reason for anything that doesn't match one of the known reasons, so that there's always at least one kube_pod_status_reason series for a given pod that returns 1 at any given moment (at least if the pod's status is Failed/Pending)? This would prevent PromQL queries that do group_right against kube_pod_status_reason from randomly dropping out if you're trying to alert against it.
Assuming you have a PromQL query along those lines:

kube_pod_status_phase{phase=~"Failed|Pending"} * on(namespace, pod) group_left(reason) (kube_pod_status_reason > 0)

you can keep the results for which kube_pod_status_reason > 0 does not hold by performing an outer join:

(kube_pod_status_phase{phase=~"Failed|Pending"} * on(namespace, pod) group_left(reason) (kube_pod_status_reason > 0)) or kube_pod_status_phase{phase=~"Failed|Pending"}

Let me know if this solves your issue. Note that it should work the same way with group_right, but I am too used to doing it from the left.
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- /triage accepted (org members only)
- /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
Closing, as we figured out why it was happening.
What happened: Running a PromQL query on the kube_pod_status_reason metric returns hundreds of result records, but every one has a value of zero. This occurs even with pods in failed or pending status.

What you expected to happen: I would expect any pod in failed or pending status to have a result from this metric with a non-zero value.

How to reproduce it (as minimally and precisely as possible): Run a query in Prometheus on kube_pod_status_reason. No filtering is needed. I get hundreds of result entries, every one with a value of zero. If you run the query slightly differently, as kube_pod_status_reason > 0, you will get an empty response, even with pods in failed or pending status.

Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): Not sure how to find this. I don't know if this is equivalent. I inspected the pod and found the following: