Closed jpdstan closed 2 years ago
This could be related to https://github.com/kubernetes/kube-state-metrics/issues/694
Have you checked via kubectl that the pods in this state are actually deleted, and not in some non running state, such as Completed or Evicted?
@fredr Yes, they are definitely deleted.
Same thing happening to me on EKS
Seeing another instance of this. These two series existed at the same time for the pod named taskmanager-0; the IP addresses differ because one belongs to the old kube-state-metrics instance and the other to the current one.
```
kube_pod_labels{
  host="1.1.147.202",
  instance="1.1.147.202:9102",
  job="kubernetes-pods-k8s-production",
  kubernetes_namespace="kube-system",
  kubernetes_pod_name="kube-state-metrics-4",
  pod="taskmanager-0",
  ...
}
kube_pod_labels{
  host="1.1.188.37",
  instance="1.1.188.37:9102",
  job="kubernetes-pods-k8s-production",
  kubernetes_namespace="kube-system",
  kubernetes_pod_name="kube-state-metrics-8",
  pod="taskmanager-0",
  ...
}
```
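The duplicate-series symptom above (two kube-state-metrics instances exporting `kube_pod_labels` for the same pod) can be spotted mechanically. A minimal sketch, assuming you have scraped the `instance` and `pod` label values yourself; the function and sample data are illustrative, not from the issue:

```python
# Flag pods exported by more than one kube-state-metrics instance at once,
# given (instance, pod) pairs collected from kube_pod_labels series.
from collections import defaultdict

def find_duplicate_exports(series):
    """series: iterable of (instance, pod) tuples from kube_pod_labels."""
    instances_by_pod = defaultdict(set)
    for instance, pod in series:
        instances_by_pod[pod].add(instance)
    # A pod exported by more than one instance points at a stale series
    # on one of them.
    return {pod: sorted(insts)
            for pod, insts in instances_by_pod.items()
            if len(insts) > 1}

if __name__ == "__main__":
    sample = [
        ("1.1.147.202:9102", "taskmanager-0"),  # old KSM instance
        ("1.1.188.37:9102", "taskmanager-0"),   # current KSM instance
        ("1.1.188.37:9102", "taskmanager-1"),
    ]
    print(find_duplicate_exports(sample))
    # → {'taskmanager-0': ['1.1.147.202:9102', '1.1.188.37:9102']}
```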
Happens to me with `kube_pod_container_resource_requests` and "Terminated" pods (not yet removed by the terminated-pod garbage collector). KSM version: kube-state-metrics/kube-state-metrics:v2.4.1. I would expect `kube_pod_container_resource_requests` not to return terminated pods, or at least to label them correctly so I can filter them out.
This case is expected, since KSM exposes everything it sees from the apiserver. If you are not interested in terminated pods, you can drop those series using relabeling.
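A hedged sketch of what that relabeling could look like in a Prometheus scrape config; the job name and metric choice are illustrative, not from this thread:

```yaml
# Drop unwanted series from the kube-state-metrics scrape.
# Note: kube_pod_container_resource_requests carries no phase label,
# so relabeling can only match on labels actually present on the series
# (here: the metric name).
scrape_configs:
  - job_name: kube-state-metrics   # hypothetical job name
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: kube_pod_container_resource_requests
        action: drop
```

If you only want to exclude terminated pods rather than drop the metric entirely, the usual workaround is a PromQL join against `kube_pod_status_phase{phase="Running"}` at query time instead of relabeling at scrape time.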
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Mark this issue as rotten with `/lifecycle rotten`
- Close this issue with `/close`

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closing this issue.
What happened:
It seems that sometimes metrics don't get deleted alongside the pod, and the problem persists until we churn all the kube-state-metrics pods.
What's even stranger is that not all metrics for the pod persist incorrectly; for example, for one deleted pod we noticed that `kube_pod_container_status_waiting_reason` was still being reported, but `kube_pod_container_resource_requests` was not.

What you expected to happen:
When a pod gets deleted, all metrics associated with that pod should also be deleted.
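That expectation can be audited by diffing the pods present in the KSM exposition text against the pods the API server still knows about. A minimal sketch; the function name, regex, and sample data are hypothetical, not part of kube-state-metrics:

```python
# Find pod names that still appear in scraped metrics text but no longer
# exist in the cluster (i.e. stale series for deleted pods).
import re

POD_LABEL = re.compile(r'pod="([^"]+)"')

def stale_pods(exposition_text, live_pods):
    """Return pod names present in the metrics text but absent from live_pods."""
    exported = set(POD_LABEL.findall(exposition_text))
    return sorted(exported - set(live_pods))

if __name__ == "__main__":
    metrics = (
        'kube_pod_labels{namespace="default",pod="taskmanager-0"} 1\n'
        'kube_pod_labels{namespace="default",pod="web-1"} 1\n'
    )
    # "web-1" is the only pod the API server still reports.
    print(stale_pods(metrics, {"web-1"}))
    # → ['taskmanager-0']
```

The `live_pods` set would come from something like `kubectl get pods -A`, and the exposition text from scraping the KSM `/metrics` endpoint directly.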
How to reproduce it (as minimally and precisely as possible):
It's unclear how this happens. Whenever we try to reproduce it by manually deleting a pod and querying for all of its metrics (`{pod="my_pod"}`), everything works fine, i.e. the metrics all disappear.

Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):