colebrooke / kubernetes-nagios

Basic health checks for a Kubernetes cluster
MIT License

In pod deployments we are interested only in current condition #18

Closed ondrejholecek closed 3 years ago

ondrejholecek commented 3 years ago

Before this change, the check reported an "Unknown" status for some deployments:

# ./check_kube_deployments.sh -n kube-system
Unknown: coredns has condition Progressing
Available: True
True - ReplicaSet "coredns-74ff55c5b" has successfully progressed.
Deployment has minimum availability.

# echo $?
3

This happens because the deployment status lists two "historical" conditions: one with a transition time of "13:22:06" reporting "Progressing", and a newer one at "17:57:55" reporting "Available". However, "check_kube_deployments.sh" picked up the older one:

# kubectl get deployment coredns -n kube-system -o yaml

[...]
status:
  availableReplicas: 2
  conditions:
  - lastTransitionTime: "2021-02-25T13:22:06Z"
    lastUpdateTime: "2021-02-25T13:25:05Z"
    message: ReplicaSet "coredns-74ff55c5b" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-02-25T17:57:55Z"
    lastUpdateTime: "2021-02-25T17:57:55Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 1
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2

After this change, the conditions are sorted from oldest to newest, and the code takes only the last (newest) one:

# ./check_kube_deployments.sh -n kube-system
OK - Kubernetes deployments are all OK
OK: coredns has condition Available: True - Deployment has minimum availability.

# echo $?
0
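
For illustration only, here is a minimal sketch of the "sort, then take the newest" idea in plain shell. It is not the script's actual code: the helper name `newest_condition` and the tab-separated input format are assumptions made for this example, and the sample data is copied from the deployment status shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_kube_deployments.sh).
# Reads one condition per line in the assumed format
#   lastTransitionTime<TAB>type<TAB>status<TAB>message
# and prints only the line with the newest lastTransitionTime.
newest_condition() {
    # ISO 8601 UTC timestamps sort chronologically as plain strings,
    # so a lexical sort on the first field orders oldest to newest.
    sort -t "$(printf '\t')" -k1,1 | tail -n 1
}

# Sample conditions taken from the coredns deployment status above.
conditions="$(printf '%s\t%s\t%s\t%s\n' \
    '2021-02-25T13:22:06Z' 'Progressing' 'True' 'ReplicaSet "coredns-74ff55c5b" has successfully progressed.' \
    '2021-02-25T17:57:55Z' 'Available' 'True' 'Deployment has minimum availability.')"

# Keep only the newest condition, then split it into its fields.
newest="$(printf '%s\n' "$conditions" | newest_condition)"
cond_type="$(printf '%s\n' "$newest" | cut -f2)"
status="$(printf '%s\n' "$newest" | cut -f3)"
message="$(printf '%s\n' "$newest" | cut -f4)"

echo "coredns has condition $cond_type: $status - $message"
```

With the sample data, only the "Available" condition from 17:57:55 survives, which corresponds to the single-condition output shown above. Against a live cluster, lines in this format could presumably be produced with a `kubectl get deployment ... -o jsonpath=...` range expression over `.status.conditions`, but that part is left out here because it is untested.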

colebrooke commented 3 years ago

Thanks for this. 👍