agapoff / check_kubernetes

Nagios/Icinga/Zabbix style plugin for checking Kubernetes

Reset max_restart_count for every container #40

Closed: rndmh3ro closed this 6 months ago

rndmh3ro commented 6 months ago

The max_restart_count variable is not reset when iterating over the pods, so its value carries over from one pod to the next.

Example: a pod that has restarted three times:

> k get pods -n foo
NAME                             READY   STATUS    RESTARTS      AGE
grove-0                          1/1     Running   3 (46h ago)   46h

When checking the namespace directly, the check returns the correct result:

> './check_kubernetes.sh' '-H' 'https://xxxxxxxxxxx.io:443' '-T' $TOKEN '-c' '10' '-m' 'pods' '-w' '1' -N foo
WARNING. 25 pods ready, 0 pods succeeded, 0 pods not ready
Container foo/bar/grove: 3 restarts.

However, when running against all namespaces, where other restarted pods are found, the grove pod is not displayed:

> './check_kubernetes.sh' '-H' 'https://xxxxxxxxxxx.io:443' '-T' $TOKEN '-c' '10' '-m' 'pods' '-w' '1'
WARNING. 84 pods ready, 7 pods succeeded, 0 pods not ready
Container kube-system/microsoft-defender-collector-ds-jbj8n/microsoft-defender-pod-collector: 1 restarts.
Container mgmt/defectdojo-django-7879d4d8d8-s4qr8/uwsgi: 5 restarts.
Container mgmt/defectdojo-django-7879d4d8d8-s4qr8/uwsgi: 3 restarts.
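
To illustrate the general class of bug, here is a minimal sketch of how a maximum that is not reset between iterations can hide a later, smaller value. It is an assumption-laden toy: the `containers` array, the `report` function and its `stale`/`reset` modes are hypothetical, and it does not attempt to reproduce the real logic of check_kubernetes.sh or the exact output above.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical data and logic, not the code of check_kubernetes.sh.

# Each entry is "namespace/pod/container restart_count" (made-up sample values).
containers=(
  "mgmt/defectdojo/uwsgi 5"
  "foo/bar/grove 3"
)

# report MODE: print containers whose restart count exceeds the running maximum.
# MODE=stale keeps max_restart_count across iterations (the kind of bug described).
# MODE=reset zeroes it before every container (the kind of fix proposed).
report() {
  local mode=$1 max_restart_count=0 entry name restarts
  for entry in "${containers[@]}"; do
    name=${entry% *}        # text before the last space
    restarts=${entry##* }   # text after the last space
    if [ "$mode" = reset ]; then
      max_restart_count=0
    fi
    if [ "$restarts" -gt "$max_restart_count" ]; then
      max_restart_count=$restarts
      echo "Container $name: $restarts restarts."
    fi
  done
}

echo "without reset:"; report stale
echo "with reset:";    report reset
```

In this toy version, the stale run reports only the uwsgi container, because 3 is not greater than the leftover maximum of 5, while the reset run reports both containers.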

With the proposed change, it works:

> './check_kubernetes.sh' '-H' 'https://xxxxxxxxxxxx.io:443' '-T' $TOKEN '-c' '10' '-m' 'pods' '-w' '1'
WARNING. 84 pods ready, 7 pods succeeded, 0 pods not ready
Container kube-system/microsoft-defender-collector-ds-jbj8n/microsoft-defender-pod-collector: 1 restarts.
Container mgmt/defectdojo-django-7879d4d8d8-s4qr8/uwsgi: 5 restarts.
Container foo/bar/grove: 3 restarts.

agapoff commented 6 months ago

Thank you!