severo closed 2 years ago
cc @XciD
NAME                                                    READY   STATUS        RESTARTS      AGE
datasets-server-prod-admin-79798989fb-scmjw             1/1     Running       0             141m
datasets-server-prod-api-6f4477cc64-2tzn6               1/1     Running       0             141m
datasets-server-prod-api-6f4477cc64-6pjnq               1/1     Running       0             140m
datasets-server-prod-api-6f4477cc64-97gsc               1/1     Running       0             141m
datasets-server-prod-api-6f4477cc64-db6m8               1/1     Running       0             140m
datasets-server-prod-datasets-worker-776b774978-54zr6   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-7hw4j   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-cdb4b   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-cgtw2   1/1     Running       1 (20m ago)   97m
datasets-server-prod-datasets-worker-776b774978-cmth8   1/1     Running       0             97m
datasets-server-prod-datasets-worker-776b774978-d8m42   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-g7mpk   0/1     Error         0             97m
datasets-server-prod-datasets-worker-776b774978-m5dqs   1/1     Running       0             23m
datasets-server-prod-datasets-worker-776b774978-q29z6   1/1     Running       0             97m
datasets-server-prod-datasets-worker-776b774978-qtmtd   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-rxcb2   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-x7xzb   0/1     OutOfmemory   0             23m
datasets-server-prod-datasets-worker-776b774978-xx7hv   0/1     OutOfmemory   0             23m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
datasets-server-prod-admin             1/1     1            1           30d
datasets-server-prod-api               4/4     4            4           31d
datasets-server-prod-datasets-worker   4/4     4            4           31d
datasets-server-prod-reverse-proxy     2/2     2            2           31d
datasets-server-prod-splits-worker     56/56   56           56          31d
Some nodes reached a pressure condition (memory or disk). When this happens, Kubernetes evicts some pods to lower the pressure.
OK, thanks. Is it normal that the pods marked as OutOfmemory (and Error) were still in the list? Is that so we know they crashed, instead of silently hiding them? I had to terminate them using your magic command:
kubectl get pod | grep OutOfmemory | cut -d ' ' -f 1 | xargs -I % kubectl delete pod/% --force
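For reference, here is a slightly stricter variant of that one-liner, as a sketch only. It assumes the default `kubectl get pod` column order (NAME, READY, STATUS, ...) and matches on the STATUS column instead of grepping the whole line. `failed_pod_names` is a hypothetical helper name, not something kubectl provides.

```shell
# Hypothetical helper (not part of kubectl): reads `kubectl get pod` output
# on stdin and prints the NAME of every pod whose STATUS (third column)
# equals one of the statuses passed as arguments.
failed_pod_names() {
  statuses="$*"
  awk -v statuses="$statuses" '
    BEGIN {
      # Build a lookup table from the space-separated status list.
      n = split(statuses, s, " ")
      for (i = 1; i <= n; i++) want[s[i]] = 1
    }
    # Skip the header row, then keep rows whose STATUS is wanted.
    NR > 1 && ($3 in want) { print $1 }
  '
}

# Usage against a live cluster (list first, then delete):
#   kubectl get pod | failed_pod_names OutOfmemory Error
#   kubectl get pod | failed_pod_names OutOfmemory Error \
#     | xargs -I % kubectl delete pod/%
```

Listing the names before piping them into `kubectl delete` gives a chance to check what will be removed; `--force` can be appended as in the original command if the pods do not go away.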
Yes, I think it's for you to know that you had an issue.
OK, nice.
By the way, about "Evicted":
When a node reaches its disk or memory limit, a flag is set on the Kubernetes node to indicate that it is under pressure. This flag also blocks new pods from being scheduled on the node, and an eviction process is then started to free some resources.
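The flag described above shows up as a node condition (`MemoryPressure` or `DiskPressure`), so it can be checked directly. A minimal sketch, assuming you have read access to the nodes; `nodes_under_pressure` and `NODE_CONDITIONS` are hypothetical names introduced here for illustration:

```shell
# Hypothetical helper: reads lines of the form
#   <node-name><TAB>Type=Status Type=Status ...
# and prints the nodes that report memory or disk pressure.
nodes_under_pressure() {
  awk -F '\t' '$2 ~ /(^| )(MemoryPressure|DiskPressure)=True( |$)/ { print $1 }'
}

# jsonpath template producing one line per node, with each condition
# rendered as Type=Status.
NODE_CONDITIONS='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}'

# Usage against a live cluster:
#   kubectl get nodes -o jsonpath="$NODE_CONDITIONS" | nodes_under_pressure
```

An empty result means no node currently reports pressure. It says nothing about past evictions, which is why the evicted pods stay listed until they are deleted.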