concourse / concourse-chart

Helm chart to install Concourse

PreStop Hook exited with 137 blocking clean `kubectl delete pod` #81

Open smoke opened 4 years ago

smoke commented 4 years ago

The following command gets stuck for a long time:

smoke@rkirilov-work-pc ~ $ kubectl delete pod -n ci concourse-ci-worker-0 
pod "concourse-ci-worker-0" deleted

When I describe the pod, it is clear that the PreStop hook did not exit cleanly:

smoke@rkirilov-work-pc ~ $ kubectl describe pod -n ci concourse-ci-worker-0 | cat | tail -n 12
Events:
  Type     Reason             Age   From                                  Message
  ----     ------             ----  ----                                  -------
  Normal   Scheduled          79s   default-scheduler                     Successfully assigned ci/concourse-ci-worker-0 to ip-10-200-3-38.ec2.internal
  Normal   Pulled             78s   kubelet, ip-10-200-3-38.ec2.internal  Container image "concourse/concourse:5.8.0" already present on machine
  Normal   Created            78s   kubelet, ip-10-200-3-38.ec2.internal  Created container concourse-ci-worker-init-rm
  Normal   Started            78s   kubelet, ip-10-200-3-38.ec2.internal  Started container concourse-ci-worker-init-rm
  Normal   Pulled             72s   kubelet, ip-10-200-3-38.ec2.internal  Container image "concourse/concourse:5.8.0" already present on machine
  Normal   Created            72s   kubelet, ip-10-200-3-38.ec2.internal  Created container concourse-ci-worker
  Normal   Started            72s   kubelet, ip-10-200-3-38.ec2.internal  Started container concourse-ci-worker
  Normal   Killing            54s   kubelet, ip-10-200-3-38.ec2.internal  Stopping container concourse-ci-worker
  Warning  FailedPreStopHook  11s   kubelet, ip-10-200-3-38.ec2.internal  Exec lifecycle hook ([/bin/bash /pre-stop-hook.sh]) for Container "concourse-ci-worker" in Pod "concourse-ci-worker-0_ci(8688f7aa-6444-11ea-9917-0ad140727ba9)" failed - error: command '/bin/bash /pre-stop-hook.sh' exited with 137: , message: ""

So the only workaround now is to force delete the pod:

smoke@rkirilov-work-pc ~ $ kubectl delete pod --force --grace-period=0 -n ci concourse-ci-worker-0 
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "concourse-ci-worker-0" force deleted

Maybe /pre-stop-hook.sh should be patched to trap the relevant signals (e.g. SIGTERM, SIGINT, SIGHUP) and exit cleanly. I assume that when dumb-init is signaled, it tries on its own to cleanly terminate /pre-stop-hook.sh, and since the script does not terminate cleanly, it gets killed with exit code 137 (128 + 9, i.e. SIGKILL), which then blocks Kubernetes.
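Something along these lines, as a rough sketch only (untested; it assumes the worker runs as PID 1 under dumb-init and keeps the templated shutdown signal from the current hook):

#!/bin/bash
# Sketch: exit 0 when the hook itself is signaled, instead of being
# SIGKILLed (137) once the grace period runs out.
trap 'exit 0' SIGTERM SIGINT SIGHUP

# Same shutdown logic as the current hook: signal the worker (PID 1)
# and wait for it to disappear.
kill -s {{ .Values.concourse.worker.shutdownSignal }} 1
while [ -e /proc/1 ]; do sleep 1; done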

I will give it a try and update the ticket, hopefully with a PR.

Actually, Kubernetes waits for the PreStop hook only for terminationGracePeriodSeconds, then sends SIGTERM to the containers, and SIGKILLs all remaining processes 2 seconds later, as per https://github.com/kubernetes/kubernetes/issues/39170#issuecomment-448195287 and https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
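For reference, these are the two knobs involved, roughly where they sit in the rendered worker pod spec (illustrative snippet; the values shown are examples, not the chart defaults):

apiVersion: v1
kind: Pod
metadata:
  name: concourse-ci-worker-0
spec:
  # Upper bound for the PreStop hook plus SIGTERM handling before SIGKILL.
  terminationGracePeriodSeconds: 60
  containers:
    - name: concourse-ci-worker
      image: concourse/concourse:5.8.0
      lifecycle:
        preStop:
          exec:
            # The hook the kubelet runs on pod deletion (see the events above).
            command: ["/bin/bash", "/pre-stop-hook.sh"]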

However, the strange thing is that the pod is left in the Terminating state for many more minutes and doesn't seem to restart.

So maybe the best course of action would be to use timeout -k {{ .Values.worker.terminationGracePeriodSeconds }} {{ .Values.worker.terminationGracePeriodSeconds }} bash -c 'while [ -e /proc/1 ]; do sleep 1; done' or something similar, I guess. This way at least the delete command will not be blocked.

Also, it is important to increase .Values.worker.terminationGracePeriodSeconds to something that makes sense for your own pipelines.
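For example (release name and namespace taken from the commands above; the value itself is illustrative):

# Give long-running builds up to an hour to drain before the kubelet
# gives up and SIGKILLs the worker.
helm upgrade concourse-ci concourse/concourse \
  --namespace ci \
  --set worker.terminationGracePeriodSeconds=3600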

taylorsilva commented 4 years ago

I tried a quick patch with your suggestion:

diff --git a/templates/worker-prestop-configmap.yaml b/templates/worker-prestop-configmap.yaml
index 9d5dd31..9f43a76 100644
--- a/templates/worker-prestop-configmap.yaml
+++ b/templates/worker-prestop-configmap.yaml
@@ -11,5 +11,5 @@ data:
   pre-stop-hook.sh: |
     #!/bin/bash
     kill -s {{ .Values.concourse.worker.shutdownSignal }} 1
-    while [ -e /proc/1 ]; do sleep 1; done
+    timeout -k {{ .Values.worker.terminationGracePeriodSeconds }} {{ .Values.worker.terminationGracePeriodSeconds }} /bin/bash -c 'while [ -e /proc/1 ]; do sleep 1; done'

The script still exits with a non-zero exit code, 124 in this case (the status GNU timeout returns when the wrapped command times out):

Warning  FailedPreStopHook       1s     kubelet, gke-topgun-topgun-worker-2c49df4e-qwh6  Exec lifecycle hook ([/bin/bash /pre-stop-hook.sh]) for Container "issue81-worker" in Pod "issue81-worker-0_issue81(4ad690c9-d362-48d8-9e5a-c5e873b5571e)" failed - error: command '/bin/bash /pre-stop-hook.sh' exited with 124: , message: ""

Not sure what a good solution for this one is 🤔
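One band-aid that would at least unblock kubectl delete (a sketch, untested, and it hides a genuine failure to drain) is to swallow timeout's status so the hook itself reports success:

# Treat a timeout (exit 124) as success so the kubelet doesn't report
# FailedPreStopHook. Note this masks real drain failures.
timeout -k {{ .Values.worker.terminationGracePeriodSeconds }} {{ .Values.worker.terminationGracePeriodSeconds }} \
  /bin/bash -c 'while [ -e /proc/1 ]; do sleep 1; done' || true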


To reproduce this, I installed the Helm chart with default settings and started this long-running job:

---
jobs:
  - name: simple-job
    plan:
      - task: simple-task
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}
          run:
            path: /bin/sh
            args:
              - -c
              - |
                #!/bin/sh
                sleep 1h

I then deleted the pod:

$ kubectl delete pod -n issue81 issue81-worker-0

and kept describing the pod until I saw the relevant error:

$ k describe pod -n issue81 issue81-worker-0 | tail -n 10

smoke commented 4 years ago

@taylorsilva I confirm your findings, and I don't have a better workaround than increasing the timeout and manually intervening when such things happen :(

skreddy6673 commented 4 years ago

Having the same issue on Concourse v5.7.1.

vineethNaroju commented 1 year ago

Hi, I'm hitting the same error. I attached a pre-stop hook script containing a 10-second sleep and deleted the pod. The pre-stop hook script ran, but I still got a FailedPreStopHook event with the same exit code 137. This is on EKS with Kubernetes 1.25.
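For what it's worth, the same event shape is reproducible outside the chart with a bare pod whose preStop hook outlives the grace period (hypothetical repro sketch, untested on EKS):

apiVersion: v1
kind: Pod
metadata:
  name: prestop-137-repro
spec:
  # Deliberately shorter than the hook's sleep below.
  terminationGracePeriodSeconds: 5
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      lifecycle:
        preStop:
          exec:
            # Outlives the grace period, so the kubelet SIGKILLs it -> 137.
            command: ["sleep", "10"]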