kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

preStop hook got interrupted and graceful termination failed #1049

Open FloraZhang opened 3 years ago

FloraZhang commented 3 years ago

Hi masters,

I'm trying out the preStop hook supported by the latest spark-operator, and I noticed that the hook fails if I run a time-consuming task, or simply sleep for a while, before the pod terminates. The chart spec is the same as in the user guide:

spec:
  driver:
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/bash
          - -c
          - touch /var/run/killspark && sleep 65

The detailed error message is below:

[root@wzhan-node1 opt]# kubectl describe pod spark-app-driver
  Warning  FailedPreStopHook  1s         kubelet, rob-ha-local-node6  Exec lifecycle hook ([/bin/bash -c touch /var/run/killspark && sleep 65
' exited with 137: , message: ""

FailedPreStopHook showed up about 30s after the driver container started stopping. Setting terminationGracePeriodSeconds to 60 seconds didn't help either.

Any idea why this is happening?

Thanks, Wenjing

gibchikafa commented 10 months ago

Hi, I am facing the same issue. Did you get a solution?
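
For anyone reading the error above: this is a rough sketch of how the numbers can be read, not a confirmed diagnosis of this issue. By shell convention an exit code above 128 means the process was killed by signal (code - 128), so 137 points at signal 9. Kubernetes counts preStop hook time against the pod's termination grace period, whose default is 30 seconds, which matches the roughly 30s delay reported. The helper names below (signal_from_exit_code, hook_fits) are hypothetical and exist only for illustration; they are not part of spark-operator or Kubernetes.

```python
import signal

# Kubernetes default for terminationGracePeriodSeconds when the pod spec sets none.
DEFAULT_GRACE_PERIOD_SECONDS = 30


def signal_from_exit_code(exit_code):
    """Return the signal name implied by a shell-style exit code, or None.

    Hypothetical helper: exit codes above 128 conventionally mean
    "terminated by signal (exit_code - 128)" on POSIX systems.
    """
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None


def hook_fits(hook_seconds, grace_period_seconds=DEFAULT_GRACE_PERIOD_SECONDS):
    """Return True if a preStop hook of this length finishes inside the grace period.

    Hypothetical helper: a simplification that ignores the time the
    container itself needs to shut down after the hook returns.
    """
    return hook_seconds < grace_period_seconds


if __name__ == "__main__":
    print(signal_from_exit_code(137))  # signal implied by "exited with 137"
    print(hook_fits(65))               # sleep 65 against the 30s default
    print(hook_fits(65, 60))           # sleep 65 against a 60s grace period
    print(hook_fits(65, 90))           # sleep 65 against a 90s grace period
```

Under these assumptions, a 65-second sleep fits neither the 30s default nor a 60s grace period, so the hook would be killed in both cases. Since the failure was seen after about 30s even with 60 configured, it may also be worth checking that the grace period actually landed on the driver pod, for example by inspecting spec.terminationGracePeriodSeconds in the output of kubectl get pod spark-app-driver -o yaml.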

github-actions[bot] commented 5 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.