apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Kubernetes resources created for an application should be deleted when the application finishes #519

Open liyinan926 opened 7 years ago

liyinan926 commented 7 years ago

#483 started creating a headless service for the driver pod. However, this service stays around after an application finishes and must be deleted manually. I think the submission client should instead be responsible for deleting the service automatically when spark.kubernetes.submission.waitAppCompletion is true. This also applies to other Kubernetes resources, such as the secret for small files shipped via spark.files.

For example, running kubectl get services gave the following output, even though the application pubsub-wordcount had finished a day earlier.

NAME                                        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
kubernetes                                  10.0.0.1     <none>        443/TCP             1d
pubsub-wordcount-1507066935952-driver-svc   None         <none>        7078/TCP,7079/TCP   1d
mccheah commented 7 years ago

The service should have an owner reference that ties back to the driver pod. So when the driver pod is deleted from the cluster, the service should go down with it. Can you post the result of kubectl get service <driver-service-name> -n <driver-namespace>?

foxish commented 7 years ago

Although the owner-ref and GC will kick in when the pod is deleted, we can actually delete the service earlier if the submission client is waiting and sees that the driver pod has completed.
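The rule foxish describes can be sketched as a small decision function. This is a hedged illustration with hypothetical names, not the actual submission-client code; it only assumes the config flag mentioned above and the standard terminal pod phases:

```python
def should_delete_service(wait_app_completion: bool, driver_phase: str) -> bool:
    """Decide whether the submission client should delete the headless
    service itself, rather than waiting for owner-reference GC.

    wait_app_completion mirrors spark.kubernetes.submission.waitAppCompletion;
    driver_phase is the driver pod's status.phase as reported by the API server.
    (Hypothetical helper for illustration only.)
    """
    # The client can only observe completion if it is still watching the pod.
    if not wait_app_completion:
        return False
    # "Succeeded" and "Failed" are the terminal pod phases in Kubernetes.
    return driver_phase in ("Succeeded", "Failed")
```

In the fire-and-forget case the function returns False for every phase, which is exactly the gap discussed later in this thread: nothing is left running to perform the cleanup.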

liyinan926 commented 7 years ago

Running kubectl get service pubsub-wordcount-1507227999031-driver-svc -o=yaml:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2017-10-05T18:26:40Z
  name: pubsub-wordcount-1507227999031-driver-svc
  namespace: default
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Pod
    name: pubsub-wordcount-1507227999031-driver
    uid: baed290a-a9fa-11e7-b535-08002703730e
  resourceVersion: "175448"
  selfLink: /api/v1/namespaces/default/services/pubsub-wordcount-1507227999031-driver-svc
  uid: bb0165fd-a9fa-11e7-b535-08002703730e
spec:
...

I do see the ownerReference, though. The driver finished successfully, leaving the driver pod in a Completed status. But this won't cause the GC to kick in and delete the service, if I understand how GC works correctly.
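The distinction here can be made concrete with a toy model of owner-reference garbage collection. This is a simulation for illustration, not the real controller logic: a pod entering the Completed phase only changes its status, while cascade deletion of dependents happens only when the owner object is actually removed from the API server:

```python
# Minimal model of Kubernetes owner-reference GC, to show why a
# Completed driver pod does not trigger cleanup of its service.
class FakeCluster:
    def __init__(self):
        self.objects = {}  # object name -> set of owner names

    def create(self, name, owners=()):
        self.objects[name] = set(owners)

    def delete(self, name):
        """Delete an object and cascade to its dependents, like the GC."""
        self.objects.pop(name, None)
        dependents = [n for n, owners in self.objects.items() if name in owners]
        for n in dependents:
            self.delete(n)

cluster = FakeCluster()
cluster.create("driver-pod")
cluster.create("driver-svc", owners=["driver-pod"])

# The driver finishing changes only the pod's status.phase; the pod
# object still exists, so the dependent service survives.
assert "driver-svc" in cluster.objects

# Only an actual delete of the owner cascades to the service.
cluster.delete("driver-pod")
assert "driver-svc" not in cluster.objects
```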

mccheah commented 7 years ago

The driver pod has to be deleted in order for the dependent objects to be destroyed. Right now we don't delete the pod once the application finishes. It's useful to keep the pod around to allow users to debug and collect logs before completely destroying it, but maybe we should make it an option to have the driver attempt to delete itself upon completion.

liyinan926 commented 7 years ago

Yes, it's definitely useful for the driver pod to stay around. Or we could have the submission client delete all Kubernetes resources it created for the application upon application completion. This guarantees that those resources get cleaned up regardless of whether the driver pod is deleted or not.

mccheah commented 7 years ago

The submission client can have "fire and forget" semantics, though, which means the submission client doesn't have to remain running after the driver starts running. In that case only the driver pod can be responsible for managing its Kubernetes resources.

liyinan926 commented 7 years ago

Then the best we can do is probably make it an option for the driver pod to delete itself upon stopping. The default grace period of 30 seconds should be enough for the driver to clean up and terminate.

foxish commented 7 years ago

I think there is always value in keeping the driver pod around for logs and having its lifetime be controlled by a user. I'd rather we have a "not-fire-and-forget" mode in which the submission client cleans up everything except the driver. I thought spark.kubernetes.submission.waitAppCompletion is that flag, which can indicate that we want to do cleanup (of everything except the pod) after the job completes.

liyinan926 commented 7 years ago

Yes, for non-fire-and-forget, we are covered if the submission client deletes the resources. The problem is the fire-and-forget case: users would likely be surprised to discover resource leakage while the driver pod is still around, and some resources, such as the secret for small files and the headless service, are supposed to be internal.

liyinan926 commented 7 years ago

@foxish made a good point: the driver would need the right RBAC roles to be able to delete the resources. It would make much more sense for the submission client to clean things up, as it already has the permissions to do so.
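For a sense of what the driver-side alternative would require, a namespaced Role granting just the delete permissions might look roughly like the following. This is a sketch, not a recommended manifest; the role name is hypothetical, and the resource list would have to match whatever the submission client actually creates:

```yaml
# Hypothetical Role allowing a driver pod to delete the resources
# created for its application (headless service, small-files secret).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-driver-cleanup   # hypothetical name
  namespace: default
rules:
- apiGroups: [""]              # core API group
  resources: ["services", "secrets"]
  verbs: ["delete"]
```

The service account attached to the driver pod would need a RoleBinding to this Role, which is extra setup the submission-client approach avoids entirely.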