kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Configuring driver.podName causes Pod already exists error in `ScheduledSparkApplication` #300

Open spjegan opened 6 years ago

spjegan commented 6 years ago

Configuring driver.podName causes a "Pod already exists" error when the job is triggered a second time.

One way to solve this (per the discussion in the Slack channel): for a scheduled job, treat the configured podName as a prefix and append the trigger to it.

The workaround is to remove driver.podName.

E0928 21:20:27.031886 1 submission_runner.go:93] failed to run spark-submit for SparkApplication gw-agg-1h-1538169624818944620 in namespace insights: Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.20.0.1/api/v1/namespaces/insights/pods. Message: pods "gw-agg-hourly" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=pods, name=gw-agg-hourly, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods "gw-agg-hourly" already exists, metadata=ListMeta(resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:470)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:409)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:379)

liyinan926 commented 6 years ago

One potential solution is to treat driver.podName as a prefix that the ScheduledSparkApplication controller uses to construct the actual driver pod name, adding a suffix that distinguishes each run from the others.
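
The prefix-plus-suffix idea could look roughly like the Go sketch below. This is only an illustration, not the operator's actual code: `driverPodName` is a hypothetical helper, and using a nanosecond run timestamp as the suffix is an assumption, chosen because the SparkApplication name in the log above (`gw-agg-1h-1538169624818944620`) appears to be built the same way.

```go
package main

import "fmt"

// driverPodName is a hypothetical helper, not part of the spark-operator API.
// It treats the configured driver.podName as a prefix and appends a per-run
// suffix so that each scheduled run asks for a distinct driver pod name.
func driverPodName(prefix string, runID int64) string {
	return fmt.Sprintf("%s-%d", prefix, runID)
}

func main() {
	// Assumption: runID is the trigger time of the run in nanoseconds.
	first := driverPodName("gw-agg-hourly", 1538169624818944620)
	second := driverPodName("gw-agg-hourly", 1538173224818944620)

	// The two runs no longer collide on the fixed name "gw-agg-hourly".
	fmt.Println(first)
	fmt.Println(second)
}
```

A real implementation would also have to keep the resulting name within Kubernetes naming rules (length and allowed characters), which this sketch does not check.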

samarthkansal commented 4 years ago

Has anyone been able to resolve this issue? I am facing this too.

ghost commented 2 years ago

We are facing a similar issue now. Is there any progress on this thread, or should we upgrade the kubernetes-client version and use the createOrReplace method for pod creation?

github-actions[bot] commented 4 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.