Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
Situation:
I have an AWS-hosted k8s test cluster up and running; we use it for integration tests for our regular environment. It is, AFAIK, fully functional.
I have built a set of executor, driver, and spark-init images.
I can run my job on minikube where I don't need to mess with serviceAccounts.
In my test cluster, however, I do need to provide credentials or service accounts.
I am using kubectl proxy, and my spark-submit looks like this:
The pod spec that gets generated shows the following:
serviceAccountName: default
serviceAccount: default
In my pod log I get:
2017-09-16 22:32:10 ERROR KubernetesClusterSchedulerBackend:91 - Executor cannot find driver pod.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-pi-1505601113438-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:default:default" cannot get pods in the namespace "default"..
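The denial in that message can be pulled apart mechanically to see exactly which permission is missing. Below is a minimal sketch in Python (not part of the original report) that extracts the service account, verb, resource, and namespace from the message text and formats a matching `kubectl auth can-i` check. The names `parse_forbidden` and `can_i_command` are hypothetical helpers, and the sketch assumes a kubectl recent enough to support `auth can-i` with `--as` impersonation, run by a user who is allowed to impersonate.

```python
import re

# Pattern for the RBAC denial text inside the fabric8 exception message.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) (?P<resource>[\w./-]+) '
    r'in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(message):
    """Return the denied user, verb, resource and namespace, or None if no match."""
    match = FORBIDDEN_RE.search(message)
    return match.groupdict() if match else None


def can_i_command(denial):
    """Format a `kubectl auth can-i` check that impersonates the denied user."""
    return (
        "kubectl auth can-i {verb} {resource} "
        "--as={user} --namespace={namespace}".format(**denial)
    )


# The message text copied from the pod log above.
log_line = (
    "Message: Forbidden!Configured service account doesn't have access. "
    "Service account may have been revoked. "
    'User "system:serviceaccount:default:default" cannot get pods '
    'in the namespace "default"..'
)

denial = parse_forbidden(log_line)
print(denial["user"])
print(can_i_command(denial))
```

Running the printed command against the test cluster should answer `no` for as long as the `default` service account lacks permission to read pods in the `default` namespace, which would match the error in the log.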
Kubernetes version:
I've seen similar issues mentioned here: https://github.com/apache-spark-on-k8s/spark/issues/448, but that looks like it was merged a while back.
You can see from my spark-submit call which version of spark-k8s I'm currently on:
./infra/spark-2.2.0-k8s-0.3.0-bin-2.7.3