apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

spark-submit ignoring: spark.kubernetes.authenticate.driver.serviceAccountName #633

Closed: purpletech77 closed this issue 4 years ago

purpletech77 commented 6 years ago
~/spark-2.3.0-bin-hadoop2.7 # bin/spark-submit \
>     --master k8s://kubecluster:443 \
>     --deploy-mode cluster \
>     --name spark-pi \
>     --conf spark.kubernetes.authenticate.driver.serviceAccountName=default:spark \
>     --class org.apache.spark.examples.SparkPi \
>     --conf spark.executor.instances=5 \
>     --conf spark.kubernetes.driver.container.image=registry/spark-driver:latest \
>     --conf spark.kubernetes.executor.container.image=registry/spark-executor:latest \
>     local:///home/user/spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
2018-09-08 09:09:06 WARN  WatchConnectionManager:185 - Exec Failure: HTTP 403, Status: 403 - pods "spark-pi-c75ce20fff8539b3969926697eb6a78c-driver" is forbidden: User "system:anonymous" cannot watch pods in the namespace "default"
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
    at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: pods "spark-pi-c75ce20fff8539b3969926697eb6a78c-driver" is forbidden: User "system:anonymous" cannot watch pods in the namespace "default"
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
    at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
2018-09-08 09:09:06 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-09-08 09:09:06 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-fa5576a8-d883-4371-98de-69c2640991e5
rvesse commented 5 years ago

FYI - Spark on K8S has been merged upstream and is now being maintained as part of Apache Spark, so issues should be reported on https://issues.apache.org/jira/


This is expected behaviour, though I don't believe it is well documented.

The service account is only used for the driver and executor pods. However, the submission client, i.e. the local process where you run spark-submit, uses your own K8S config to monitor the ongoing progress of the driver, and therefore needs sufficient permissions to do this.
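
For context, the service account named by spark.kubernetes.authenticate.driver.serviceAccountName must exist and be allowed to manage pods. A minimal sketch of setting one up, assuming the default namespace and following the RBAC example from the upstream Spark on Kubernetes docs (the names spark and spark-role are illustrative):

    # Create the service account the driver pod will run as
    kubectl create serviceaccount spark --namespace=default
    # Bind it to the built-in "edit" cluster role, which includes
    # permission to create, list and watch pods in the namespace
    kubectl create clusterrolebinding spark-role --clusterrole=edit \
        --serviceaccount=default:spark --namespace=default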

waynegj commented 5 years ago

@purpletech77, how did you get this issue solved in the end? I ran into the same problem and have no clue.

rvesse commented 5 years ago

@waynegj As I tried to explain, there is some monitoring of the ongoing progress of the driver pod that happens on the submission client, i.e. the place where you run spark-submit. This uses your personal K8S config (typically ~/.kube/config or the file specified by the KUBECONFIG environment variable), so if the configured context there doesn't have the correct permissions, the job monitoring will fail.
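
A quick way to see which context and credentials the submission client will pick up, assuming kubectl reads the same kubeconfig:

    # Show the context the client will use by default
    kubectl config current-context
    # Show only the config for that context, including its user entry
    kubectl config view --minify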

So the solution is to ensure that you have appropriate credentials in your local K8S config to be able to launch and monitor pods. How you get these credentials is a detail of your specific K8S cluster.
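
It can be worth verifying those permissions before re-running spark-submit; kubectl can answer this directly (assuming the default namespace from the failing log above):

    # Check whether the current kubeconfig context may create and watch pods
    kubectl auth can-i create pods --namespace=default
    kubectl auth can-i watch pods --namespace=default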

waynegj commented 5 years ago

@rvesse, thanks for sharing. After some digging I found this might be related to support for aws-iam-authenticator in io.fabric8.kubernetes-client (as addressed in https://github.com/fabric8io/kubernetes-client/pull/1224). The same error occurred on both Spark 2.3.1 and 2.3.2 even with everything configured correctly. Can you shed some light on how to determine which fabric8 kubernetes-client is being used by spark-submit?

rvesse commented 5 years ago

@waynegj Well, that PR is very new, and neither Spark 2.3.1 nor 2.3.2 has a version of the Fabric8 client that is remotely new enough to incorporate that change.
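
One way to check which fabric8 kubernetes-client a given Spark distribution bundles, assuming a standard binary distribution laid out like the one at the top of this issue, is to look at the jar names under jars/:

    # The client version is encoded in the jar file name
    ls ~/spark-2.3.0-bin-hadoop2.7/jars | grep kubernetes-client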

jeremyjjbrown commented 4 years ago

This whole spark-submit Kubernetes integration is horribly busted. Ten thousand lines of code to deploy a single driver container, and it can't even log what went wrong.

mccheah commented 4 years ago

This repository is no longer used for tracking issues related to running Spark on Kubernetes. Please use the official Apache Spark JIRA project to report issues. Also, this project isn't the means to use this feature anymore - the official Spark releases from upstream are the way to do it.

Please move discussions to the official Apache channels. Thanks!