Closed rootsongjc closed 7 years ago
@rootsongjc, can you post the logs from the driver/executor pods?
@foxish There is no pod created. I can't figure out what happened from spark-submit logs.
I believe we might delete the driver pod entirely if we fail to submit the job. But at some point a driver pod should be created, and eventually it will fail. Does a driver pod ever appear if we use this:
kubectl get pods -n <namespace> -w
where the -w flag follows the pods as they are created and terminated?
@mccheah @foxish I found a few error pods by running the command kubectl get pods --namespace spark-cluster; here are the error pod logs.
2017-09-05 08:54:41 INFO SparkContext:54 - Running Spark version 2.1.0-k8s-0.3.1-SNAPSHOT
2017-09-05 08:54:41 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-09-05 08:54:41 INFO SecurityManager:54 - Changing view acls to: root
2017-09-05 08:54:41 INFO SecurityManager:54 - Changing modify acls to: root
2017-09-05 08:54:41 INFO SecurityManager:54 - Changing view acls groups to:
2017-09-05 08:54:41 INFO SecurityManager:54 - Changing modify acls groups to:
2017-09-05 08:54:41 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2017-09-05 08:54:42 INFO Utils:54 - Successfully started service 'sparkDriver' on port 36433.
2017-09-05 08:54:42 INFO SparkEnv:54 - Registering MapOutputTracker
2017-09-05 08:54:42 INFO SparkEnv:54 - Registering BlockManagerMaster
2017-09-05 08:54:42 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2017-09-05 08:54:42 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2017-09-05 08:54:42 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-c13482cf-dde9-4e2b-a185-0bce58575e43
2017-09-05 08:54:42 INFO MemoryStore:54 - MemoryStore started with capacity 629.7 MB
2017-09-05 08:54:42 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2017-09-05 08:54:42 INFO log:186 - Logging initialized @1622ms
2017-09-05 08:54:42 INFO Server:327 - jetty-9.2.z-SNAPSHOT
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@25ddbbbb{/jobs,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@1536602f{/jobs/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@4ebea12c{/jobs/job,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@2a1edad4{/jobs/job/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@6256ac4f{/stages,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@44c79f32{/stages/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@7fcbe147{/stages/stage,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@235f4c10{/stages/stage/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@743cb8e0{/stages/pool,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@c7a975a{/stages/pool/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@2c1b9e4b{/storage,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@757d6814{/storage/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@649725e3{/storage/rdd,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@3c0fae6c{/storage/rdd/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@4c168660{/environment,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@52b56a3e{/environment/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@fd0e5b6{/executors,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@4eed46ee{/executors/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@36b0fcd5{/executors/threadDump,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@4fad94a7{/executors/threadDump/json,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@475835b1{/static,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@6326d182{/,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@5241cf67{/api,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@716a7124{/jobs/job/kill,null,AVAILABLE}
2017-09-05 08:54:42 INFO ContextHandler:744 - Started o.s.j.s.ServletContextHandler@77192705{/stages/stage/kill,null,AVAILABLE}
2017-09-05 08:54:42 INFO ServerConnector:266 - Started ServerConnector@2b58f754{HTTP/1.1}{0.0.0.0:4040}
2017-09-05 08:54:42 INFO Server:379 - Started @1741ms
2017-09-05 08:54:42 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2017-09-05 08:54:42 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://172.30.60.4:4040
2017-09-05 08:54:42 INFO SparkContext:54 - Added JAR /opt/spark/examples/jars/spark-examples_2.11-2.1.0-k8s-0.3.1-SNAPSHOT.jar at spark://172.30.60.4:36433/jars/spark-examples_2.11-2.1.0-k8s-0.3.1-SNAPSHOT.jar with timestamp 1504601682607
2017-09-05 08:54:42 WARN KubernetesClusterManager:66 - The executor's init-container config map was not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2017-09-05 08:54:42 WARN KubernetesClusterManager:66 - The executor's init-container config map key was not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2017-09-05 08:54:43 ERROR KubernetesClusterSchedulerBackend:91 - Executor cannot find driver pod.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/spark-cluster/pods/spark-pi-1504601675797-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:spark-cluster:default" cannot get pods in the namespace "spark-cluster"..
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:332)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:269)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:241)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:234)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:230)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:745)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:194)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:135)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:133)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:90)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2554)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
2017-09-05 08:54:43 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Executor cannot find driver pod
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:139)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:133)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:90)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2554)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/spark-cluster/pods/spark-pi-1504601675797-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:spark-cluster:default" cannot get pods in the namespace "spark-cluster"..
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:332)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:269)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:241)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:234)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:230)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:745)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:194)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:135)
... 11 more
2017-09-05 08:54:43 INFO ServerConnector:306 - Stopped ServerConnector@2b58f754{HTTP/1.1}{0.0.0.0:4040}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@77192705{/stages/stage/kill,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@716a7124{/jobs/job/kill,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@5241cf67{/api,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@6326d182{/,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@475835b1{/static,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@4fad94a7{/executors/threadDump/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@36b0fcd5{/executors/threadDump,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@4eed46ee{/executors/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@fd0e5b6{/executors,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@52b56a3e{/environment/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@4c168660{/environment,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@3c0fae6c{/storage/rdd/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@649725e3{/storage/rdd,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@757d6814{/storage/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@2c1b9e4b{/storage,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@c7a975a{/stages/pool/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@743cb8e0{/stages/pool,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@235f4c10{/stages/stage/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@7fcbe147{/stages/stage,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@44c79f32{/stages/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@6256ac4f{/stages,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@2a1edad4{/jobs/job/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@4ebea12c{/jobs/job,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@1536602f{/jobs/json,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO ContextHandler:865 - Stopped o.s.j.s.ServletContextHandler@25ddbbbb{/jobs,null,UNAVAILABLE}
2017-09-05 08:54:43 INFO SparkUI:54 - Stopped Spark web UI at http://172.30.60.4:4040
2017-09-05 08:54:43 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2017-09-05 08:54:43 INFO MemoryStore:54 - MemoryStore cleared
2017-09-05 08:54:43 INFO BlockManager:54 - BlockManager stopped
2017-09-05 08:54:43 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2017-09-05 08:54:43 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running
2017-09-05 08:54:43 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2017-09-05 08:54:43 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Executor cannot find driver pod
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:139)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:133)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:90)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2554)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/spark-cluster/pods/spark-pi-1504601675797-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:spark-cluster:default" cannot get pods in the namespace "spark-cluster"..
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:332)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:269)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:241)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:234)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:230)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:745)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:194)
at org.apache.spark.scheduler.cluster.kubernetes.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:135)
... 11 more
2017-09-05 08:54:43 INFO ShutdownHookManager:54 - Shutdown hook called
2017-09-05 08:54:43 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-582de7e3-49eb-43d8-818a-f1536a10031f
From this log, we can see two problems:
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/spark-cluster/pods/spark-pi-1504601675797-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. User "system:serviceaccount:spark-cluster:default" cannot get pods in the namespace "spark-cluster"..
Looks like this is a service account permission issue. Kubernetes v1.6 enables RBAC by default, and the default service account does not have the necessary permissions for the driver. The driver needs the "edit" privilege to construct the executor pod spec.
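A minimal sketch of RBAC setup that would grant the driver these permissions. The service account name "spark" follows what appears later in the thread; "edit" is the built-in Kubernetes cluster role, and the namespace matches the one used above:

```shell
# Create a dedicated service account for the driver in the job namespace.
kubectl create serviceaccount spark -n spark-cluster

# Bind the built-in "edit" cluster role to that service account,
# scoped to the namespace via a RoleBinding (a ClusterRoleBinding
# would grant the same rights cluster-wide).
kubectl create rolebinding spark-edit \
  --clusterrole=edit \
  --serviceaccount=spark-cluster:spark \
  -n spark-cluster
```

The driver would then be told to run as this account, e.g. via the service-account override mentioned below (spark.kubernetes.authenticate.driver.serviceAccountName=spark).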
You're right. We should update our docs to include the step to create the right RBAC permissions.
A related fix was made in #451, which supports overriding the service account. The documentation was also updated to show how a new service account can be created and used. But no release includes this fix yet. We may want to cut one soon.
@foxish Should we have new releases? I'll be happy to try out release processes if you need help.
A bugfix release would certainly be a good idea. It would need to be for 2.1 (which our docs still point to) and 2.2. Would also be good to add a couple of statements to the documentation about this.
Which kubectl version are you using? The flag may be available in a newer version.
On Sep 6, 2017 9:03 PM, "Jimmy Song" notifications@github.com wrote:
@kimoonkim https://github.com/kimoonkim On kubernetes 1.6.0
kubectl -n spark-cluster create clusterrolebinding spark-edit --clusterrole=edit --serviceaccount=spark-cluster:spark Error: unknown flag: --clusterrole
No such flag --clusterrole.
kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.5", GitCommit:"894ff23729bbc0055907dd3a496afb725396adda", GitTreeState:"clean", BuildDate:"2017-03-23T16:14:43Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
I deleted the 1.5.5 client and reinstalled kubectl 1.6.0; after that it works.
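For reference, this is the command from the thread that failed with the 1.5.5 client; with a kubectl client at v1.6 or newer, create clusterrolebinding supports the --clusterrole flag. Note that a ClusterRoleBinding is cluster-scoped, so the -n flag from the original attempt is unnecessary:

```shell
# Grant the "spark" service account in the spark-cluster namespace
# the built-in "edit" role across the cluster.
kubectl create clusterrolebinding spark-edit \
  --clusterrole=edit \
  --serviceaccount=spark-cluster:spark
```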
@foxish @kimoonkim I built a new release on my local machine and submitted a new job with it.
2017-09-08 10:02:19 INFO SparkContext:54 - Running Spark version 2.1.0-k8s-0.3.1-SNAPSHOT
2017-09-08 10:02:20 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-09-08 10:02:20 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
2017-09-08 10:02:20 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
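The "A master URL must be set" error suggests spark-submit was invoked without --master. A hedged sketch of a submission against this fork, following its documented k8s:// master-URL scheme; the API server address, registry, and image tags are placeholders, not values from the thread:

```shell
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://<api-server-host>:<port> \
  --conf spark.kubernetes.namespace=spark-cluster \
  --conf spark.kubernetes.driver.docker.image=<registry>/spark-driver:<tag> \
  --conf spark.kubernetes.executor.docker.image=<registry>/spark-executor:<tag> \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.1.0-k8s-0.3.1-SNAPSHOT.jar
```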
But I didn't build new images. Is there a doc about how to build the new Docker images?
Only this https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html#driver--executor-images. We are working on making the building of images and pushing them easier. Tracked in https://github.com/apache-spark-on-k8s/spark/issues/485.
@foxish Every time I submit a Spark job to Kubernetes I have to build a new Docker image with the jar file inside. That's cumbersome; we should find an easier way to simplify the process.
@rootsongjc, have you tried using the resource staging server?
@foxish No, I haven't used it.
@foxish This problem is solved; I think we can close the issue.
I pulled the docker images and pushed them to my registry.