apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Running the SparkPi example fails with "Executor cannot find driver pod" #565

Open · sugangsky opened this issue 6 years ago

sugangsky commented 6 years ago

This is my script:

bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://192.168.145.101:6443 \
  --kubernetes-namespace default \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.resourceStagingServer.uri=http://192.168.3.82:10000 \
  ./examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
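The `serviceAccountName` in the command above assumes a `spark` service account with enough permissions to create and watch pods already exists in the `default` namespace. If it does not, one way to set it up (a sketch along the lines of the project's RBAC docs; the `spark-role` binding name is arbitrary) is:

```shell
# Create the service account the submission refers to, and grant it
# broad edit rights via a cluster role binding (sufficient for the
# driver to create executor pods; a narrower Role also works).
kubectl create serviceaccount spark --namespace=default
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=default:spark
```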

The spark-pi driver pod log:

++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ /sbin/tini -s -- /bin/sh -c 'SPARK_CLASSPATH="${SPARK_HOME}/jars/*" &&     env | grep SPARK_JAVA_OPT_ | sed '\''s/[^=]*=\(.*\)/\1/g'\'' > /tmp/java_opts.txt &&     readarray -t SPARK_DRIVER_JAVA_OPTS < /tmp/java_opts.txt &&     if ! [ -z ${SPARK_MOUNTED_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_MOUNTED_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_SUBMIT_EXTRA_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_SUBMIT_EXTRA_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_EXTRA_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_EXTRA_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_MOUNTED_FILES_DIR+x} ]; then cp -R "$SPARK_MOUNTED_FILES_DIR/." .; fi &&     if ! [ -z ${SPARK_MOUNTED_FILES_FROM_SECRET_DIR} ]; then cp -R "$SPARK_MOUNTED_FILES_FROM_SECRET_DIR/." .; fi &&     ${JAVA_HOME}/bin/java "${SPARK_DRIVER_JAVA_OPTS[@]}" -cp $SPARK_CLASSPATH -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS'
2017-11-29 07:09:30 INFO  SparkContext:54 - Running Spark version 2.2.0-k8s-0.5.0
2017-11-29 07:09:31 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-11-29 07:09:31 WARN  SparkConf:66 - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
2017-11-29 07:09:31 INFO  SparkContext:54 - Submitted application: Spark Pi
2017-11-29 07:09:31 INFO  SecurityManager:54 - Changing view acls to: root
2017-11-29 07:09:31 INFO  SecurityManager:54 - Changing modify acls to: root
2017-11-29 07:09:31 INFO  SecurityManager:54 - Changing view acls groups to: 
2017-11-29 07:09:31 INFO  SecurityManager:54 - Changing modify acls groups to: 
2017-11-29 07:09:31 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2017-11-29 07:09:32 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 7078.
2017-11-29 07:09:32 INFO  SparkEnv:54 - Registering MapOutputTracker
2017-11-29 07:09:32 INFO  SparkEnv:54 - Registering BlockManagerMaster
2017-11-29 07:09:32 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2017-11-29 07:09:32 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2017-11-29 07:09:32 INFO  DiskBlockManager:54 - Created local directory at /mnt/tmp/spark-local/spark-965a6bdf-bdfc-403f-8b73-cd4db3153f1e/blockmgr-7348dbf4-b726-4ddb-8cac-a0349a397f31
2017-11-29 07:09:32 INFO  MemoryStore:54 - MemoryStore started with capacity 413.9 MB
2017-11-29 07:09:32 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2017-11-29 07:09:32 INFO  log:192 - Logging initialized @2762ms
2017-11-29 07:09:32 INFO  Server:345 - jetty-9.3.z-SNAPSHOT
2017-11-29 07:09:32 INFO  Server:403 - Started @2883ms
2017-11-29 07:09:32 INFO  AbstractConnector:270 - Started ServerConnector@533b266e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2017-11-29 07:09:32 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f9879ac{/jobs,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3a4e343{/jobs/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62dae245{/jobs/job,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6c6357f9{/jobs/job/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3954d008{/stages,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@593e824f{/stages/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6d8792db{/stages/stage,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@ce5a68e{/stages/stage/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2f162cc0{/stages/pool,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7c041b41{/stages/pool/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@61078690{/storage,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@403132fc{/storage/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2cab9998{/storage/rdd,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@669513d8{/storage/rdd/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4a8a60bc{/environment,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7859e786{/environment/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@314b8f2d{/executors,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5118388b{/executors/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7876d598{/executors/threadDump,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5af28b27{/executors/threadDump/json,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4985cbcb{/static,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33617539{/,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5db4c359{/api,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4f8969b0{/jobs/job/kill,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@192f2f27{/stages/stage/kill,null,AVAILABLE,@Spark}
2017-11-29 07:09:32 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://spark-pi-1511939352729-driver-svc.default.svc.cluster.local:4040
2017-11-29 07:09:32 INFO  SparkContext:54 - Added JAR /var/spark-data/spark-jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar at spark://spark-pi-1511939352729-driver-svc.default.svc.cluster.local:7078/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar with timestamp 1511939372933
2017-11-29 07:09:43 ERROR KubernetesClusterSchedulerBackend:91 - Executor cannot find driver pod.
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [Pod]  with name: [spark-pi-1511939352729-driver]  in namespace: [default]  failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.liftedTree1$1(KubernetesClusterSchedulerBackend.scala:74)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:72)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:142)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2764)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at okhttp3.internal.platform.Platform.connectSocket(Platform.java:124)
    at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:223)
    at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:149)
    at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
    at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
    at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
    at okhttp3.RealCall.execute(RealCall.java:69)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)
    ... 13 more
2017-11-29 07:09:43 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Executor cannot find driver pod
    (stack trace identical to the KubernetesClientException / SocketTimeoutException trace above)
2017-11-29 07:09:43 INFO  AbstractConnector:310 - Stopped Spark@533b266e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2017-11-29 07:09:44 INFO  SparkUI:54 - Stopped Spark web UI at http://spark-pi-1511939352729-driver-svc.default.svc.cluster.local:4040
2017-11-29 07:09:44 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2017-11-29 07:09:44 INFO  MemoryStore:54 - MemoryStore cleared
2017-11-29 07:09:44 INFO  BlockManager:54 - BlockManager stopped
2017-11-29 07:09:44 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2017-11-29 07:09:44 WARN  MetricsSystem:66 - Stopping a MetricsSystem that is not running
2017-11-29 07:09:44 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2017-11-29 07:09:44 INFO  SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Executor cannot find driver pod
    (stack trace identical to the KubernetesClientException / SocketTimeoutException trace above)
2017-11-29 07:09:44 INFO  ShutdownHookManager:54 - Shutdown hook called
2017-11-29 07:09:44 INFO  ShutdownHookManager:54 - Deleting directory /mnt/tmp/spark-local/spark-965a6bdf-bdfc-403f-8b73-cd4db3153f1e/spark-a69d177c-066b-42bf-bb20-7fa0a2012845

My kubectl version:

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T05:28:34Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T05:17:43Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

How can I fix this error?
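(The root cause above is a `SocketTimeoutException` while the driver pod does a GET for its own pod object. A quick way to check whether pods in the cluster can reach the API server at all is to query it from a throwaway pod; `api-check` below is just a placeholder name, and the service account matches the submit command.)

```shell
# From inside a temporary pod running under the spark service account,
# call the in-cluster API server endpoint; a timeout here points at a
# cluster networking problem rather than at Spark itself.
kubectl run api-check --rm -it --restart=Never \
  --serviceaccount=spark --image=alpine -- \
  sh -c 'apk add --no-cache curl >/dev/null && \
    TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token) && \
    curl -sk --max-time 5 -H "Authorization: Bearer $TOKEN" \
      https://kubernetes.default.svc/api/v1/namespaces/default/pods'
```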

sugangsky commented 6 years ago

My Apache Spark on K8s version is spark-2.2.0-k8s-0.5.0-bin-with-hadoop-2.7.3.tgz.

liyinan926 commented 6 years ago

First of all, I assume you have a custom service account named spark in the default namespace of your cluster, since you specified that in the command. Are you using minikube? Did you see something like the following on the submission client side?

No scheme specified for kubernetes master URL, so defaulting to https. ...
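If that warning shows up, making the scheme explicit in the master URL rules out the defaulting behavior; a sketch of the relevant change (other --conf flags as in the original command):

```shell
# Explicit https scheme and secure port in the Kubernetes master URL.
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.145.101:6443 \
  --kubernetes-namespace default \
  ./examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
```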

sugangsky commented 6 years ago

No, I am using kubeadm.