apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Stuck on Pending: Waiting for application spark-pi to finish.. #517

Closed AnthonyWC closed 7 years ago

AnthonyWC commented 7 years ago

I am following the Spark example (https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html#dependency-management) and running on the latest minikube (v0.22.2). The driver pod is stuck on Pending:

Any idea what I am doing wrong?
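For reference, the submission looked roughly like this (reconstructed from the driver pod's `SPARK_JAVA_OPT_*` environment shown in the `kubectl describe` output below, with flag names as in the linked user docs; treat it as a sketch, not the exact command I ran):

```shell
# Reconstructed from the driver pod's environment; flag names follow the
# apache-spark-on-k8s user docs for v2.2.0-kubernetes-0.4.0.
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.99.100:8443 \
  --conf spark.app.name=spark-pi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.4.0 \
  --conf spark.kubernetes.resourceStagingServer.uri=http://192.168.99.100:31000 \
  examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar
```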

2017-10-01 16:20:32 WARN  Utils:66 - Your hostname, noctil resolves to a loopback address: 127.0.1.1; using 192.168.0.20 instead (on interface wlp59s0)
2017-10-01 16:20:32 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2017-10-01 16:20:32 INFO  LoggingPodStatusWatcherImpl:54 - State changed, new state: 
         pod name: spark-pi-1506889231581-driver
         namespace: default
         labels: spark-app-selector -> spark-e323b8b4581643f7ba16fbd558303894, spark-role -> driver
         pod uid: f9e3f1a3-a6e5-11e7-90c8-080027922e58
         creation time: 2017-10-01T20:20:32Z
         service account name: default
         volumes: default-token-30nd7
         node name: N/A
         start time: N/A
         container images: N/A
         phase: Pending
         status: []
2017-10-01 16:20:32 INFO  LoggingPodStatusWatcherImpl:54 - State changed, new state: 
         pod name: spark-pi-1506889231581-driver
         namespace: default
         labels: spark-app-selector -> spark-e323b8b4581643f7ba16fbd558303894, spark-role -> driver
         pod uid: f9e3f1a3-a6e5-11e7-90c8-080027922e58
         creation time: 2017-10-01T20:20:32Z
         service account name: default
         volumes: default-token-30nd7
         node name: N/A
         start time: N/A
         container images: N/A
         phase: Pending
         status: []
2017-10-01 16:20:32 INFO  Client:54 - Waiting for application spark-pi to finish...

kubectl get pods

NAME                                             READY     STATUS    RESTARTS   AGE
spark-pi-1506889231581-driver                    0/1       Pending   0          5m
spark-resource-staging-server-2126715582-wr52p   1/1       Running   0          7m

kubectl describe pods

Namespace:      default
Node:           <none>
Labels:         spark-app-selector=spark-e323b8b4581643f7ba16fbd558303894
                spark-role=driver
Annotations:    spark-app-name=spark-pi
Status:         Pending
IP:
Containers:
  spark-kubernetes-driver:
    Image:      kubespark/spark-driver:v2.2.0-kubernetes-0.4.0
    Port:       <none>
    Limits:
      memory:   1408Mi
    Requests:
      cpu:      1
      memory:   1Gi
    Environment:
      SPARK_DRIVER_MEMORY:      1g
      SPARK_DRIVER_CLASS:       org.apache.spark.examples.SparkPi
      SPARK_DRIVER_ARGS:
      SPARK_MOUNTED_CLASSPATH:  /usr/local/spark-2.2.0-k8s-0.4.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar
      SPARK_JAVA_OPT_0:         -Dspark.kubernetes.namespace=default
      SPARK_JAVA_OPT_1:         -Dspark.jars=/usr/local/spark-2.2.0-k8s-0.4.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar
      SPARK_JAVA_OPT_2:         -Dspark.app.name=spark-pi
      SPARK_JAVA_OPT_3:         -Dspark.submit.deployMode=cluster
      SPARK_JAVA_OPT_4:         -Dspark.driver.blockManager.port=7079
      SPARK_JAVA_OPT_5:         -Dspark.driver.bindAddress=spark-pi-1506889231581-driver-svc.default.svc.cluster.local
      SPARK_JAVA_OPT_6:         -Dspark.driver.host=spark-pi-1506889231581-driver-svc.default.svc.cluster.local
      SPARK_JAVA_OPT_7:         -Dspark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.4.0
      SPARK_JAVA_OPT_8:         -Dspark.app.id=spark-e323b8b4581643f7ba16fbd558303894
      SPARK_JAVA_OPT_9:         -Dspark.driver.port=7078
      SPARK_JAVA_OPT_10:        -Dspark.kubernetes.executor.podNamePrefix=spark-pi-1506889231581
      SPARK_JAVA_OPT_11:        -Dspark.master=k8s://https://192.168.99.100:8443
      SPARK_JAVA_OPT_12:        -Dspark.kubernetes.driver.pod.name=spark-pi-1506889231581-driver
      SPARK_JAVA_OPT_13:        -Dspark.kubernetes.resourceStagingServer.uri=http://192.168.99.100:31000
      SPARK_JAVA_OPT_14:        -Dspark.executor.instances=2
      SPARK_JAVA_OPT_15:        -Dspark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.4.0
      SPARK_JAVA_OPT_16:        -Dspark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.4.0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-30nd7 (ro)
Conditions:
  Type          Status
  PodScheduled  False
Volumes:
  default-token-30nd7:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-30nd7
    Optional:   false
QoS Class:      Burstable
Node-Selectors: <none>
Tolerations:    <none>
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                    -------------   --------        ------                  -------
  1m            51s             8       default-scheduler                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient memory (1).

Name:           spark-resource-staging-server-2126715582-wr52p
Namespace:      default
Node:           minikube/192.168.99.100
Start Time:     Sun, 01 Oct 2017 16:18:51 -0400
Labels:         pod-template-hash=2126715582
                resource-staging-server-instance=default
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"spark-resource-staging-server-2126715582","uid":"bd489165-a6e5-1...
Status:         Running
IP:             172.17.0.5
Created By:     ReplicaSet/spark-resource-staging-server-2126715582
Controlled By:  ReplicaSet/spark-resource-staging-server-2126715582
Containers:
  spark-resource-staging-server:
    Container ID:       docker://6a8c2c5c03593110c55cfeea8d083613589af2b13164c033dd1a85e52f119e50
    Image:              kubespark/spark-resource-staging-server:v2.2.0-kubernetes-0.4.0
    Image ID:           docker-pullable://kubespark/spark-resource-staging-server@sha256:f4a87ee64e782ce476cf0c0fcd317bf8068d4ede4241b782a10075a88f946685
    Port:               <none>
    Args:
      /etc/spark-resource-staging-server/resource-staging-server.properties
    State:              Running
      Started:          Sun, 01 Oct 2017 16:18:51 -0400
    Ready:              True
    Restart Count:      0
    Limits:
      cpu:      100m
      memory:   1Gi
    Requests:
      cpu:              100m
      memory:           1Gi
    Environment:        <none>
    Mounts:
      /etc/spark-resource-staging-server from resource-staging-server-properties (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-30nd7 (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  resource-staging-server-properties:
    Type:       ConfigMap (a volume populated by a ConfigMap)
    Name:       spark-resource-staging-server-config
    Optional:   false
  default-token-30nd7:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-30nd7
    Optional:   false
QoS Class:      Guaranteed
Node-Selectors: <none>
Tolerations:    <none>
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath                                   Type            Reason                   Message
  ---------     --------        -----   ----                    -------------                                   --------        ------                   -------
  3m            3m              1       default-scheduler                                                       Normal          Scheduled                Successfully assigned spark-resource-staging-server-2126715582-wr52p to minikube
  3m            3m              1       kubelet, minikube                                                       Normal          SuccessfulMountVolume    MountVolume.SetUp succeeded for volume "resource-staging-server-properties"
  3m            3m              1       kubelet, minikube                                                       Normal          SuccessfulMountVolume    MountVolume.SetUp succeeded for volume "default-token-30nd7"
  3m            3m              1       kubelet, minikube       spec.containers{spark-resource-staging-server}  Normal          Pulled                   Container image "kubespark/spark-resource-staging-server:v2.2.0-kubernetes-0.4.0" already present on machine
  3m            3m              1       kubelet, minikube       spec.containers{spark-resource-staging-server}  Normal          Created                  Created container
  3m            3m              1       kubelet, minikube       spec.containers{spark-resource-staging-server}  Normal          Started                  Started container
AnthonyWC commented 7 years ago

Getting Error: Could not find or load main class org.apache.spark.examples.SparkPi:

kubectl logs -f spark-pi-1506901697460-driver

++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ /sbin/tini -s -- /bin/sh -c 'SPARK_CLASSPATH="${SPARK_HOME}/jars/*" &&     env | grep SPARK_JAVA_OPT_ | sed '\''s/[^=]*=\(.*\)/\1/g'\'' > /tmp/java_opts.txt &&     readarray -t SPARK_DRIVER_JAVA_OPTS < /tmp/java_opts.txt &&     if ! [ -z ${SPARK_MOUNTED_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_MOUNTED_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_SUBMIT_EXTRA_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_SUBMIT_EXTRA_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_EXTRA_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_EXTRA_CLASSPATH:$SPARK_CLASSPATH"; fi &&     if ! [ -z ${SPARK_MOUNTED_FILES_DIR+x} ]; then cp -R "$SPARK_MOUNTED_FILES_DIR/." .; fi &&     if ! [ -z ${SPARK_MOUNTED_FILES_FROM_SECRET_DIR} ]; then cp -R "$SPARK_MOUNTED_FILES_FROM_SECRET_DIR/." .; fi &&     ${JAVA_HOME}/bin/java "${SPARK_DRIVER_JAVA_OPTS[@]}" -cp $SPARK_CLASSPATH -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS'
Error: Could not find or load main class org.apache.spark.examples.SparkPi
AnthonyWC commented 7 years ago

Had the wrong file path for the jar: I thought it was a local path, but it was actually the path to the jar inside the Docker image.
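For anyone hitting the same "Could not find or load main class" error: when the jar is already baked into the driver/executor images, the user docs use the local:// scheme so spark-submit references the in-image path instead of uploading a file from the client machine. A minimal sketch, with the in-image path taken from the SPARK_MOUNTED_CLASSPATH value shown earlier:

```shell
# local:// tells Spark the jar already exists inside the container image,
# at the path used by the kubespark v2.2.0-k8s-0.4.0 images.
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.99.100:8443 \
  local:///usr/local/spark-2.2.0-k8s-0.4.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.4.0.jar
```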