kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0
2.78k stars 1.38k forks

Pod is in error state #1219

Open AdeloreSimiloluwa opened 3 years ago

AdeloreSimiloluwa commented 3 years ago

I am trying to run the Spark operator using the example `pi.py` and `spark-py-pi.yaml` files.

```python
import sys
from random import random
from operator import add

from pyspark.sql import SparkSession

if __name__ == "__main__":
    """
        Usage: pi [partitions]
    """
    spark = SparkSession\
        .builder\
        .appName("PythonPi")\
        .getOrCreate()

    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions

    def f(_):
        x = random() * 2 - 1
        y = random() * 2 - 1
        return 1 if x ** 2 + y ** 2 <= 1 else 0

    count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))

    spark.stop()
```
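As a sanity check, the Monte Carlo estimate in `f` can be run locally without Spark. This is a minimal sketch, assuming a seeded `random.Random` in place of Spark's parallelized sampling:

```python
from functools import reduce
from operator import add
from random import Random

def f(rng):
    # Sample a point uniformly in the 2x2 square centered at the origin and
    # test whether it falls inside the unit circle, as pi.py does per element.
    x = rng.random() * 2 - 1
    y = rng.random() * 2 - 1
    return 1 if x ** 2 + y ** 2 <= 1 else 0

rng = Random(0)  # fixed seed so the estimate is reproducible
n = 100_000
count = reduce(add, (f(rng) for _ in range(n)))
pi_estimate = 4.0 * count / n
print("Pi is roughly %f" % pi_estimate)
```

The ratio of hits to samples approximates the area ratio circle/square = pi/4, so multiplying by 4 recovers pi.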

And here is the YAML file as well:

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: pyspark-pi
  namespace: default
spec:
  type: Python
  pythonVersion: "2"
  mode: cluster
  image: "gcr.io/spark-operator/spark-py:v3.0.0"
  imagePullPolicy: Always
  mainApplicationFile: "local:///opt/spark/examples/src/main/python/pi.py"
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 3
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 5
    onSubmissionFailureRetryInterval: 20
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: spark
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: spark
```
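Note that the spec references a `spark` service account for both driver and executor; if that account is missing or lacks permissions to manage executor pods, the driver fails early. A minimal setup sketch, assuming the `default` namespace and the broad `edit` cluster role (adjust to your RBAC policy):

```shell
# Create the service account named in spec.driver.serviceAccount / spec.executor.serviceAccount
kubectl create serviceaccount spark --namespace=default

# Grant it permission to create and manage executor pods
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=default:spark \
  --namespace=default

# Submit the application and check its state
kubectl apply -f spark-py-pi.yaml
kubectl get sparkapplication pyspark-pi -o jsonpath='{.status.applicationState.state}'
```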


luciferksh commented 3 years ago

Before the pod errors out, try to check the logs with `kubectl logs -f pyspark-pi-driver`. I think you haven't supplied the argument.

AdeloreSimiloluwa commented 3 years ago

> before the pods got error out try to check the logs "kubectl logs -f pyspark-pi-driver". i think you haven't supply the argument

Yeah, I tried to get the logs, and here is the output from `kubectl logs -f pyspark-pi-driver`:

```
error: a container name must be specified for pod pyspark-pi-driver, choose one of: [spark-kubernetes-driver istio-proxy] or one of the init containers: [istio-init]
```

@luciferksh

sairamankumar2 commented 3 years ago

Hi @AdeloreSimiloluwa, there seems to be a sidecar container running (istio-proxy). Whenever a pod runs more than one container, you have to specify which one with the `-c` option.
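Concretely, using the container names from the error message above:

```shell
# Tail the Spark driver container specifically
kubectl logs -f pyspark-pi-driver -c spark-kubernetes-driver

# Or inspect the Istio sidecar, if you suspect the mesh is interfering
kubectl logs pyspark-pi-driver -c istio-proxy

# List all containers in the pod to see what is available
kubectl get pod pyspark-pi-driver -o jsonpath='{.spec.containers[*].name}'
```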

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.