radanalyticsio / spark-operator

Operator for managing Spark clusters on Kubernetes and OpenShift.
Apache License 2.0

Can I spark-submit a Spark application without creating a Spark cluster? #269

Closed kevinyu98 closed 4 years ago

kevinyu98 commented 4 years ago

If I have just deployed this operator on OpenShift, without creating a Spark cluster, can I submit a job from the command line with spark-submit? It seems that I can do this using a SparkApplication through the OLM console, but from the command line I get a ContainerCannotRun error.
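For reference, the operator route looks roughly like this (a sketch only; the apiVersion and spec field names follow the examples bundled with the operator and may differ by version):

```
# Hypothetical SparkApplication for the same pi program; field names are
# assumptions based on the operator's bundled examples.
cat <<'EOF' | oc apply -f -
apiVersion: radanalytics.io/v1
kind: SparkApplication
metadata:
  name: piexample1
spec:
  image: kevinyu98/piexample1:0.6
  mainApplicationFile: local:///pi.py
EOF
```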

Description:

I have this simple pi python program, I created a docker image, then did spark-submit to the k8s api server, the job failed at creating the container.

Steps to reproduce:

1. Deployed the operator on OpenShift (an install sketch follows the pod listing below):

```
$ oc get pods
NAME                              READY   STATUS    RESTARTS   AGE
spark-operator-5454b89fb4-wpzv8   1/1     Running   0          14d
```
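When not going through OLM, the operator can also be installed straight from the project manifest (a sketch following the project README; the URL may change between releases):

```
# Install the spark-operator CRDs, RBAC objects, and deployment in one step
# (manifest URL taken from the project README; verify against the current docs)
oc apply -f https://radanalytics.io/resources.yaml
```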

2. Created a Docker image for the pi Python program and ran it locally:

```
$ docker run kevinyu98/piexample1:0.6
19/12/10 19:10:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Connecting to Spark.
Connected to Spark. Connection object is: <pyspark.sql.session.SparkSession object at 0x7fc288c1f110>
Pi is roughly 3.143508
```

3. Ran spark-submit against the same cluster the operator is deployed on:

```
./bin/spark-submit \
  --master k8s://x.xx.xxx.xxx:xxxx \
  --deploy-mode cluster \
  --name piexample1 \
  --conf spark.kubernetes.container.image=kevinyu98/piexample1:0.6 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-crd-operator \
  --conf spark.kubernetes.namespace=operators \
  local:///pi.py
```
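Note that local:/// paths are resolved inside the container image, not on the submitting machine; a quick sanity check that the file is really in the image (command is illustrative):

```
# Bypass the image's entrypoint and list the application file
docker run --rm --entrypoint ls kevinyu98/piexample1:0.6 /pi.py
```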

The logs: [screenshot: spark-submit output, 2019-12-10 11:31 AM]

Describing the failing driver pod shows:

[screenshot: oc describe output for the failing driver pod, 2019-12-10 11:30 AM]

Here is my Dockerfile:

[screenshot: Dockerfile, 2019-12-10 11:28 AM]

jkremser commented 4 years ago

Hello, the error you are seeing is caused (I think) by the fact that you are using your own custom container image with the --master k8s:// scheduling mechanism. That mechanism assumes an entrypoint bash script is present in the image, and your image doesn't have one. I'd suggest checking the default images and spotting the differences.
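One way to get an image with the expected entrypoint is to build on top of the images produced by the docker-image-tool.sh script that ships with the Spark distribution (a sketch; the repository name, tag, and Dockerfile path are illustrative and version-dependent):

```
# From an unpacked Spark distribution: build the JVM base image and the
# PySpark image, both of which carry the /opt/entrypoint.sh script that
# spark-submit --master k8s:// expects.
./bin/docker-image-tool.sh -r docker.io/kevinyu98 -t 0.6 \
  -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build

# Then layer the application on top of the resulting spark-py base, e.g.
# with a two-line Dockerfile:
#   FROM docker.io/kevinyu98/spark-py:0.6
#   COPY pi.py /pi.py
docker build -t kevinyu98/piexample1:0.6 .
```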