kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Deploy mode client? #1406

Closed lboudard closed 4 weeks ago

lboudard commented 2 years ago

I've seen in multiple issues (see here) that the spark-operator is not supposed to support client mode yet, even though the option appears to be available in the spec definitions.

Setting `mode: client` in a SparkApplication job definition results in the following error at the spark-operator level:

/opt/spark/bin/spark-submit --master k8s://https://10.100.176.1:443 --deploy-mode client ...
21/11/23 14:03:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.io.IOException: Cannot run program "python3": error=2, No such file or directory
    at java.base/java.lang.ProcessBuilder.start(Unknown Source)
    at java.base/java.lang.ProcessBuilder.start(Unknown Source)
    at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:97)

Setting `mode: cluster` (the default), however, invokes the default Spark Docker image entrypoint, which itself appears to submit in client mode: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh

And indeed, at driver startup:

[spark-kubernetes-driver] + CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
[spark-kubernetes-driver] + exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.100.160.103 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner local:///home/myapp/app/myapp/__main__.py test-spark

So I'm not really sure whether a Spark job submitted this way actually runs in cluster mode or in client mode.

Version info:

- chart: spark-operator-1.1.10
- spark-operator version: v1beta2-1.2.3-3.1.1
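For reference, here is a minimal sketch of the SparkApplication manifest in question, showing where the `mode` field sits (the image, namespace, and resource values are placeholders, not my actual job; the main file path and app name are taken from the driver logs above):

```yaml
# Minimal SparkApplication sketch; image/namespace/resources are placeholders.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: test-spark
  namespace: default
spec:
  type: Python
  mode: cluster        # switching this to "client" triggers the error above
  image: myrepo/myapp:latest
  mainApplicationFile: local:///home/myapp/app/myapp/__main__.py
  sparkVersion: "3.1.1"
  driver:
    cores: 1
    serviceAccount: spark
  executor:
    cores: 1
    instances: 2
```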

Thanks!

AlexNavara commented 2 years ago

@lboudard Behind the scenes, `spark-submit` is eventually called with the `--deploy-mode client` argument. This behavior is not specific to spark-operator; it comes from Spark itself.

At the user level, you use that option to control where the driver process runs relative to where the `spark-submit` script is called. If you choose `client`, the driver process runs in the same place where `spark-submit` is called. If you pick `cluster`, the driver runs in a container managed by the resource allocator (e.g. a k8s pod or a YARN container). But even in cluster mode, if you look at the driver startup logs in the container, you'll see that it runs `spark-submit --deploy-mode client ...`.

Now let's see what this means in the spark-operator world. Here, `spark-submit` is called inside the spark-operator container. So by setting `deploy-mode=client` you actually force the driver process to run inside the operator container, not in a separate pod.

Concluding, the answer to your question is "your app runs in cluster mode".

Wh1isper commented 1 year ago

As far as I know, the Spark operator lets users perform spark-submit submissions using YAML and takes care of complex configuration issues.

As I answered in https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1652#issuecomment-1652939810: if you want to run a Spark application on k8s, you can use client mode and configure it for k8s (spark.master, etc.) inside the application. PySpark users can use the sparglim package to build client-mode applications quickly.
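To make this concrete, here is a hedged sketch of the kind of configuration a client-mode PySpark application would set on itself (in a real app these values would be passed through `SparkSession.builder.config(...)`; the master URL, driver host, image, and namespace below are placeholder values, not anything from this thread):

```python
# Hypothetical client-mode configuration; all values are placeholders.
# In a real PySpark app, each entry would go through
# SparkSession.builder.config(key, value) before .getOrCreate().
client_mode_conf = {
    # Point the driver at the Kubernetes API server (in-cluster address assumed).
    "spark.master": "k8s://https://kubernetes.default.svc:443",
    # In client mode the driver runs wherever this code runs, so this
    # address must be routable from the executor pods.
    "spark.driver.host": "my-driver-svc.default.svc",
    # Image used for the executor pods (placeholder tag).
    "spark.kubernetes.container.image": "apache/spark-py:3.4.0",
    "spark.kubernetes.namespace": "default",
}

def is_k8s_master(url: str) -> bool:
    """Spark treats master URLs with the k8s:// scheme as Kubernetes masters."""
    return url.startswith("k8s://")

assert is_k8s_master(client_mode_conf["spark.master"])
```

The key design point of client mode is that the driver is your own process, so you (not the operator) are responsible for making `spark.driver.host` reachable from the executors, typically via a headless Service in front of the driver pod.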

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 4 weeks ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.