apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens in https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Add support for fetching application dependencies from HDFS #584

Closed: hex108 closed this 5 years ago

hex108 commented 6 years ago

To add support for fetching application dependencies from HDFS, we need to mount HADOOP_CONF_DIR into the init container, so that the Hadoop client inside it can resolve the HDFS cluster configuration.
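The mechanics would look roughly like the following sketch, using fabric8's kubernetes-client (the library the scheduler back-end is built on). This is not the PR's actual code; the ConfigMap name hadoop-conf, the volume name, and the mount path /etc/hadoop/conf are illustrative assumptions:

    import java.io.File
    import java.nio.file.Files
    import scala.collection.JavaConverters._

    import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, ContainerBuilder, VolumeBuilder}

    // Read every file under HADOOP_CONF_DIR (core-site.xml, hdfs-site.xml, ...)
    // into a ConfigMap so the cluster configuration can be shipped to the pod.
    val hadoopConfDir = sys.env("HADOOP_CONF_DIR")
    val confData = new File(hadoopConfDir).listFiles().filter(_.isFile)
      .map(f => f.getName -> new String(Files.readAllBytes(f.toPath)))
      .toMap

    val hadoopConfigMap = new ConfigMapBuilder()
      .withNewMetadata()
        .withName("hadoop-conf") // illustrative name
        .endMetadata()
      .withData(confData.asJava)
      .build()

    // A pod-level volume backed by that ConfigMap...
    val hadoopConfVolume = new VolumeBuilder()
      .withName("hadoop-conf-volume")
      .withNewConfigMap()
        .withName("hadoop-conf")
        .endConfigMap()
      .build()

    // ...mounted into the init container, with HADOOP_CONF_DIR pointed at the
    // mount path so Hadoop's FileSystem can resolve hdfs:// URIs.
    val initContainer = new ContainerBuilder()
      .withName("spark-init")
      .withImage("jungong/spark-init:hdfs")
      .addNewVolumeMount()
        .withName("hadoop-conf-volume")
        .withMountPath("/etc/hadoop/conf")
        .endVolumeMount()
      .addNewEnv()
        .withName("HADOOP_CONF_DIR")
        .withValue("/etc/hadoop/conf")
        .endEnv()
      .build()

Shipping the configuration as a ConfigMap keeps the submission client stateless: nothing has to be baked into the init-container image for the pod to know how to reach the HDFS cluster.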

Usage example:

$ export HADOOP_CONF_DIR=$(pwd)/hadoopconf

$ bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi \
  --master k8s://http://localhost:8080 --kubernetes-namespace default \
  --conf spark.executor.instances=5 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=jungong/spark-driver:hdfs \
  --conf spark.kubernetes.executor.docker.image=jungong/spark-executor:hdfs \
  --conf spark.kubernetes.initcontainer.docker.image=jungong/spark-init:hdfs \
  --conf spark.kubernetes.initcontainer.inannotation=true \
  --conf spark.kubernetes.docker.image.pullPolicy=Always \
  hdfs://hdfsCluster/spark/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
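Once submitted, the init container uses the mounted configuration to resolve the hdfsCluster nameservice and downloads the application jar before the driver starts. Conceptually it boils down to something like the sketch below; the conf paths assume the /etc/hadoop/conf mount from the earlier sketch, and the local target directory /var/spark-data/spark-jars is an illustrative stand-in for the volume shared with the driver container:

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Load the mounted Hadoop configuration so hdfs://hdfsCluster resolves.
    val conf = new Configuration()
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"))
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"))

    // Fetch the application jar from HDFS into a local directory that the
    // driver container can see (e.g. via a shared emptyDir volume).
    val jarUri = new URI("hdfs://hdfsCluster/spark/spark-examples_2.11-2.2.0-k8s-0.5.0.jar")
    val fs = FileSystem.get(jarUri, conf)
    fs.copyToLocalFile(new Path(jarUri), new Path("/var/spark-data/spark-jars"))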