Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
This is the ongoing work of setting up secure HDFS interaction with Spark-on-K8S.
The architecture is discussed in this community-wide Google doc.
This initiative can be broken down into 4 stages.
STAGE 1
[x] Detecting the HADOOP_CONF_DIR environment variable and using ConfigMaps to store all Hadoop config files, while also setting HADOOP_CONF_DIR in the driver / executors
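For illustration only, Stage 1 amounts to packaging the files under HADOOP_CONF_DIR into a ConfigMap and pointing HADOOP_CONF_DIR at its mount path in each pod. The helper name and manifest layout below are assumptions for this sketch, not the PR's actual code:

```python
import os

def hadoop_conf_configmap(conf_dir, name="hadoop-conf"):
    """Build a ConfigMap manifest holding every Hadoop config file in conf_dir.

    Hypothetical helper: the real implementation lives inside the
    Spark-on-K8S scheduler back-end, not in user code like this.
    """
    data = {}
    for fname in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, fname)
        if os.path.isfile(path):
            with open(path) as f:
                data[fname] = f.read()
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }

# Driver/executor pods would then mount this ConfigMap at some path and set
# HADOOP_CONF_DIR to that mount point, e.g. via a pod env entry like:
env = [{"name": "HADOOP_CONF_DIR", "value": "/etc/hadoop/conf"}]
```

The mount path `/etc/hadoop/conf` above is an arbitrary choice for the sketch; any path the pod spec mounts the ConfigMap at would work.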
STAGE 2
[x] Grabbing the TGT (ticket-granting ticket) from the LTC, or using a keytab + principal, and creating a delegation token (DT) that will be mounted as a secret
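Stage 2 ends with the delegation token stored in a Kubernetes Secret, whose `data` values must be base64-encoded. A minimal sketch of that last step follows; the helper name, secret name, and data key are assumptions, and obtaining the serialized token itself (from the TGT, or via keytab + principal through Hadoop's UserGroupInformation APIs) is out of scope here:

```python
import base64

def delegation_token_secret(token_bytes, name="spark-hadoop-dt"):
    """Wrap serialized Hadoop delegation-token credentials in a Secret manifest.

    Hypothetical helper: in the real flow the DT is first obtained on the
    Hadoop side and serialized; this only shows the Secret packaging.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        # Kubernetes Secret data values must be base64-encoded strings.
        "data": {"hadoop.token": base64.b64encode(token_bytes).decode("ascii")},
    }

# Example: package some serialized credentials for mounting into pods.
secret = delegation_token_secret(b"serialized-credentials")
```

The driver and executor pods would mount this Secret as a file and point Hadoop at it, so that HDFS calls authenticate with the delegation token instead of a Kerberos ticket.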
What changes were proposed in this pull request?
This is the ongoing work of setting up secure HDFS interaction with Spark-on-K8S. The architecture is discussed in this community-wide Google doc. This initiative can be broken down into 4 stages.

STAGE 1
Detecting the HADOOP_CONF_DIR environment variable and using ConfigMaps to store all Hadoop config files, while also setting HADOOP_CONF_DIR in the driver / executors

STAGE 2
Grabbing the TGT from the LTC, or using a keytab + principal, and creating a DT that will be mounted as a secret

STAGE 3

STAGE 4
How was this patch tested?
Docs and Error Handling?