stackabletech / spark-k8s-operator

Operator for Apache Spark-on-Kubernetes for Stackable Data Platform
https://stackable.tech

Document running Spark jobs on a Kerberised cluster #356

Closed Jimvin closed 2 months ago

Jimvin commented 4 months ago

As a Spark developer I would like to be able to run Spark jobs on a cluster where one or more services have Kerberos enabled. Additional setup is required to get a Spark job to run with Kerberos, including providing a keytab, the Kerberos configuration, the service configuration and the Spark context config.

We should document how to run an example Spark job that connects to Kerberos-enabled HDFS and Hive services.
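
To make the Spark-side part of that setup concrete, here is a minimal sketch that assembles the relevant Spark properties and renders them as `spark-submit` arguments. The property names (`spark.kerberos.principal`, `spark.kerberos.keytab`, and the `extraJavaOptions` settings carrying `-Djava.security.krb5.conf`) are standard Spark settings. The principal, the file paths and the helper function names are hypothetical placeholders, not something this operator is known to provide. The service configuration (for example `core-site.xml`, `hdfs-site.xml` and `hive-site.xml`) is not covered here and would still have to be made available to the job.

```python
# Sketch only: builds Spark properties for a keytab-based Kerberos login.
# The principal and paths passed in below are hypothetical placeholders.


def kerberos_spark_conf(principal, keytab_path, krb5_conf_path):
    """Return Spark properties for logging in to Kerberos from a keytab."""
    # The JVM needs to know where krb5.conf lives, on driver and executors.
    krb5_opt = f"-Djava.security.krb5.conf={krb5_conf_path}"
    return {
        "spark.kerberos.principal": principal,
        "spark.kerberos.keytab": keytab_path,
        "spark.driver.extraJavaOptions": krb5_opt,
        "spark.executor.extraJavaOptions": krb5_opt,
    }


def to_submit_args(conf):
    """Render a property dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


conf = kerberos_spark_conf(
    principal="spark-user@EXAMPLE.COM",          # hypothetical principal
    keytab_path="/stackable/kerberos/keytab",    # hypothetical mount path
    krb5_conf_path="/stackable/kerberos/krb5.conf",  # hypothetical mount path
)
submit_args = to_submit_args(conf)
print(" ".join(submit_args))
```

The same key/value pairs could presumably be placed in the job's Spark configuration instead of being passed on the command line; how the keytab and `krb5.conf` get mounted into the pods is a separate question that this sketch does not answer.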

adwk67 commented 4 months ago

A Spark job that uses the secret operator to generate keytabs for named users can be found here: https://github.com/stackabletech/hdfs-topology-provider/pull/5/files#diff-49100e41e40194b8d1f7aba8218bec678d9a9817efd5346b4609a92269721264 (this should indicate what needs to be implemented in the spark-k8s-operator).

sbernauer commented 2 months ago

Duplicate of https://github.com/stackabletech/issues/issues/530, so closing this one as well. This is included in the end-to-end-security demo.