apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Hadoop delegation tokens should be named in a way that enables RBAC whitelisting of secrets #626

Open harbesc opened 6 years ago

harbesc commented 6 years ago

Hello!

We are using apache-spark-on-k8s with secure HDFS. We are also using RBAC so that we can lock down access to specific secrets in a namespace.

This is possible by creating a Role that whitelists the names of the secrets it needs (and only those secrets). However, because delegation token secrets are named with a timestamp, whitelisting only those secrets is impossible: we cannot know the name of a secret until it is created, and resourceNames does not support wildcards. For example:

spark-job-1523398833926-spark.kubernetes.kerberos.delegation-token-secret-name.1523398839818

As a result, we are forced to use a Role that can access every secret in the namespace, which is insecure and potentially very bad news.
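To make the problem concrete, here is a minimal Python sketch of the two kinds of Role involved. It only builds the manifests as dictionaries and uses a simplified model of how resourceNames is matched (exact string comparison); it is not the API server's actual authorizer. The role names, the `spark` namespace, and the helper functions `make_secret_role` and `rule_allows` are hypothetical and chosen for illustration. Only the secret name comes from our cluster.

```python
import json


def make_secret_role(name, namespace, secret_names=None):
    """Build a Role manifest (as a dict) that grants 'get' on Secrets.

    If secret_names is given, the rule is restricted to exactly those names.
    If it is omitted, the rule covers every Secret in the namespace.
    """
    rule = {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get"]}
    if secret_names:
        rule["resourceNames"] = list(secret_names)
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [rule],
    }


def rule_allows(role, secret_name):
    """Simplified model of name matching: resourceNames entries are compared
    literally, so a pattern such as 'spark-job-*' matches nothing useful."""
    for rule in role["rules"]:
        names = rule.get("resourceNames")
        if not names or secret_name in names:
            return True
    return False


# Secret name taken from the issue text; the timestamps are unknown in advance.
token_secret = (
    "spark-job-1523398833926-spark.kubernetes.kerberos."
    "delegation-token-secret-name.1523398839818"
)

# Hypothetical narrow Role: a wildcard entry is treated as a literal name.
narrow = make_secret_role("spark-token-reader", "spark", ["spark-job-*"])

# Hypothetical broad Role: no resourceNames, so all Secrets are readable.
broad = make_secret_role("spark-all-secrets", "spark")

if __name__ == "__main__":
    print(json.dumps(narrow, indent=2))
    print("narrow allows token secret:", rule_allows(narrow, token_secret))
    print("broad allows token secret:", rule_allows(broad, token_secret))
```

Under this model the narrow Role never grants access to the delegation token secret, while the broad Role grants access to it and to every other secret in the namespace, which is the situation we are trying to avoid.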

Is there any way around this that you know of? We would be open to submitting a pull request to fix this issue if you would be interested. Thanks!