Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
Hello!
We are using apache-spark-on-k8s with secure HDFS. We are also using RBAC so that we can lock down access to specific secrets in a namespace.
This is possible by creating a Role that specifically whitelists the names of the secrets it needs (and only those secrets). However, because delegation token secrets are named with a timestamp, whitelisting only those secrets is impossible: we cannot know the name of a secret until it is created, and there is no wildcard support for resourceNames. For example, a delegation token secret is named like this:
spark-job-1523398833926-spark.kubernetes.kerberos.delegation-token-secret-name.1523398839818
As a result, we are forced to use a Role that can access any secret in the namespace, which is insecure and potentially very bad news.
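To make the problem concrete, here is a rough sketch in Python of the two Roles involved. The Role name `spark-role`, the namespace `spark-jobs`, and the secret `hdfs-keytab` are made up for illustration, and `is_allowed` is only a simplified model of how I understand `resourceNames` matching to work (exact string comparison), not the actual Kubernetes authorizer.

```python
import json

# Secret name taken from the example above; the timestamps differ on every run.
TOKEN_SECRET = (
    "spark-job-1523398833926-"
    "spark.kubernetes.kerberos.delegation-token-secret-name.1523398839818"
)


def secrets_role(name, namespace, secret_names=None):
    """Build an RBAC Role manifest that grants `get` on secrets.

    With secret_names, access is limited to exactly those names.
    Without it, the Role can read every secret in the namespace.
    """
    rule = {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get"]}
    if secret_names:
        rule["resourceNames"] = list(secret_names)
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [rule],
    }


def is_allowed(role, secret_name):
    """Simplified model: resourceNames entries are compared literally."""
    for rule in role["rules"]:
        if "secrets" not in rule["resources"]:
            continue
        names = rule.get("resourceNames")
        if names is None or secret_name in names:
            return True
    return False


# Hypothetical names, for illustration only.
locked = secrets_role("spark-role", "spark-jobs", ["hdfs-keytab"])
globbed = secrets_role("spark-role", "spark-jobs", ["spark-job-*"])
broad = secrets_role("spark-role", "spark-jobs")

print(json.dumps(locked, indent=2))
print("locked  ->", is_allowed(locked, TOKEN_SECRET))
print("globbed ->", is_allowed(globbed, TOKEN_SECRET))
print("broad   ->", is_allowed(broad, TOKEN_SECRET))
```

Only the `broad` Role, the one without `resourceNames`, can read the delegation token secret, and that is the Role we are trying to avoid.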
Is there any way around this that you know of? We would be open to submitting a pull request to fix this issue if you would be interested. Thanks!