dask / dask-gateway

A multi-tenant server for securely deploying and managing Dask clusters.
https://gateway.dask.org/
BSD 3-Clause "New" or "Revised" License
136 stars 88 forks source link

Support mirroring of secrets into the gateway server namespace #250

Open jcrist opened 4 years ago

jcrist commented 4 years ago

Some users have brought up qualms about our current RBAC requirements (specifically around secrets). Currently we require:

The worry is around giving the ability to read secrets in all namespaces to pods. This affects the API Server (Note that we don't give list or watch permissions, so the name of the secret would have to be known to read it) and the Traefik proxy pods.

We don't rely on secrets for Traefik, but the implementation in Traefik gets grumpy if you disable this RBAC permission. An upstream fix would be needed, and I'm not likely to do that (if others do, please let me know and we can update our chart).

We can change the requirements for the API server though. While a dask cluster is active, the controller would ensure there was a mirrored copy of the secret in the same namespace as the API server. The API server would then only need get permissions for secrets in its own namespace. This adds some logic complications in the controller. The mirrored secrets also wouldn't have an parent objects, so their deletion would have to be handled explicitly in the controller.

I'm hesitant to make this change (as it complicates the codebase and uses more resources), but could if needed. If possible I'd like to make this optional, defaulting to the existing logic (no mirroring).

jcrist commented 4 years ago

cc @gforsyth, @dankerrigan, @mmccarty

droctothorpe commented 4 years ago

One alternative to consider is an aggregate role (which grants access to resources in multiple namespaces but not all namespaces), but it would only be applicable in a double namespace (as opposed to a dynamic namespace) deployment pattern.

I relate to your reservations, @jcrist, and I'm not sure it's worth the added complexity if other widely adopted tools, like Traefik, consider the ability to get but not list secrets in all namespaces an acceptable paradigm.

Here are examples of two other widely adopted operators that use the same model:

It seems fairly prevalent; as long as the server can't list secrets in all namespaces, we're leveraging security through obscurity.

droctothorpe commented 4 years ago

Hierarchical namespaces might be able to solve this problem.

shaunc commented 4 years ago

If we did need to work with data with strict security requirements, this could become a blocker for us, though I'd have to think through it in more detail. Of course, if we are using Dask to analyze the data anyway, then putative evil code snuck into the traefik repository would see it anyway. Also, I guess we could give all of our secrets hash codes. But it does increase the complexity of creating a secure environment.

What is wrong with (optionally) just requiring application owners to duplicate secrets in all namespaces where they are needed? That would seem like the easier path for the security conscious.

OTOH -- deploying with the helm chart, we can specify external service accounts. Perhaps all that is needed, or at least a good first step, is appropriate documentation with guidelines on what permissions are needed and why. (@droctothorpe -- great link -- nice to see hierarchical namespaces -- when I was first learning kubernetes I did tell them they needed them. :))

holzman commented 1 year ago

Alternatively, would it make sense to limit deployment of everything (api server, controller, traefik, launching of dask clusters) into a single namespace? The default behavior of the helm chart does this when gateway.backend.namespace is unspecified.

It requires changing some calls to their namespaced versions, but that's pretty straightforward.