pangeo-data / helm-chart

Pangeo helm charts
https://pangeo-data.github.io/helm-chart/

Permissions of Jupyterhub users on Kubernetes #79

Closed zonca closed 4 years ago

zonca commented 6 years ago

I am testing the GCE deployment from https://github.com/pangeo-data/pangeo/blob/master/gce on Jetstream. BTW the live demo on https://www.youtube.com/watch?v=rSOJKbfNBNk looks awesome!

I am inspecting Jupyterhub configuration, in particular: https://github.com/pangeo-data/pangeo/blob/master/gce/jupyter-config.yaml#L62 Does this mean that each Jupyterhub user has control over Kubernetes from their pod? Could the user kill other users' pods?

mrocklin commented 6 years ago

Yes. The current system is not at all secure. We need someone to develop proper roles for users to control their permissions.

zonca commented 6 years ago

Thanks @mrocklin. I wonder if we could set up one large dask cluster and have the users all connect to that, and scale it a bit based on load?

mrocklin commented 6 years ago

That is also possible. There are a few ways of doing things. Some reasons why we're choosing the current path:

  1. Dask expects workers and clients on the same cluster to have mostly the same software environment, but users in this group tend to all have different environments
  2. Dask doesn't provide any per-user controls, but Kubernetes does, so it's probably easier for us to manage things on the Kubernetes level if we ever care (which we probably will)

mrocklin commented 6 years ago

It's worth noting that the original JADE deployment did things the way that you're suggesting.

zonca commented 6 years ago

What about a JupyterHub-managed service that users interact with through an API and that deploys clusters for them? It would know about auth and run in the hub pod, which is already privileged.

mrocklin commented 6 years ago

Something like that could exist. It doesn't yet though and I suspect that it would take some effort. It might have some advantages though, I'm not sure.

I suspect that short-term the easy/correct thing to do is to handle this within Kubernetes using proper roles. I believe that this is relatively straightforward for someone with moderate Kubernetes experience. @jacobtomlinson and I were just discussing similar things actually.

zonca commented 6 years ago

What I am concerned about is that only JupyterHub has knowledge about individual users; Kubernetes (if I'm not mistaken) does not in zero-to-jupyterhub. But if you find a way to implement it, that would be great! Please keep us posted. I will revisit this issue in a few weeks.

jacobtomlinson commented 6 years ago

I think that many of your concerns could be solved with a proper RBAC implementation. Right now we have to give pod create/delete permissions to your notebook, which is dangerous. But I'm sure something could be done to say you can only create pods with certain images and only delete pods which you have created yourself. We could do clever stuff with namespaces, where each user gets a namespace which gets cleaned up when they log out.

I have discussed with @mrocklin in the past some kind of "Dask Hub" which, similar to JupyterHub, would be able to spawn schedulers and workers. I don't see any reason why something like this couldn't be implemented, and it could use daskernetes to achieve its functionality.

We have also thought a lot about shared clusters vs individual clusters. There are pros and cons on both sides but we have gone down the path of individual clusters. As Matt says it gives you more control over your environment and you aren't waiting for other tasks in the queue from other users.

mrocklin commented 6 years ago

@jacobtomlinson do you have the interest and free time to implement some of the RBAC stuff in this deployment?

jacobtomlinson commented 6 years ago

I can definitely take a look.

It seems like you are already using some level of control as you specify cluster-role-bindings in the setup.

But then you also have RBAC disabled in the Jupyter Hub config.

mrocklin commented 6 years ago

I wouldn't ascribe any great intent to those actions. This was all done in a quick and dirty way by following Zero-to-Jupyterhub and then allowing universal access.

mrocklin commented 6 years ago

I'm also happy to push on this if people can point me in the right direction. This seems important to me.

zonca commented 6 years ago

@jacobtomlinson @mrocklin any news on this issue? I am not familiar enough with k8s to help with RBAC. I'll revisit this issue in 2-3 weeks; if there is no solution by then, I'll try implementing a JupyterHub-managed extension.

jacobtomlinson commented 6 years ago

Things are progressing in #172, #181 and others. Hopefully when you check back things will be in a better state.

zonca commented 6 years ago

Thanks @jacobtomlinson, nice also to have a helm chart! I am checking https://github.com/pangeo-data/pangeo/pull/172/files. If I understand correctly, this gives minimal permissions to access the k8s API, but a user could still kill the pods of another user, right?

jacobtomlinson commented 6 years ago

@zonca that is correct. This could be avoided by putting each user in a separate namespace. I would be keen to add that as a config option.
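
To illustrate why separate namespaces would help, here is a toy model (not the Kubernetes authorizer itself) of how a namespaced Role is evaluated: the decision depends on the requesting account, the verb, the resource type and the namespace, not on who created the pod. The account names, the namespace names and the `is_allowed` helper are all made up for illustration, and the model ignores `resourceNames` and cluster-wide roles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One (subject, namespace, resource, verb) permission, roughly what a
    Role plus a RoleBinding would give. Simplified for illustration."""
    subject: str
    namespace: str
    resource: str
    verb: str


def is_allowed(grants, subject, namespace, resource, verb):
    """Toy check: a request is allowed only if an exactly matching grant exists."""
    return Grant(subject, namespace, resource, verb) in grants


# Shared namespace ("pangeo" is a hypothetical name): both users may delete
# pods there, so alice can delete a pod that bob created.
shared = {
    Grant("alice", "pangeo", "pods", "delete"),
    Grant("bob", "pangeo", "pods", "delete"),
}

# Per-user namespaces: each user only holds grants inside their own namespace.
isolated = {
    Grant("alice", "user-alice", "pods", "delete"),
    Grant("bob", "user-bob", "pods", "delete"),
}

# alice deleting pods in the shared namespace, where bob's pods also live
print(is_allowed(shared, "alice", "pangeo", "pods", "delete"))      # True
# alice deleting pods in bob's own namespace
print(is_allowed(isolated, "alice", "user-bob", "pods", "delete"))  # False
```

In the shared case nothing in the grant distinguishes alice's pods from bob's, which is the problem raised above; with one namespace per user the namespace itself becomes the ownership boundary.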

zonca commented 6 years ago

@jacobtomlinson I think separate namespaces would be an elegant solution for this! If you add it as a config option I can test and provide feedback.

jgerardsimcock commented 6 years ago

In general, what are the implementation differences between every user sharing a single dask cluster vs each user having a dask cluster via a jupyterhub deployment?

zonca commented 6 years ago

@jacobtomlinson if you give me a little guidance on where I should make modifications, I can probably implement namespace support myself.

@jgerardsimcock not sure, I only considered the case of each user having a dask cluster

jacobtomlinson commented 6 years ago

@jgerardsimcock we have explored both approaches. We found that there are pros/cons on both sides. Shared clusters allow you to persist data in memory and share between users, but it also means that people can hog the task queue. We are currently working with each user having their own clusters (and potentially more than one). We are using dask-kubernetes for this.

@zonca I expect the work would need to be done in kubespawner in order to provision a namespace and RBAC roles and then run the notebook container within it. The dask-kubernetes config would also need to be updated for each user to ensure the namespace and roles are correct. This is probably a substantial amount of work which I don't have time to do at the moment.
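
As a rough sketch of what "provision a namespace and RBAC roles" could amount to, the snippet below builds the three Kubernetes objects as plain Python dictionaries. It is not kubespawner code: the `jupyter-` prefix, the `dask-worker-manager` role name, the `notebook` service account name and both helper functions are assumptions made for illustration. The ServiceAccount itself would also need to be created in the namespace, and the slug function does not guard against two usernames mapping to the same namespace.

```python
import re


def namespace_for(username, prefix="jupyter-"):
    """Turn a hub username into a valid namespace name (a DNS-1123 label:
    lowercase alphanumerics and '-', at most 63 characters)."""
    slug = re.sub(r"[^a-z0-9-]+", "-", username.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a namespace from {username!r}")
    return (prefix + slug)[:63].rstrip("-")


def user_manifests(username, service_account="notebook"):
    """Return Namespace, Role and RoleBinding manifests for one user."""
    ns = namespace_for(username)
    role_name = "dask-worker-manager"  # hypothetical name
    namespace = {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": ns}}
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": ns},
        # Pod management only, and only inside this user's namespace.
        "rules": [
            {
                "apiGroups": [""],
                "resources": ["pods"],
                "verbs": ["get", "list", "watch", "create", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": ns},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": ns}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return [namespace, role, binding]


for manifest in user_manifests("Andrea.Zonca"):
    print(manifest["kind"], manifest["metadata"]["name"])
```

Because the Role and RoleBinding are namespaced, the notebook's service account gets no pod permissions outside the user's own namespace. The dask-kubernetes configuration would then have to point at that same namespace for each user.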

jhamman commented 4 years ago

@jacobtomlinson - once the dust has settled on the dask-gateway integration, do you think it would be reasonable to turn off (by default) the dask-kubernetes rbac? Would this be enough to sufficiently limit the permissions of individual users on the cluster?

jacobtomlinson commented 4 years ago

Yes I expect that dask-gateway will completely replace dask-kubernetes here.

jhamman commented 4 years ago

Yes I expect that dask-gateway will completely replace dask-kubernetes here.

@jacobtomlinson - can you advise on how we should deprecate the dask-kubernetes integration here? Does Helm have tooling for marking an option as deprecated?

jacobtomlinson commented 4 years ago

Not as far as I know.

I'd maybe be tempted to remove the dask-kubernetes Python library and add a shim that raises a warning when imported, telling users how to switch to Dask Gateway.
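
A minimal sketch of such a shim, assuming it would be installed as the package's `__init__.py` in place of the real library. The message wording and the `warn_on_import` helper are hypothetical, not something taken from the chart.

```python
import warnings

# Hypothetical wording; the real message and any migration link would be up
# to the maintainers.
MIGRATION_MESSAGE = (
    "dask-kubernetes is no longer included in this deployment. "
    "Please create clusters through Dask Gateway instead."
)


def warn_on_import():
    """Emit the migration warning; called once when the shim is imported."""
    # FutureWarning is shown to end users by default. DeprecationWarning is
    # only shown by default when triggered directly by code in __main__.
    warnings.warn(MIGRATION_MESSAGE, FutureWarning, stacklevel=2)


# Module-level call: this runs when the shim module is imported.
warn_on_import()
```

Raising an `ImportError` instead would be the stricter option, since a warning still lets the import succeed and the failure only shows up later when the user calls something that no longer exists.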

jhamman commented 4 years ago

closed via #133