jupyterhub / kubespawner

Kubernetes spawner for JupyterHub
https://jupyterhub-kubespawner.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Support spawning to different clusters #516

Open yuvipanda opened 3 years ago

yuvipanda commented 3 years ago

Proposed change

Right now, the Kubernetes pod is spawned in the same cluster as the hub pod. It would be great if we could configure it to be spawned in other, remote clusters. One hub could then spawn into different cloud regions, which is very helpful when dealing with cloud datasets.

The Kubernetes API can easily be accessed remotely, but the hub and proxy pods need a way to send traffic to the user pod. We can tunnel this traffic without much work. My favorite way is to use kubectl port-forward, which I used in my earlier experiments with accessing dask-kubernetes remotely and which dask-kubernetes itself now uses.
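To make the port-forward idea concrete, here is a minimal sketch that only builds the `kubectl port-forward` command line for a user pod in a remote cluster. The helper name, the context, namespace and pod names, and the local port are all hypothetical. Port 8888 is assumed to be the port the single-user server listens on. This is not how any existing spawner is known to do it.

```python
import shlex


def port_forward_command(context, namespace, pod_name, local_port, remote_port=8888):
    """Build the argv for `kubectl port-forward` against a remote cluster.

    `context` is a kubeconfig context name (hypothetical here); the hub would
    then talk to 127.0.0.1:<local_port> instead of the pod IP.
    """
    return [
        "kubectl",
        "--context", context,
        "--namespace", namespace,
        "port-forward",
        f"pod/{pod_name}",
        f"{local_port}:{remote_port}",
    ]


# Hypothetical values, for illustration only.
cmd = port_forward_command("remote-cluster", "jhub", "jupyter-alice", 40123)
print(shlex.join(cmd))
```

A spawner could launch this command with `subprocess.Popen` once the pod is running and report `127.0.0.1` plus the local port to the hub, but that wiring is left out here.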

Alternative options

  1. Deploy one hub per cluster users want to spawn into. This is more complicated logistically, and for the user.
  2. Make a Service object for each pod, and expose it to the internet via a LoadBalancer. This can receive traffic from the hub and proxy pod.
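For option 2, a per-pod Service might look roughly like the manifest built below. This is a sketch under assumptions: the helper name, the `-lb` name suffix and the `hub.example/pod-name` label are hypothetical, and the user pod would need to carry that label for the selector to match. Port 8888 is assumed to be the single-user server port.

```python
import json


def loadbalancer_service_for_pod(pod_name, namespace, port=8888):
    """Return a Service manifest (as a dict) exposing one user pod."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"{pod_name}-lb", "namespace": namespace},
        "spec": {
            "type": "LoadBalancer",
            # Hypothetical label; it must also be set on the user pod.
            "selector": {"hub.example/pod-name": pod_name},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


service = loadbalancer_service_for_pod("jupyter-alice", "jhub")
print(json.dumps(service, indent=2))
```

Each Service of this type usually gets its own external address from the cloud provider, so the cost and the exposure to the internet grow with the number of running user pods.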

Who would use this feature?

Anyone interested in accessing compute near datasets stored across multiple cloud providers or regions.

yuvipanda commented 3 years ago

Had a very helpful conversation with @consideRatio about this! Since it might add additional complexity here, I think it'd be useful to start this off outside this repo, as a subclass of KubeSpawner. And then upstream what is needed, and hopefully merge them together eventually. This might necessitate refactoring here - particularly around the singleton reflectors. But all changes made here should be useful standalone.

We kinda do a version of this when we test with minikube, doing networking hacks to let the pods talk to the hub.

nreith commented 2 years ago

@yuvipanda Curious how you have progressed on this one? We have a similar need to provide a single integrated experience for our jhub users, but across multiple clusters. Jupyter Enterprise Gateway is interesting, but fundamentally a totally different architecture. They spawn pods per kernel (conda env), and don't allow custom kernels not in the whitelist, because each kernel is a single kernel image.

yuvipanda commented 2 years ago

@nreith I actually ended up building a separate spawner for this, and it works fairly well - https://github.com/yuvipanda/jupyterhub-multicluster-kubespawner.

nreith commented 2 years ago

@yuvipanda I found that. We're testing it out, and will make some merge requests and contributions in the future if we are able :-)

yuvipanda commented 2 years ago

@nreith that would be super awesome!

TiPPeX2 commented 2 years ago

@yuvipanda, thanks for your great work, I appreciate it very much!

Currently, KubeSpawner is only able to spawn in its own namespace (due to reflectors). Is the multicluster spawner related to multi-namespace spawning in any way, or only to clusters?

I remember there is a configuration that gives the hub full cluster permissions, allowing it to create a namespace per user. But that is not what I need here.

I would like to have a single hub that can spawn into multiple Kubernetes namespaces (other than the hub's own). I have a FB of KubeSpawner which changes how reflectors work, and I added permissions for each namespace I want to the JupyterHub serviceAccount.

I was curious whether there is a way to implement the above scenario in your separate repo, or whether my implementation would be useful to others, so that I could maybe open a PR and an issue about it.

We did it for multiple reasons:

  1. Single place for all users (instead of having a JupyterHub per namespace).
  2. Minimal permissions for the JupyterHub: it only has permissions on selected namespaces.
  3. Reflectors only watch the namespaces we spawn into for events, instead of the entire cluster, which is quite big.
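The per-namespace permissions in point 2 could be expressed as one Role and RoleBinding in each target namespace, bound to the hub's ServiceAccount. The sketch below only builds those manifests as dicts. The helper name, the role name `hub-spawner`, the resource and verb lists, and the namespace and account names are assumptions for illustration, not the setup described in this comment.

```python
def namespace_rbac(namespace, hub_service_account, hub_namespace):
    """Return (Role, RoleBinding) manifests granting the hub access to one namespace."""
    role_name = "hub-spawner"  # hypothetical name
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {   # Resources the spawner creates and deletes (assumed list).
                "apiGroups": [""],
                "resources": ["pods", "persistentvolumeclaims", "secrets", "services"],
                "verbs": ["get", "watch", "list", "create", "delete"],
            },
            {   # Events are only read, e.g. by a reflector.
                "apiGroups": [""],
                "resources": ["events"],
                "verbs": ["get", "watch", "list"],
            },
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": hub_service_account, "namespace": hub_namespace}
        ],
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": role_name},
    }
    return role, binding


# One Role/RoleBinding pair per namespace the hub may spawn into (hypothetical names).
manifests = [namespace_rbac(ns, "hub", "jhub") for ns in ("team-a", "team-b")]
```

Because the Roles are namespaced, the hub gets no cluster-wide permissions, which matches the goal of keeping its access limited to the selected namespaces.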

Thanks for your time!

enolfc commented 1 year ago

Hi!

Is there any activity in this area? We'd really like to have this in place for our JupyterHub and would be happy to join the effort if there is something ready.

Thanks

nreith commented 1 year ago

We wrote a multi cluster kubespawner at my work but ultimately ended up going with a different hub per cluster. Will see if we can share if we get a chance. It's inspired by yuvipanda's other multicluster kubespawner.


dhirschfeld commented 1 year ago

> @nreith I actually ended up building a separate spawner for this, and it works fairly well - https://github.com/yuvipanda/jupyterhub-multicluster-kubespawner.

I came here looking for exactly this functionality so it's great to see it already exists! :heart:

I think this could be very handy for spawning servers in our different environments.

shohamyamin commented 1 week ago

That's exactly what I am looking for. It would benefit us in several ways, such as having one place for all users with different needs. If someone has made any progress on this and we can make it work together, that would be awesome.