2i2c-org / features

Temporary location for feature requests sent to 2i2c
BSD 3-Clause "New" or "Revised" License

Launch user sessions in multiple clusters from a single hub #7

Open choldgraf opened 2 years ago

choldgraf commented 2 years ago

Description of problem and opportunity to address it

Problem description

When communities have datasets or resources spread across multiple cloud locations (across data centers, cloud providers, etc.), they currently must deploy one JupyterHub per location to provide access to the resources in each one. This creates a few problems.

Proposed solution

We should make it possible for a single hub to launch interactive sessions in multiple cloud locations, not only in the location where the hub is running.

This would allow communities to use a single hub as a "launch pad" for other kinds of infrastructure that is out there. It would reduce the complexity of running multiple hubs at once, and it is potentially a way for communities to divide their interactive sessions across billing accounts.

Implementation guide and constraints

Tech implementation

One likely candidate to make this possible is to define a new JupyterHub Spawner that knows how to talk to other Kubernetes clusters, along with some kind of process that can live on those clusters and "listen" for requests to launch interactive sessions. Then the spawner would request a session on a remote cluster, and direct the person there.
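To make that more concrete, here is a minimal sketch of what such a spawner could look like. This is not the actual implementation: the `MultiClusterSpawner` class, its `clusters` config option, and the `launch_on_cluster` helper are hypothetical placeholders for the remote-launch protocol described above; only the base JupyterHub `Spawner` interface (`start`/`poll`/`stop`) is real API.

```python
# A hedged sketch, not a real implementation: MultiClusterSpawner, its
# `clusters` option, and `launch_on_cluster` are hypothetical. Only the
# base Spawner interface (start/poll/stop) is real JupyterHub API.
from traitlets import Dict
from jupyterhub.spawner import Spawner


class MultiClusterSpawner(Spawner):
    # Hypothetical config: map of cluster name -> remote "listener" endpoint
    clusters = Dict(
        help="Mapping of cluster names to remote launch endpoints",
    ).tag(config=True)

    async def start(self):
        # Use the cluster the user picked (e.g. via an options form),
        # falling back to the first configured cluster.
        name = self.user_options.get("cluster", next(iter(self.clusters)))
        endpoint = self.clusters[name]
        # Ask the agent on the remote cluster to create the user's pod
        # and report back an address the hub's proxy can route to.
        ip, port = await self.launch_on_cluster(endpoint)
        return ip, port

    async def launch_on_cluster(self, endpoint):
        # The remote-launch protocol sketched above would live here.
        raise NotImplementedError

    async def poll(self):
        # Would query the remote cluster; return None while the pod runs,
        # or an exit status once it has stopped.
        return None

    async def stop(self):
        # Would ask the remote agent to delete the user's pod.
        pass
```

The interesting design question is what `launch_on_cluster` talks to: the "listener" process on each remote cluster would need to create the pod and return an address the hub's proxy can actually route to across networks.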

Considerations

Driving test cases

@rabernat needs a few hubs that are similar flavors of a Pangeo hub, each attached to a different pot of money. Rather than providing one hub per test case, we could use this as an opportunity to prototype the multi-cluster launcher described here.

Updates and ongoing work

damianavila commented 2 years ago

The filesystem issue is key and probably not easy to solve. I wonder if there is some existing abstraction that could interact with the underlying NFS layers of the different cloud providers... In that scenario, we would have a multispawner to select the node where you want to spawn and a multistorage to select where to persist the stuff you are working on. Alternatively, we could push on @rabernat's previously discussed idea of riding without a "filesystem" and change people's filesystem-based mindset along the way (which would be the most difficult part, IMHO).
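For the "multispawner" selection part, the user-facing choice could look like KubeSpawner's existing `profile_list` option (a real feature), which a multi-cluster spawner could extend to pick a target cluster rather than just pod settings. The profiles below are illustrative only; stock KubeSpawner can only target the cluster it runs in.

```python
# jupyterhub_config.py -- illustrative only. profile_list is a real
# KubeSpawner option, but stock KubeSpawner cannot cross clusters; a
# multi-cluster spawner would attach a target cluster to each profile.
c.KubeSpawner.profile_list = [
    {
        "display_name": "GCP us-central1 (persistent home directory)",
        "default": True,
        "kubespawner_override": {"image": "pangeo/pangeo-notebook:latest"},
    },
    {
        "display_name": "AWS us-west-2 (scratch storage only)",
        "kubespawner_override": {"image": "pangeo/pangeo-notebook:latest"},
    },
]
```

A "multistorage" choice could be surfaced the same way, with each profile wired to a different persistence backend.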

yuvipanda commented 2 years ago

Unfortunately, cross-DC NFS is not really viable for reliability, performance, and security reasons :(

I think step 1 would likely just involve a per-cluster home directory. We could augment it with a shared directory that is synced across all the clouds, via either FUSE or something like https://rclone.org/.
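As a rough sketch of what that sync could look like with rclone (`rclone sync` is a real command; the remote names and schedule here are hypothetical and assume both remotes were already set up with `rclone config`):

```python
# A rough sketch: periodically mirror a shared directory from the primary
# cluster's storage to a secondary cloud with rclone. The remote names
# ("gcs-hub", "aws-hub") are hypothetical; `rclone sync` is real.
import subprocess
import time

SRC = "gcs-hub:shared"  # hypothetical rclone remote on the primary cloud
DST = "aws-hub:shared"  # hypothetical rclone remote on the secondary cloud

while True:
    # One-way sync: make DST match SRC (deletions propagate to DST).
    subprocess.run(["rclone", "sync", SRC, DST, "--verbose"], check=False)
    time.sleep(300)  # re-sync every five minutes
```

One-way sync sidesteps conflict resolution; letting users write to the shared directory from every cluster would need something closer to bidirectional sync, which is much harder to get right.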

I've made a release of the spawner already at https://github.com/yuvipanda/jupyterhub-multicluster-kubespawner, and am waiting for cloud credits to land before I can do a deployment.

consideRatio commented 2 years ago

2i2c team sprint meeting notes:

choldgraf commented 2 years ago

Update: pinning this one for a bit

@yuvipanda and I just had a conversation about this work, and we agreed that it'd be best to prioritize some other development efforts before we complete this one, especially since the LEAP hub needed to be deployed quickly enough that we just did it "the old-fashioned way".

We're going to focus on these two pieces

And we'll revisit this one at a later date.