jupyterhub / team-compass

A repository for team interaction, syncing, and handling meeting notes across the JupyterHub ecosystem.
http://jupyterhub-team-compass.readthedocs.io

Deploy a Binder federation member on AWS? #501

Closed choldgraf closed 1 year ago

choldgraf commented 2 years ago

Context

In recent meetings around Binder sustainability (e.g. #430), @manics has mentioned a few times an interest in running an AWS-based BinderHub. I believe that we could also get access to AWS credits for running a BinderHub federation member on AWS.

Proposal

If @manics is willing to lead deployment and operations efforts, I'd like to set up an AWS-based federation member for mybinder.org. This could distribute the load across more cloud providers, which gives us more options for accepting credits that can power the service.

Questions to answer

MridulS commented 2 years ago

btw I can help around too with the AWS backed binderhub :)

minrk commented 2 years ago

We have a Terraform setup for GKE (it's not run on CI, but at least the deployment is recorded and automated so it's easier to tear down/redeploy). It may make sense to do the same for AWS. I suspect all we need is the Terraform for an EKS cluster, and the rest should be a standard federation member.

manics commented 2 years ago

I'm happy to take the lead. Most of the work will be deciding between the different options (e.g. EKS EC2 vs Fargate, ECR vs self hosted container registry). I'd probably approach this by starting with some manual deploys, then ripping down everything and rebuilding with Terraform a few times till we're happy.

manics commented 2 years ago

I've ruled Fargate out. I thought you were charged for actual CPU/memory usage, but you're charged for requested CPU/memory: https://github.com/aws/containers-roadmap/issues/79#issuecomment-756022692 😞
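
To make the difference concrete, here is a minimal sketch of the arithmetic for a mostly idle session. The hourly rates and the pod sizes are made-up illustrative numbers, not real AWS prices, and `hourly_cost` is a hypothetical helper.

```python
# Hypothetical per-hour rates, for illustration only (NOT real AWS prices).
VCPU_RATE = 0.04   # assumed $ per vCPU-hour
GIB_RATE = 0.004   # assumed $ per GiB-hour


def hourly_cost(vcpu: float, mem_gib: float) -> float:
    """Cost of running one pod for an hour at the given CPU and memory sizes."""
    return vcpu * VCPU_RATE + mem_gib * GIB_RATE


# Assumed example: a session requests 1 vCPU / 2 GiB
# but actually uses only 0.05 vCPU / 0.5 GiB while it sits idle.
billed_on_requests = hourly_cost(1.0, 2.0)
billed_on_usage = hourly_cost(0.05, 0.5)

print(f"billed on requests: ${billed_on_requests:.4f}/hour")
print(f"billed on usage:    ${billed_on_usage:.4f}/hour")
print(f"ratio: {billed_on_requests / billed_on_usage:.1f}x")
```

With request-based billing, an idle session costs the same as a busy one, so many small idle pods would be paid for at their full requested size.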

sgibson91 commented 2 years ago

In the September Team Meeting (#554) we discussed that Pangeo would like to contribute their AWS Binder to the federation, since providing a separate, unauthenticated Binder didn't seem to add much value. This Binder would be operated by 2i2c and would not have Dask Gateway available, as the current instantiation does. Notes from the discussion are in https://github.com/jupyterhub/team-compass/pull/567

betatim commented 2 years ago

That would be super cool! As to "what is the value of a standalone binderhub?" - I had hoped the pangeo people knew the answer to that when they set it up :D

sgibson91 commented 2 years ago

I think it was specifically "what is the value of another unauthenticated, globally accessible binderhub?". There are two things you can do there: offer something fundamentally different from what mybinder.org does (in the past that was Dask Gateway, though there is usually a cost associated with this), or join the federation and increase the resources of mybinder.org (I also don't think the federation existed, or it was very new, when Pangeo Binder was first set up). I agree that the answers to that question look very different if you have a more specific audience in mind though.

sgibson91 commented 2 years ago

FYI, as of today I got access to an AWS account being managed by @scottyhq and will start providing config to deploy a cluster and hub to that account in the mybinder.org-deploy repo soon.

It should be noted that this account only has ~$10k and has been donated as a means for prototyping while Pangeo/Columbia stand up an AWS account connected to their grant - so there will be a migration process in the future.

choldgraf commented 1 year ago

closing this as superseded by: