astronomy-commons / genesis-jupyterhub-automator

When you need to quickly deploy a JupyterHub instance for tutorials, workshops, classes, and more.
MIT License

JupyterHub Deployment Automator

Zero-to-JupyterHub in minutes.

[Figure: Automator in action]

The tools in this repository automate the process of creating a JupyterHub instance on Digital Ocean's managed Kubernetes (a.k.a. k8s) service. We primarily aim to simplify the deployment of one-off JupyterHubs for events -- demos, tutorials, or even classes.

This is largely an automation of the process described at zero-to-jupyterhub; we strongly recommend reading that excellent guide to understand the config files this automator generates. That said, our goal is to make deployment possible even if you're not intimately familiar with Kubernetes or JupyterHub: if the defaults work for you, this code and document may be all you need.

What Do You Get?

The automator creates and deploys a JupyterHub instance with presets as follows in its default configuration:

To deploy JupyterHub you'll need:

  1. A Digital Ocean ("DO") account.
  2. Command line utilities for DO (doctl) and Kubernetes (kubectl).
  3. A domain you own, where your hub will reside (e.g., alerts.wtf if your hub is to be at hub.alerts.wtf), which must be managed by Digital Ocean's DNS service.
  4. A registered GitHub OAuth app, to represent your deployment. See here for details on how to create one.

The ./configure script included here will try to verify that you have all of the above before allowing you to proceed.

Installing: Zero-to-JupyterHub in 10-30 minutes

In the example below, we assume:

Replace these with your actual domain name, host name, and e-mail.

1. Install required command line utilities

Assuming you're on a Mac and using Homebrew, installing is as simple as:

brew install doctl
brew install kubernetes-cli
brew install kubernetes-helm
brew install certbot
brew install jq
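
If you'd like to confirm that the installs worked before moving on, a small shell function along these lines can help. It is only a sketch: missing_tools is a name we made up for illustration, and the list of binaries (doctl, kubectl, helm, certbot, jq) is what the Homebrew packages above install.

```shell
# Print the name of every required tool that is not on PATH.
# With no arguments, checks the five tools installed above.
# (missing_tools is a hypothetical helper, not part of this repository.)
missing_tools() {
    if [ "$#" -eq 0 ]; then
        set -- doctl kubectl helm certbot jq
    fi
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools with no arguments prints nothing when everything is installed; otherwise it prints one missing tool per line.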

2. Create or log into your Digital Ocean account

Go to Digital Ocean, open an account, and remain logged in on the website.

Then authenticate via the command-line tools by running:

doctl auth init

This will ask you for your "Personal Access Token", an analog of your username/password when using command line tools. You can create a new token at the personal access token page.

3. Purchase a domain, have Digital Ocean DNS manage it

If you don't already own a domain, purchase one from one of the many domain name registrars out there. If you're unsure which one to choose, try namecheap.com -- we've had good experiences with it.

Then follow Digital Ocean's instructions to transfer the DNS management to Digital Ocean.
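
To double-check that Digital Ocean now manages the domain, you can look for it in the output of doctl compute domain list. The sketch below assumes a hub name exactly one level below the managed domain (hub.alerts.wtf under alerts.wtf) and assumes the domain column is named Domain in doctl's --format option; both function names and the DOCTL override variable are hypothetical, introduced only for this example.

```shell
# Strip the leftmost label from a hostname: hub.alerts.wtf -> alerts.wtf.
# Assumes the hub sits exactly one level below the managed domain.
parent_domain() {
    printf '%s\n' "${1#*.}"
}

# Succeed if the given domain appears in the account's DNS domain list.
# DOCTL is a hypothetical override for the doctl binary (handy for testing);
# the "Domain" column name is an assumption, see `doctl compute domain list --help`.
domain_is_managed() {
    "${DOCTL:-doctl}" compute domain list --format Domain --no-header \
        | grep -qxF "$1"
}
```

For example, domain_is_managed "$(parent_domain hub.alerts.wtf)" && echo "managed by DO" should print the message once the transfer has taken effect.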

4. Create a GitHub OAuth Application

Next, follow the instructions on GitHub to register a new "OAuth Application". In layman's terms, this is how GitHub will identify your JupyterHub and know to allow users to log into it using their GitHub credentials.

The most important field in the form is the one named 'Authorization callback URL'. Make sure you set it to https://hub.alerts.wtf/hub/oauth_callback, where hub.alerts.wtf is the hostname of the JupyterHub you'll be standing up. Use the same hostname in the 'Homepage URL' field (with 'https://' prepended).
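
Since a typo in the callback URL is an easy mistake to make, it may help to generate both form values from the hostname rather than typing them by hand. This is a minimal sketch; oauth_urls is a hypothetical helper name, and the URL patterns are the ones described above.

```shell
# Print the two URLs to paste into GitHub's OAuth app form
# for a given hub hostname. (Hypothetical helper for illustration.)
oauth_urls() {
    printf 'Homepage URL:               https://%s\n' "$1"
    printf 'Authorization callback URL: https://%s/hub/oauth_callback\n' "$1"
}
```

Call it as oauth_urls hub.alerts.wtf, substituting your own hostname.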

After you've created the app, paste the values of the generated 'Client ID' (a 20-character string) and 'Client Secret' (a 40-character string) into a text file, one per line. Example:

$ cat github_app.secrets
ee07db3a7edbe4882f88
2ae4f74f88069d71f854bff5b7173fee524b2ca3
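
A quick sanity check of the secrets file can catch copy-paste slips (a stray blank line, a truncated secret) before they surface as login errors. The sketch below only checks the shape described above: exactly two lines, 20 and 40 characters long. check_secrets_file is a hypothetical name, and the lengths are assumptions taken from this document, so adjust them if GitHub issues credentials in a different format.

```shell
# Succeed only if the file has exactly two lines of 20 and 40 characters.
# (Hypothetical helper; the lengths come from the example above.)
check_secrets_file() {
    file=$1
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    awk '
        NR == 1 && length($0) != 20 { bad = 1 }   # Client ID
        NR == 2 && length($0) != 40 { bad = 1 }   # Client Secret
        END { if (bad || NR != 2) exit 1 }
    ' "$file"
}
```

For example: check_secrets_file github_app.secrets && echo "looks OK". Because the file holds a secret, you may also want to restrict its permissions with chmod 600 github_app.secrets.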

5. Configure your JupyterHub

This repository comes with a ./configure script that automates the tedious work of generating all the required configuration files.

Having done the prep work above, run:

./configure --provider=do \
            --hub-fqdn=hub.alerts.wtf \
            --github-oauth-creds=github_app.secrets \
            --letsencrypt-email=kathryn.janeway@uw.edu

This will generate configuration for your JupyterHub in hub.alerts.wtf/.

6. Deploy

You're now ready to deploy it by running:

cd hub.alerts.wtf
make all

If everything works out as it's supposed to, in about 10 minutes your JupyterHub will be ready at https://hub.alerts.wtf.

If not, open an issue here, and make sure to include as much of the error messages, logs, and other relevant information as you can.
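
Rather than reloading the page by hand while the deployment comes up, you could poll the hub URL from the shell. The function below is a generic retry loop, not something this repository provides; retry_until is a hypothetical name, and the attempt count and delay in the usage note are arbitrary choices.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (Hypothetical helper for illustration.)
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Give up once the allowed number of attempts has been used.
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}
```

For instance, retry_until 60 15 curl --silent --fail --output /dev/null https://hub.alerts.wtf checks every 15 seconds for up to roughly 15 minutes and returns as soon as the hub answers over HTTPS.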

Deleting everything

Once you're done, make sure to clean up after yourself (otherwise your cluster will keep accruing charges).

To destroy everything that was created (both JupyterHub and the Kubernetes cluster), run:

./scripts/gen-destroy

and answer 'yes' when asked to confirm.

WARNING: This is irreversible! All data residing in the deployment (e.g., new or modified notebooks) will be lost.
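
After destroying a deployment, it can be reassuring to confirm that nothing billable is left in your account. The sketch below simply lists Kubernetes clusters, block-storage volumes, and load balancers with doctl; whether any of these can survive gen-destroy is not something this document states, so treat it as a belt-and-braces check. The function name and the DOCTL override variable are hypothetical.

```shell
# List resources that would keep accruing charges if left behind.
# DOCTL is a hypothetical override for the doctl binary (handy for testing).
list_billable_leftovers() {
    doctl_cmd=${DOCTL:-doctl}
    "$doctl_cmd" kubernetes cluster list
    "$doctl_cmd" compute volume list
    "$doctl_cmd" compute load-balancer list
}
```

If all three listings come back empty (or show only resources you recognize from other projects), the cleanup is complete.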

About Cost (good news: it's not huge!)

Deploying in the cloud for the first time can be stressful because of fears about cost. For short-term deployments, the costs can be fairly low.

Example 1: Daily cost of default deployment

The daily cost for the default deployment (3 nodes) assuming 10 active users (and using pricing as of Nov 18th, 2019):

This gives you a sense for how much you'll pay while testing/developing a deployment.

Example 2: Running a short tutorial

Running a 3-hr, 50-person tutorial: (0.06 + 0.0015) * 50 * 3 = $9.225

Example 3: Running a 5-day workshop

Running a 5-day, 25-person workshop: (0.06 + 0.0015) * 25 * 24 * 5 = $184.50

Add to these the yearly cost of purchasing a domain (typically $10-20/yr).

Costs can change (in either direction) by choosing a different node type or a different amount of per-user storage. See https://www.digitalocean.com/pricing/ for options and current pricing.
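
To estimate the cost of your own event, the arithmetic from Examples 2 and 3 can be wrapped in a one-line function. This is a rough sketch: event_cost is a hypothetical name, the two per-user hourly rates (0.06 and 0.0015) are copied as-is from the examples and presumably reflect the November 2019 pricing, and the result excludes the domain.

```shell
# Estimate the cost in USD of an event: (0.06 + 0.0015) * users * hours.
# usage: event_cost USERS HOURS
# The rates are copied from the examples above; update them for current pricing.
event_cost() {
    awk -v users="$1" -v hours="$2" \
        'BEGIN { printf "%.3f\n", (0.06 + 0.0015) * users * hours }'
}
```

event_cost 50 3 prints 9.225 and event_cost 25 120 prints 184.500, matching Examples 2 and 3 (120 hours being 5 days of 24 hours).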

Details

Customizing your Deployment

Many customizations can be made via command-line arguments to ./configure. To discover what's available, run:

./configure --help

Configure generates configuration files in the etc/ subdirectory. Among other things, this directory contains:

You can edit and customize it as you see fit, and run make deploy to have the changes take effect.

Useful Kubernetes commands

kubectl get pod --namespace $JHUB_K8S_NAMESPACE
kubectl get service --namespace $JHUB_K8S_NAMESPACE
kubectl get pvc --namespace $JHUB_K8S_NAMESPACE
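
If you find yourself running these three commands together, a small wrapper saves some typing. The sketch assumes JHUB_K8S_NAMESPACE is already set in your shell; jhub_status and the KUBECTL override variable are hypothetical names introduced for this example.

```shell
# Show pods, services, and persistent volume claims in the hub's namespace.
# KUBECTL is a hypothetical override for the kubectl binary (handy for testing).
jhub_status() {
    # Abort with a message if the namespace variable is unset or empty.
    : "${JHUB_K8S_NAMESPACE:?set JHUB_K8S_NAMESPACE first}"
    for kind in pod service pvc; do
        "${KUBECTL:-kubectl}" get "$kind" --namespace "$JHUB_K8S_NAMESPACE"
    done
}
```

Setting KUBECTL=echo before calling jhub_status prints the commands instead of running them, which is a cheap way to check what would be executed.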

Future Work