iobis / iwg-products

Repository for the Intersessional Working Group on data products.

Multi user hub options #4

Open silasprincipe opened 10 months ago

silasprincipe commented 10 months ago

There are different options for implementing a virtual environment to run and showcase the products. I believe the two main options are JupyterHub and BinderHub (which is built on top of JupyterHub and creates environments from repositories). To help us start thinking about the possibilities and limitations, I'm adding the documentation links and some general information here (extracted from the websites).

JupyterHub

Summary

JupyterHub gives users access to computational environments and resources without burdening the users with installation and maintenance tasks. Users - including students, researchers, and data scientists - can get their work done in their own workspaces on shared resources which can be managed efficiently by system administrators. JupyterHub runs in the cloud or on your own hardware, and makes it possible to serve a pre-configured data science environment to any user in the world.

Links

  1. Documentation
  2. Implementation
  3. Repository

Examples

JupyterHub implementations in several universities: https://jupyterhub.readthedocs.io/en/stable/reference/gallery-jhub-deployments.html

Copernicus use of JupyterHubs for training

BinderHub

Summary

BinderHub is a Kubernetes-based cloud service that allows users to share reproducible interactive computing environments from code repositories. It is the primary technology behind mybinder.org. BinderHub allows you to BUILD and REGISTER a Docker image from a Git repository, then CONNECT with JupyterHub, allowing you to create a public IP address that allows users to interact with the code and environment within a live JupyterHub instance. You can select a specific branch name, commit, or tag to serve.
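To make the branch/commit/tag selection concrete, here is a minimal sketch of how a launch link for a GitHub repository could be built. It assumes the `/v2/gh/<org>/<repo>/<ref>` URL pattern used by mybinder.org; the helper name `binder_launch_url` and the `main` branch are hypothetical, and a self-hosted BinderHub would use its own host instead.

```python
from urllib.parse import quote


def binder_launch_url(org, repo, ref="HEAD", host="https://mybinder.org"):
    """Build a launch URL for a GitHub repository on a BinderHub.

    `ref` can be a branch name, a tag, or a commit hash.
    The URL layout is an assumption based on mybinder.org links.
    """
    # Percent-encode each part so unusual characters do not break the path.
    return f"{host}/v2/gh/{quote(org)}/{quote(repo)}/{quote(ref, safe='')}"


if __name__ == "__main__":
    # "main" is an assumed branch name for this repository.
    print(binder_launch_url("iobis", "iwg-products", ref="main"))
```

A link like this could be placed in each product's README so that the environment is built from the chosen ref when a user opens it.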

Links

  1. Documentation
  2. Implementation
  3. Repository

Examples

  1. Community examples
  2. PANGEO (note: Binder not working at the moment)
  3. MPA Europe
  4. JuliaClimate

silasprincipe commented 10 months ago

@pieterprovoost, just to start the discussion: it seems to me that a BinderHub would be interesting because it can be easily connected to GitHub repositories.

silasprincipe commented 10 months ago

One thing I was thinking is that we could make the full export (and maybe a few other things) readily available to any user of the JupyterHub or BinderHub (see https://tljh.jupyter.org/en/latest/howto/content/share-data.html). That would really speed up any analysis; if we add GeoParquet to this, things would be even better.
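As a rough sketch of what this could look like from a user's notebook, the snippet below lists the Parquet files found in a shared read-only folder. The folder path `/srv/data/obis_export` and the helper name `list_shared_datasets` are hypothetical; the actual location would depend on how the hub administrators set up the shared data.

```python
from pathlib import Path


def list_shared_datasets(shared_dir, pattern="*.parquet"):
    """Return the sorted files matching `pattern` under a shared data folder.

    Returns an empty list if the folder does not exist, so a notebook
    can run both on the hub and on a machine without the shared data.
    """
    root = Path(shared_dir).expanduser()
    if not root.is_dir():
        return []
    # Search subfolders too, since an export may be split into partitions.
    return sorted(p for p in root.rglob(pattern) if p.is_file())


if __name__ == "__main__":
    # Hypothetical location of the shared full export on the hub.
    for path in list_shared_datasets("/srv/data/obis_export"):
        print(path)
```

If the files were stored as GeoParquet, each returned path could then be opened with a library such as GeoPandas (for example `geopandas.read_parquet(path)`), assuming it is installed in the hub environment.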