asascience-open / nextgen-dmac

Public repository describing the prototyping efforts and direction of the Next-Gen DMAC project, "Reaching for the Cloud: Architecting a Cloud-Native Service-Based Ecosystem for DMAC"
MIT License
Set up cloud Pangeo/nebari instance #18

Closed jonmjoyce closed 1 year ago

jonmjoyce commented 1 year ago

Create a sandbox environment in AWS following instructions from Pangeo.

rsignell-usgs commented 1 year ago

@jonmjoyce, I'm a big fan of Nebari (formerly called Qhub), open-source Infrastructure-as-Code that uses Terraform to deploy JupyterHub with Dask Gateway on Kubernetes on AWS, Google, Azure or DigitalOcean.

A few of the things that make Nebari attractive over Dask Hub:

We've been running Nebari for ESIP for the last two years at https://jupyter.qhub.esipfed.org.
I've given you access if you want to check it out! There is an introductory notebook at https://jupyter.qhub.esipfed.org/user/rsignell-usgs/lab/tree/shared/users/Welcome.ipynb

rsignell-usgs commented 1 year ago

@jonmjoyce I saw this commit: https://github.com/asascience-open/nextgen-dmac/commit/1fe08eb5a838ee9c74fbd67f0269f7d5ba24d5cf which seems to suggest you guys are still working on deploying Kubernetes?

What is the reason to do that instead of deploying Kubernetes (and JupyterHub + Dask Gateway) using qhub/nebari?

jonmjoyce commented 1 year ago

Appreciate that you are keeping up-to-date with the changes on GitHub! The commit you referenced was from some ongoing work of mine. Prior to our meeting I had already configured a K8s cluster and tested a Pangeo install (single-user JupyterHub + dask), and I have since changed that configuration to a dask-gateway install. Since our meeting, @cheryldmorse has been setting up nebari, and this ticket is now tracking that progress.

I think Kubernetes is going to be part of our stack going forward regardless. Aside from the analysis capabilities we'll get from nebari, I would like our service mesh to target K8s as a platform. This includes the data ingest workflows that @benjwadams is exploring and the data access protocols that @mpiannucci is working on. In some cases we will also want a headless dask cluster for automated processing and that's another reason for running dask-gateway.

rsignell-usgs commented 1 year ago

@jonmjoyce, I absolutely agree that Kubernetes will be part of the stack, but Nebari is a painless way to configure, deploy and maintain your Kubernetes cluster! I know, because I did it the hard way for two years before switching to Qhub/Nebari!

And you get JupyterHub & Dask Gateway as a bonus!

@cheryldmorse please reach out if you have issues deploying Nebari!!