costrouc closed this issue 1 year ago.
Would it be possible to adopt this workflow with qhub-hpc and a Terragrunt deployment? Worth thinking about how this extension mechanism could support that.
Implementing an extension mechanism for Nebari seems like a very good solution for companies/people that want to easily build and distribute features on top of Nebari.
We, at NaasAI, would like to expose:
We are a company managing JupyterHub infrastructure on top of Kubernetes, and we have built features on top of JupyterLab to make it easier for users to schedule notebook executions, share generated assets, build data products, etc.
Integrating with Nebari seems like a good fit for us as we will be able to leverage what Nebari's team is doing while allowing us to focus on our core business.
We would like to distribute naas on top of Nebari.
For that, we must be able to customize Nebari's deployment, but also get information about Nebari's deployment to configure our resources accordingly. For example, we need to know:
In terms of infrastructure management, we at NaasAI think that Infrastructure as Code (IaC) is a must, for multiple reasons that I won't detail here (unless you feel it would make this message clearer). Many companies would rather have an IaC solution than a manual process with a how-to that an operator needs to follow (though that is not always the case, of course).
We want to be able to:
We chose to use Terragrunt to manage our new infrastructure deployment as it seems to be a good fit for most use cases.
We have 3 repositories to manage our infrastructure deployment:

- `infrastructure-modules`: This is the place where we create all the Terraform modules that will later be referenced/used to deploy infrastructure. (Terragrunt infrastructure-module example)
- `_envcommon`: This one holds default values for our Terraform modules and pins specific Terraform module versions. We have one configuration per Terraform module. (Terragrunt _envcommon example)
- `infrastructure-live`: This one references the configuration in `_envcommon` and organizes deployment dependencies and environments. We split by environment/cloud_provider/region/project. (Terragrunt infrastructure-live example)

To integrate with Nebari we tested multiple solutions, but to be able to use our Terragrunt infrastructure we chose to create a Terraform module allowing us to:
This works for us today but has several drawbacks: for example, we have to specify the remote state settings (`S3 bucket`, `S3 prefix`, etc.) ourselves. Even though this is deterministic, it adds a step.

A big strength of the Nebari CLI that we think is worth keeping, no matter what is done next, is that it is very easy to deploy a Nebari infrastructure. You just need to run `pip install nebari` and follow the two-step deployment: configure your deployment, then actually deploy the infrastructure.
This will fit a lot of user needs.
On the other hand, there is a need to help users/companies that want to deploy Nebari and customize it using IaC.
Given that, as of today, the Nebari CLI is a wrapper around Terraform that orders and splits deployments into multiple stages, we could definitely add a parameter so that the CLI does not actually deploy the Nebari infrastructure, but instead templates/renders Terragrunt configuration, for example (Terragrunt being only one of many templating outputs we could support: Terraform, Pulumi, AWS CDK, and so on).
To give an example, let's say I want to deploy Nebari in our Naas Terragrunt infrastructure, in a `dev` environment, on `aws`, in `us-west-1`.
I would like to be able to do:
```shell
# Create the needed folder structure to match our way of splitting deployments.
mkdir -p infrastructure-live/dev/aws/us-west-1/nebari

# Go to the Nebari folder for the dev/aws/us-west-1 deployment.
cd infrastructure-live/dev/aws/us-west-1/nebari

# Install the Nebari CLI.
conda install nebari -c conda-forge

# Configure the deployment.
nebari init --guided-init

# Generate Terragrunt configuration (usually this is when you would run "nebari deploy -c nebari-config.yaml").
nebari generate -c nebari-config.yaml --template-to=Terragrunt
```
Then, if I were to execute the `tree` command in the `infrastructure-live` directory, I would get the following output:
```
.
└── dev
    └── aws
        └── us-west-1
            └── nebari
                ├── infrastructure
                │   └── terragrunt.hcl
                ├── kubernetes-ingress
                │   └── terragrunt.hcl
                ├── kubernetes-initialize
                │   └── terragrunt.hcl
                ├── kubernetes-keycloak
                │   └── terragrunt.hcl
                ├── kubernetes-keycloak-configuration
                │   └── terragrunt.hcl
                ├── kubernetes-services
                │   └── terragrunt.hcl
                ├── nebari-config.yaml
                └── nebari-tf-extensions
                    └── terragrunt.hcl

12 directories, 8 files
```
Now it would be up to me to deploy the infrastructure by running for example:
```shell
cd dev/aws/us-west-1/nebari
aws-vault exec <myprofile> -- terragrunt run-all apply
```
So now what happens with pluggy extensions?
This would allow us to build our own `nebari-naas` pluggy extension and use it like so:
```shell
cd dev/aws/us-west-1/nebari
pip install nebari-naas
nebari extensions nebari-naas generate -c nebari-config.yaml --template-to=Terragrunt
```
This would then generate a new directory with its own Terragrunt configuration, which could declare the previous Nebari stages as dependencies.
Example:
```
.
└── dev
    └── aws
        └── us-west-1
            └── nebari
                ├── infrastructure
                │   └── terragrunt.hcl
                ├── kubernetes-ingress
                │   └── terragrunt.hcl
                ├── kubernetes-initialize
                │   └── terragrunt.hcl
                ├── kubernetes-keycloak
                │   └── terragrunt.hcl
                ├── kubernetes-keycloak-configuration
                │   └── terragrunt.hcl
                ├── kubernetes-services
                │   └── terragrunt.hcl
                ├── nebari-config.yaml
                ├── nebari-naas            <-- new
                │   └── terragrunt.hcl     <-- new
                └── nebari-tf-extensions
                    └── terragrunt.hcl

13 directories, 9 files
```
Then, if we want to have multiple naas extensions as well, we could distribute:

```shell
pip install nebari-naas
pip install nebari-naas-extension-aaa
pip install nebari-naas-extension-bbb
pip install nebari-naas-extension-ccc
```
Here I mainly talked about generating Terragrunt configuration, but I think this should just be one additional way of deploying Nebari and extensions. Nebari and extensions should also be deployable solely using the Nebari CLI.
I tried to give as much information as possible, but please, if something is not clear enough or you feel that it needs more explanation, tell me and I will try to make it clearer.
I am looking forward to having the possibility to deploy Nebari in a very modular way while still complying with most users/companies needs in terms of deployment strategy. I think that this extension mechanism can be a very strategic move for the adoption of Nebari and the growth of its ecosystem.
This RFD is accepted, unanimously. :)
## Title

Extension Mechanism for Nebari
## Summary

Over the past 3 years we have consistently run into the issue that extending and customizing Nebari is a hard task. Several approaches have been added:

- `terraform_overrides` and `helm_overrides` keywords, to allow for arbitrary overrides of Terraform and Helm values
- `helm_extensions` in stage 8, which allow the addition of arbitrary Helm charts
- `tf_extensions`, which integrate OAuth2 and ingress to deploy a single Docker image

Despite these features we still have user needs that we are not addressing. Additionally, when we want to add a new service, it typically has to be added directly to the core of Nebari. We want to solve this by making extensions first class in Nebari.
## User benefit

I see quite a few benefits from this proposal:
## Design Proposal

Overall I propose we adopt pluggy. Pluggy has been adopted by many major projects, including datasette, conda, (TODO list more). Pluggy would allow us to expose a plugin interface and "install" extensions via setuptools entry points, making extension installation as easy as `pip install ...`
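To make this concrete, a minimal sketch of the pluggy wiring could look like the following. The hook name `nebari_stage` and both classes are purely illustrative, not an existing Nebari interface:

```python
import pluggy

# Markers scoped to a "nebari" plugin project.
hookspec = pluggy.HookspecMarker("nebari")
hookimpl = pluggy.HookimplMarker("nebari")

class NebariSpecs:
    """Hook specifications that extensions can implement."""

    @hookspec
    def nebari_stage(self):
        """Return the name of an extra deployment stage contributed by a plugin."""

class NaasPlugin:
    """Example third-party extension (in practice, shipped in its own package)."""

    @hookimpl
    def nebari_stage(self):
        return "nebari-naas"

pm = pluggy.PluginManager("nebari")
pm.add_hookspecs(NebariSpecs)
pm.register(NaasPlugin())
# In a real setup, plugins would instead be discovered from installed packages:
# pm.load_setuptools_entrypoints("nebari")

print(pm.hook.nebari_stage())  # -> ['nebari-naas']
```

A hook call collects one result per registered implementation, so core Nebari could fold extension-contributed stages into its own ordered stage list.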
### Usage from a high-level user standpoint

Once a user installs the extensions, we can view the installed extensions via:
### Plugin Interfaces

Within Nebari we will expose several plugins:

#### Subcommands

A plugin interface for arbitrary additional `typer` commands. All commands will be passed the Nebari config along with all command-line arguments specified by the user. Conda has a similar typer-based approach for their system.

#### Stages

Nebari will use pluggy within its core and separate each stage into a pluggy `Stage`. Each stage will keep its original name.

## Alternatives or approaches considered (if any)
As far as plugin/extension systems go, I am only aware of two major ones within the Python ecosystem:
## Best practices

This will encourage the practice of extending Nebari via extensions instead of direct PRs to the core.
## User impact

It is possible to make this transition seamless to the user without changing behavior.
## Unresolved questions

I feel confident in the approach, since I have seen other projects use pluggy successfully for similar work.