iameskild closed this 2 weeks ago
I think this is a great path forward and pretty much matches what we had before.
> users might add/remove packages to an `environment.yaml`, `apt.txt`, etc. that resides in an `images` folder in their repo.*
I wish there were some way for us to leverage repo2docker without all the bloat since we are recreating so much of what they have. I'd like if we did some research on if there are similar projects since this would be a nice area for us to not have to maintain (possible open source contribution, repo2docker-lite?)
For me, the areas of complexity that I am concerned with: the hard steps result in a matrix of CI x container registry, and we need some way to handle this. I think what you've outlined is a good path forward, but secret management and extensibility are key features needed to support this.
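To make the size of that CI x container registry support surface concrete, here is a quick sketch; the CI systems and registries listed are illustrative, not a committed support list:

```python
from itertools import product

# Hypothetical lists -- not a committed support matrix.
ci_systems = ["github-actions", "gitlab-ci"]
registries = ["ghcr", "gitlab-registry", "aws-ecr", "azure-cr", "do-cr", "gcp-cr"]

# Every CI x registry pair needs its own auth/secret handling and its own
# workflow template, so the maintenance burden grows multiplicatively.
matrix = list(product(ci_systems, registries))
print(len(matrix))  # prints 12
```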
> I wish there were some way for us to leverage repo2docker without all the bloat since we are recreating so much of what they have. I'd like if we did some research on if there are similar projects since this would be a nice area for us to not have to maintain (possible open source contribution, repo2docker-lite?)
I totally agree! I'll look around for other projects that do something similar.
> The hard steps result in a matrix of CI x container registry and we need some way to handle this. I think what you've outlined is a good path forward but secret management and extensibility are key features needed to support this.
I agree, secret management and extensibility are definitely key features. My initial thought was to add support for the following (perhaps as a "beta" rollout):
Then, when we have our secret management solution sorted out, we can add support for private registries for the main cloud providers we support (AWS ECR, Azure CR, DO CR, GCP CR).
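For GitHub-Actions, private-registry credentials would presumably flow through repository secrets into `docker/login-action`. A minimal sketch, assuming hypothetical secret names `REGISTRY_USER` and `REGISTRY_TOKEN`:

```yaml
# Fragment of a build-push workflow: authenticate to a registry before pushing.
# REGISTRY_USER / REGISTRY_TOKEN are made-up secret names for illustration.
steps:
  - uses: docker/login-action@v3
    with:
      registry: ghcr.io
      username: ${{ secrets.REGISTRY_USER }}
      password: ${{ secrets.REGISTRY_TOKEN }}
```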
considerations:
tagging myself @trallard to remind me to update the proposal
I have a proposal that will solve some of the parts mentioned here (but not all). I still think that customizing the base docker image is an important issue.
Right now we use a base docker image for the jupyterlab image, dask-gateway image, etc. Since we have conda-store running in the cluster, we need to depend on it more heavily.
We should be using conda-store to build the jupyterlab environment used when launching from jupyterhub. This would give users an easy workflow for adding jupyterlab extensions and commands to their base path, and it would also make the docker image much smaller.
That said this does not solve the issue with apt packages.
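Under this approach, the user-facing artifact for the conda side could remain a plain conda environment spec that conda-store builds. The contents below are illustrative only:

```yaml
# Illustrative environment spec a user might manage under conda-store.
name: jupyterlab-user
channels:
  - conda-forge
dependencies:
  - python=3.10
  - jupyterlab
  - jupyterlab-git   # example of a user-added JupyterLab extension
  - dask
```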
In discussion today, I think we are in agreement on the value of using conda-store for the conda environments.
Then we want to have https://github.com/NVIDIA/container-canary canary tests for the images. This would bring the requirements on the docker images in line with https://docs.anaconda.com/free/anaconda/install/linux/
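container-canary validates a running image against a manifest of checks; the invocation looks roughly like the following (the manifest filename here is hypothetical):

```shell
# Validate the JupyterLab image against a canary Validator manifest.
# nebari-jupyterlab.yaml is a hypothetical manifest name for illustration.
canary validate --file nebari-jupyterlab.yaml quay.io/nebari/nebari-jupyterlab:latest
```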
Also discussed yesterday -- we can make this a two-step process:
Perhaps a next step is to simply add some best-practices to the docs on how to build/push your own images.
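Such a docs page might include a minimal GitHub-Actions example along these lines; the image name, tag, and paths are placeholders, not a prescribed layout:

```yaml
# Minimal sketch: rebuild and push an image when files under images/ change.
name: build-push
on:
  push:
    branches: [main]
    paths: ["images/**"]
jobs:
  build-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: images   # folder holding the Dockerfile, environment.yaml, apt.txt
          push: true
          tags: ghcr.io/${{ github.repository }}/jupyterlab:latest
```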
I'm cleaning up the RFD's and I'm going to close this for now due to no activity, but feel free to re-open if needed!
Allow users to customize and use their own images
Summary
At present, this repo builds and pushes standard images for JupyterHub, JupyterLab and Dask-Workers. These images are the default used by all Nebari deployments.
However, many users have expressed an interest in adding custom packages (`conda`, `apt`, or otherwise) to their default JupyterLab image, and doing so at the moment is not really feasible (at least not without a decent amount of extra legwork). To accommodate users, we have often simply resorted to adding their preferred packages to these default images. This solution is not scalable.

User benefit
By giving Nebari users the ability to customize these images, we greatly open up what is possible for them. This will give users further control over what packages get installed and how they use and interact with their Nebari cluster.
I have already heard from a decent number of users that this would be a much-appreciated feature.
Design Proposal
Ultimately, we want to allow users to add whatever packages (and possibly other configuration changes) they want to their JupyterHub, JupyterLab, and Dask-Worker images. We also want to make this process as simple and straightforward as possible.
Users should NOT need to know:
In the `nebari` code base we already have a way of generating `gitops` and `nebari-linter` workflows for GitHub-Actions and GitLab-CI (for clusters that leverage `ci_cd` redeployments). We currently do this by building up these workflows from basic pydantic classes that were modeled off of the JSON schema for GitHub-Actions `workflows` and GitLab-CI `pipelines`, respectively.

Why not do the same thing for building and pushing docker images?
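The class-based rendering described above can be sketched as follows. nebari itself uses pydantic; this illustration uses only stdlib dataclasses so it stands alone, and the class and field names are simplified stand-ins, not nebari's actual API:

```python
import json
from dataclasses import dataclass, field, asdict

# Model a tiny slice of a GitHub-Actions workflow as plain classes, then
# serialize it back to the schema's key names (nebari does this with pydantic).
@dataclass
class Step:
    uses: str
    with_: dict = field(default_factory=dict)  # "with" is a Python keyword

@dataclass
class Job:
    runs_on: str = "ubuntu-latest"
    steps: list = field(default_factory=list)

def render(job: Job) -> dict:
    """Rename python-safe field names back to the GitHub-Actions schema keys."""
    doc = asdict(job)
    doc["runs-on"] = doc.pop("runs_on")
    for step in doc["steps"]:
        step["with"] = step.pop("with_")
    return doc

job = Job(steps=[Step(uses="actions/checkout@v4")])
print(json.dumps(render(job)))
```

The same pattern extends to a `build-push` job: add classes for the workflow's triggers and registry steps, and emit YAML instead of JSON.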
With some additional work, we can render a `build-push` workflow (or pipeline) that can do just that. This proposed `build-push` workflow would look something like:

- users might add/remove packages to an `environment.yaml`, `apt.txt`, etc. that resides in an `images` folder in their repo.*
- `docker/build-push-action` (or similar for GitLab-CI) to build and push images to GHCR (or similar for GitLab-CI)

As I currently see it, this would require:
- `nebari-config.yaml` (perhaps under the `ci_cd` section) that can be used as a trigger to render this new workflow file
- `build-push` workflow file
- `gitops` or `nebari-linter`
- `quay.io/nebari`
- `images` folder that contains an `environment.yaml`, `apt.txt`, etc.

Alternatives or approaches considered (if any)
Best practices
User impact
No user impact unless they decide to use this feature.
Unresolved questions / other considerations
There are a few other enhancements that we could make: