Open jash2105 opened 7 months ago
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
Hey @consideRatio, can I work on this and submit a PR? I think this would greatly benefit the community. Let me know what you think!
Hey @jash2105, thank you for investing your time in this project and the JupyterHub ecosystem of open-source software!!
- Provide step-by-step instructions on setting up JupyterHub using FluxCD/ArgoCD for resource and configuration reconciliation.
:tada: I think it would be great to provide docs to complement existing docs with details that enable readers to deploy the helm chart with FluxCD or ArgoCD in dedicated sections.
I suspect it makes sense to have separate pages for FluxCD and ArgoCD, but if the two setups share more content than they differ, they could live on the same page.
Note that we have some past discussions of relevance about ArgoCD, for example:
- The chart uses the `lookup` function in the chart's templates, but that requires template rendering to be done with interaction against the k8s api-server, whereas tools like ArgoCD may do it in isolation beforehand. This was clarified in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2887#issuecomment-1254894945, where adjustments like https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2887#issuecomment-1824055403 could be needed.
I'm not sure where to put the docs, but maybe under "Installing JupyterHub with ArgoCD" under "Setup JupyterHub". Alternatively, a section in the administration section about adjusting the deployment to be deployed with ArgoCD instead of `helm`, perhaps?
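On the `lookup` point above, here is a minimal sketch of one possible adjustment, which is not necessarily the one described in the linked comments. When templates are rendered without access to the api-server, `lookup` returns nothing, so secrets the chart would otherwise read back from the cluster may be regenerated on each render. Setting them explicitly in the values file is one way to keep them stable. The option names `proxy.secretToken` and `hub.cookieSecret` are assumptions to verify against the configuration reference of the chart version you deploy, and the values are placeholders.

```yaml
# values.yaml (sketch): pin secrets that the chart would otherwise
# generate and then re-read from the cluster via `lookup`.
# Both option names are assumptions; verify them against the chart docs.
proxy:
  secretToken: "<output of: openssl rand -hex 32>" # placeholder, not a real token
hub:
  cookieSecret: "<output of: openssl rand -hex 32>" # placeholder, not a real secret
```

Real secrets should not be committed in plain text, so in a GitOps repository these values would usually come from whatever secret management the cluster already uses.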
- Include practical configurations for a multi-user, highly available JupyterHub environment suitable for enterprise-level deployment, especially those requiring substantial GPU resources.
I'd appreciate it if you focused on, for example, ArgoCD and/or FluxCD initially. GPU is a complicated topic, so if the documentation is to improve with regard to GPUs, I'd like such a contribution to be isolated and focused, without coupling to other pieces. This makes review easier, which generally helps PRs get merged.
If there are GPU-related notes specific to ArgoCD, I suggest considering those separately as well, since a less complicated contribution that helps deploy with ArgoCD without GPUs is valuable by itself.
- Offer comprehensive debugging documentation to assist teams in quickly resolving issues.
There are some general debugging docs. If there are specific ArgoCD debugging details, they can be part of an ArgoCD section - but otherwise I think we should try to build on the general debugging docs.
Btw, if you write about ArgoCD, for example, try to be aware of what ArgoCD already documents. The more we can link out to their docs to explain something, the easier the docs are to maintain long term as ArgoCD makes changes etc.
Absolutely, your overview is very thorough. Here’s my proposed timeline for the documentation process:
Initial Documentation: I plan to start with FluxCD, focusing initially on a straightforward installation guide that covers the basic setup without any custom configurations. This will include detailed steps on how to bootstrap a cluster using GitHub or GitLab with Flux, followed by a basic Helm chart installation. The goal is to establish a minimal viable setup with the necessary pods and services, along with some preliminary debugging steps.
Review and Iteration: Once the initial documentation is complete, I’ll submit it for review. Based on the feedback, I can make any necessary revisions.
Subsequent Documentation: Continuing from there, I'll create additional pull requests to gradually expand our documentation. This will include guides on customizing resources, integrating GPU support, and replicating the setup with Argo.
Does this sequence of steps fit well with our overall strategy? Please let me know if there are any adjustments you’d like me to consider or if there are specific areas you think we should prioritize.
ArgoCD focuses on GitOps-style deployments. What do you think about having the instructions, scripts and manifests in your own repository, and linking to them from the Z2JH docs? One challenge with having all docs in a single repo is that it's not possible to automatically test them and it can be a pain for people to copy and paste code, so things can easily get out of sync.
What might be particularly nice in a standalone repo is to have live manifests, and perhaps you could even deploy your own ArgoCD cluster in GitHub CI, and deploy the Z2JH config?
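A rough sketch of what such a CI job could look like, assuming a GitHub Actions workflow that starts a throwaway kind cluster, installs ArgoCD from its upstream install manifest, and applies an Application manifest stored in the repo. The workflow path, job name, and the file `argocd/jupyterhub-application.yaml` are hypothetical names chosen for illustration.

```yaml
# .github/workflows/argocd-test.yaml (hypothetical path)
name: Test ArgoCD deployment of Z2JH
on:
  pull_request:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Start a throwaway Kubernetes cluster inside the CI runner
      - uses: helm/kind-action@v1

      # Install ArgoCD; server-side apply avoids size limits on large CRDs
      - name: Install ArgoCD
        run: |
          kubectl create namespace argocd
          kubectl apply --server-side -n argocd \
            -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
          kubectl wait -n argocd --for=condition=Available deployment --all --timeout=300s

      # Apply the Application manifest kept in this repo (hypothetical file name)
      - name: Deploy JupyterHub through ArgoCD
        run: |
          kubectl apply -n argocd -f argocd/jupyterhub-application.yaml
          kubectl get applications -n argocd
```

A real workflow would probably also wait for the Application to report a synced and healthy state before passing. That step is left out here because the details depend on the ArgoCD and kubectl versions in use.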
Thank you @jash2105 for planning this so clearly!
I didn't expect the "bootstrap" part of "bootstrap a cluster using GitHub or GitLab with Flux", but I may have misunderstood you. I expected something like "how to deploy the jupyterhub chart with Flux", under the assumption that Flux is already used to deploy things into an existing cluster. Maybe a GitHub repository needs to be set up for this, but not a cluster using Flux?
I'm trying to ensure the scope of what is to be documented is sufficiently related to deploying the jupyterhub chart, because anything introduced in this project, even if it's documentation, will require long-term attention in its maintenance. If we document too much beyond what's relevant to deploying the jupyterhub chart, the project takes on too much long-term maintenance burden.
I realize I can't guide this so clearly because I don't know Flux or Argo, but there should be a line drawn somewhere to focus on how to deploy this chart with Flux/Argo, as compared to how to work with Flux/Argo in general.
@consideRatio, I agree with your assessment. Starting with bootstrapping a cluster might indeed be excessive and could shift the focus too heavily onto Flux or similar CD tools. Instead, I propose initiating our efforts by deploying JupyterHub using Flux. This will be covered in my first PR. Subsequent updates can introduce enhancements such as custom deployment configurations, GPU resources, and eventually ArgoCD integration. Since I haven't set up ArgoCD on my cluster yet, we can prioritize Flux in the initial phase and then explore ArgoCD later on. Does this approach sound good to you? If so, you can expect a PR from me within the next few days or the coming week!
And to answer your question: yes, we are not setting up a cluster; we will just be setting up a Git repository where we store all our manifests. If we make any changes, the cluster will automatically recognize them and apply them to the existing deployment.
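As a minimal sketch of what such a repository could contain, assuming Flux is already installed in the cluster and watching the repository: a `HelmRepository` pointing at the chart repository and a `HelmRelease` that installs the chart. The object names, namespaces, and intervals are illustrative assumptions, the chart version is the one quoted elsewhere in this thread, and the `apiVersion` values depend on the Flux version in use.

```yaml
# Chart source: where Flux fetches the jupyterhub chart from
apiVersion: source.toolkit.fluxcd.io/v1 # may differ on older Flux versions
kind: HelmRepository
metadata:
  name: jupyterhub # hypothetical object name
  namespace: flux-system
spec:
  interval: 1h # how often to check the chart repository for updates
  url: https://jupyterhub.github.io/helm-chart/
---
# Release: which chart version to install, where, and with which values
apiVersion: helm.toolkit.fluxcd.io/v2 # may differ on older Flux versions
kind: HelmRelease
metadata:
  name: jupyterhub # hypothetical object name
  namespace: flux-system
spec:
  interval: 10m # how often to reconcile the release against this spec
  targetNamespace: jupyterhub # namespace JupyterHub is deployed into
  install:
    createNamespace: true # create the target namespace if it is missing
  chart:
    spec:
      chart: jupyterhub
      version: "4.0.0-0.dev.git.6717.h61ab1167" # pin a chart version
      sourceRef:
        kind: HelmRepository
        name: jupyterhub
  values: {} # chart configuration that would otherwise live in values.yaml
```

Committing a change to this file, for example a new chart version or an entry under `values`, is what triggers Flux to update the existing deployment.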
@manics, are you suggesting that the documentation could potentially cause issues? I wouldn't expect that to be the case. Also, I agree with you about storing the plain manifests in a repository, whether it's mine or another. These manifests could serve as useful references. Moreover, having custom documentation alongside referring directly to the complete manifest could streamline the process, similar to how we handle the documentation and values.yaml file when deploying with Helm.
I don't think it'll cause issues, it's more that I think from a maintainability perspective it may be easier to have a separate repo with docs, manifests, and potentially CI workflows combined.
I think it could also be easier for readers: it's a lot easier to tightly integrate manifests and docs in their own repo, since it won't be constrained by the existing docs layout. If someone wants to reproduce your steps they could just clone the repo, which isn't so practical if you have to clone the whole Z2JH repo and search through subdirectories.
@manics, I see your point about the issue requiring a fundamental restructuring of the repository. Given this, I propose continuing with the current PR. As we develop the GitOps documentation, if we formulate a plan by then, we could consider a comprehensive overhaul of the existing repositories. Does this sound like a viable approach to you?
https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/3407 @consideRatio @manics, I worked on a basic install config. Expect more PRs with other GitOps tools and more configs in the time to come. Thanks!
In case someone needs inspiration for an argocd App definition (should work out of the box):
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: jupyterhub # name of the argocd object
  namespace: argocd # namespace where this manifest lives, not the app itself!
spec:
  project: jupyter # argocd project
  sources:
    # official helm chart source, values are self hosted
    - repoURL: https://jupyterhub.github.io/helm-chart/
      targetRevision: 4.0.0-0.dev.git.6717.h61ab1167 # helm chart version
      chart: jupyterhub
      helm:
        valueFiles: # supply values from some self hosted repo
          - $values/jupyterhub/helm/values.yaml # path inside self hosted repo
    # self referencing repo to inject values.yaml
    - repoURL: 'https://github.com/org/repo.git'
      targetRevision: main # git branch
      ref: values
    # extra yamls for additional resources such as an ingress definition
    - repoURL: 'https://github.com/org/repo.git'
      path: jupyterhub/k8s # path inside repo for other resources
      targetRevision: main # git branch
      directory: # all yaml files inside "jupyterhub/k8s"
        recurse: true
        include: "{*.yaml,*.yml}"
  destination:
    server: 'https://kubernetes.default.svc' # kubernetes cluster
    namespace: jupyterhub # deployment namespace for jupyterhub
  syncPolicy:
    syncOptions:
      - CreateNamespace=true # create destination kubernetes namespace
      - ServerSideApply=true # fix for metadata annotation being too long
    automated:
      selfHeal: true # auto sync and repair
      prune: true # delete resources once they are removed from the repo
---
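The manifest above sets `project: jupyter`, so it presumably relies on an ArgoCD project of that name already existing. A minimal sketch of such an `AppProject`, assuming the same repositories and destination as above; the description and the permissive cluster resource whitelist are illustrative assumptions that a real setup would likely tighten:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: jupyter # must match spec.project in the Application
  namespace: argocd
spec:
  description: JupyterHub deployments # hypothetical description
  # repositories that Applications in this project may deploy from
  sourceRepos:
    - https://jupyterhub.github.io/helm-chart/
    - https://github.com/org/repo.git
  # clusters and namespaces that Applications in this project may deploy to
  destinations:
    - server: https://kubernetes.default.svc
      namespace: jupyterhub
  # allow cluster-scoped resources, such as the Namespace created by
  # CreateNamespace=true; permissive on purpose for this sketch
  clusterResourceWhitelist:
    - group: "*"
      kind: "*"
```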
Hello JupyterHub team,
I've been exploring the current documentation and setup processes for JupyterHub on Kubernetes, primarily managed through Helm. This setup works well for basic deployments, but I've noticed a potential gap for large-scale, enterprise-grade deployments.
Many enterprise data science and engineering teams might prefer integrating JupyterHub with existing GitOps workflows, typically managed via FluxCD or ArgoCD, rather than directly using Helm for every change. This approach leverages their existing CI/CD pipelines and enhances maintainability and scalability.
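Given this, I propose expanding the documentation to include detailed guidance on integrating JupyterHub with FluxCD and ArgoCD. This enhancement will:
- Provide step-by-step instructions on setting up JupyterHub using FluxCD/ArgoCD for resource and configuration reconciliation.
- Include practical configurations for a multi-user, highly available JupyterHub environment suitable for enterprise-level deployment, especially those requiring substantial GPU resources.
- Offer comprehensive debugging documentation to assist teams in quickly resolving issues.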
I believe these additions will significantly streamline the setup process for large teams and institutions, reducing the overhead associated with integrating JupyterHub into large-scale infrastructure.
I am eager to contribute by drafting the documentation and configuration examples. Before proceeding, I'd like to gather feedback on this idea and any specific requirements or suggestions the community or maintainers might have.
Looking forward to your thoughts and hoping to contribute effectively to this amazing project!