jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Helm2 -> Kubernetes/Helm3 labeling practices transition #1867

Open consideRatio opened 3 years ago

consideRatio commented 3 years ago

In the past, Helm had its own best practices for how to label pods, but Kubernetes has since introduced a set of best-practice labels that Helm now recommends. These have been around for more than a year now. It would be nice to transition to the official k8s labeling system.

The crux is that it is a breaking change to modify these labels, at least those used as "matchLabels" in Deployment resources and in PodDisruptionBudgets. So, when we change these, we should change them all in one go to avoid multiple sequential breaking changes.

When we do make the breaking change, I think we need to either manually delete some resources or use helm --force, which I think is very dangerous in Helm 3. So, for this change to be made, we need:

  1. The change in this Helm chart to use the new labels.
  2. Upgrade instructions
  3. We may also want to update the labels set by KubeSpawner at the same time

Transition

| Old | New | Status | Description from Helm docs |
|-----|-----|--------|----------------------------|
| `app:` | `app.kubernetes.io/name` | REC | This should be the app name, reflecting the entire app. Usually `{{ template "name" . }}` is used for this. This is used by many Kubernetes manifests, and is not Helm-specific. |
| `chart:` | `helm.sh/chart` | REC | This should be the chart name and version: `{{ .Chart.Name }}-{{ .Chart.Version \| replace "+" "_" }}`. |
| `heritage:` | `app.kubernetes.io/managed-by` | REC | This should always be set to `{{ .Release.Service }}`. It is for finding all things managed by Helm. |
| `release:` | `app.kubernetes.io/instance` | REC | This should be the `{{ .Release.Name }}`. It aids in differentiating between different instances of the same application. |
| - | `app.kubernetes.io/version` | OPT | The version of the app; can be set to `{{ .Chart.AppVersion }}`. |
| `component:` | `app.kubernetes.io/component` | OPT | This is a common label for marking the different roles that pieces may play in an application. For example, `app.kubernetes.io/component: frontend`. |
| - | `app.kubernetes.io/part-of` | OPT | When multiple charts or pieces of software are used together to make one application. For example, application software and a database to produce a website. This can be set to the top level application being supported. |
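For concreteness, here is a minimal sketch of what a chart-level helper emitting the new labels could look like, roughly following the pattern Helm's docs and `helm create` suggest. The helper names (`jupyterhub.labels`, `jupyterhub.matchLabels`) and the exact label set are assumptions for illustration, not necessarily what this chart ends up with.

```yaml
{{- /* Sketch of a _helpers.tpl following the new labeling scheme.
       Helper names and label choices here are illustrative assumptions. */ -}}

{{- /* Labels stable enough to be used as selectors (matchLabels). */ -}}
{{- define "jupyterhub.matchLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{- /* Full label set applied to all resources. */ -}}
{{- define "jupyterhub.labels" -}}
{{ include "jupyterhub.matchLabels" . }}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
```

Individual resources would then add `app.kubernetes.io/component` (for example `hub`, `proxy`, `singleuser-server`) on top of these.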

KubeSpawner's matchLabels equivalent

KubeSpawner has historically just looked for a single label, component: singleuser-server, to track the resources it should work with. This should be updated though, so that KubeSpawner tracks resources based on app.kubernetes.io/name, app.kubernetes.io/instance, and app.kubernetes.io/component - combined, those make things unique, which is the typical practice for a Deployment resource's matchLabels, for example.
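As an illustration (the name/instance values below are placeholders, not the chart's actual rendered output), a user pod tracked this way would carry something like the following labels, and KubeSpawner would select on all three together instead of on `component: singleuser-server` alone:

```yaml
# Hypothetical labels on a spawned user pod under the new scheme; the
# values "jupyterhub" and "my-release" are placeholders.
metadata:
  labels:
    app.kubernetes.io/name: jupyterhub
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/component: singleuser-server
```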

Related

Implementation

I recently did an implementation along the lines of what I'd like us to do in this Helm chart. See https://github.com/yuvipanda/jupyterhub-ssh/pull/19/commits/729da98a2354796170597f48c5cd3eb55a889975 for an example of the kind of changes I want us to make.

droctothorpe commented 3 years ago

@consideRatio, how would you feel about adding all of the recommended new labels (using jupyterhub.labels) but leaving the legacy labels and selectors in place? This would ensure backwards compatibility and thereby make this a non-breaking change.

As a next step, perhaps both are present by default (to ensure backwards compatibility), but there's a values.yaml toggle to disable the old ones, something along the lines of legacyLabels: true/false.
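As a rough sketch of that idea (the `legacyLabels` value and the helper name are placeholders, and e.g. Deployment selectors would have to stay untouched either way, since selector fields are immutable):

```yaml
{{- /* Sketch only: "legacyLabels" (defaulting to true in values.yaml) and the
       helper name are placeholders for the idea of emitting both label sets. */ -}}
{{- define "jupyterhub.labels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
{{- if .Values.legacyLabels }}
app: {{ .Chart.Name }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
{{- end }}
{{- end }}
```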

I can take a shot at one or both (or other -- I defer to you) approaches and add it to #1866 if it helps.

consideRatio commented 3 years ago

If you influence matchLabels you make a change which can cause trouble, but just adding labels alongside, not targeted by matchLabels, is maybe okay - but, I'd prefer to not do an intermediary change like this unless it benefits the migration towards the new label system.

The bigger part of the work with regard to this issue is to validate an approach to take and conclude what will break and what won't. Preliminary work on that includes looking at changes for various kinds of k8s resources such as Deployments / StatefulSets / DaemonSets, then looking at PodDisruptionBudgets, Services, NetworkPolicies, etc.

I think matchLabels within Deployments etc., and in PodDisruptionBudgets, are the trickiest to manage during the Helm upgrade process. I don't have the knowledge to suggest a technical solution, other than to say that we must do a very good job of avoiding introducing breaking changes, which I think can be tricky. I bet there are things to learn from other Helm charts on how they have handled this transition as well.
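To illustrate the PodDisruptionBudget half of that (an illustrative manifest, not the chart's actual one): if pods start carrying the new labels while a PDB keeps selecting on the old ones, the PDB quietly stops covering any pods, so its selector has to be switched in the same release as the pod labels.

```yaml
# Illustrative only: a PDB whose selector still uses the old-style label while
# newly created pods only carry app.kubernetes.io/* labels would match nothing.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: user-placeholder
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      component: user-placeholder  # old-style label; relabeled pods would instead
                                   # carry app.kubernetes.io/component: user-placeholder
```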

beaniebag commented 3 years ago

Wonder if this is related. I ran into an issue with Rancher 2.x and deploying jupyterhub using Helm v3.

Rancher was adding the label io.cattleField.appid=jupyterhub to all the deployments in the Helm chart. Notably, the singleuser network policy would add io.cattleField.appid=jupyterhub as a required label in its podSelector, leading to the singleuser pod being unable to communicate with the hub. The problem is that this only applied to the resources rendered by the Helm chart: once KubeSpawner created a new singleuser pod, that pod would be missing the label.
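To illustrate the mismatch described above (a sketch, not the chart's actual manifest): the rendered NetworkPolicy ends up requiring the Rancher-injected label, while pods later created by KubeSpawner only carry the chart's own labels and therefore never match it.

```yaml
# Sketch of the situation described above, not the chart's actual manifest:
# Rancher injects its label into the rendered podSelector, but pods created
# later by KubeSpawner only get the chart's own labels, so they never match.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: singleuser
spec:
  podSelector:
    matchLabels:
      component: singleuser-server
      io.cattleField.appid: jupyterhub  # injected by Rancher, absent on spawned pods
  # ingress/egress rules omitted
```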

I've raised this issue with Rancher, but for now I've fallen back to using Helm v2.

consideRatio commented 5 months ago

If we update the matchLabels selectors of these resources etc., we will end up needing to provide instructions in a major release saying that one should kubectl delete a few resources before the upgrade to make it not fail (a sketch of such instructions follows the error output below).

Error: UPGRADE FAILED:
cannot patch "continuous-image-puller" with kind DaemonSet: DaemonSet.apps "continuous-image-puller" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"continuous-image-puller", "app.kubernetes.io/instance":"jupyterhub", "app.kubernetes.io/name":"jupyterhub"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "hub" with kind Deployment: Deployment.apps "hub" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"hub", "app.kubernetes.io/instance":"jupyterhub", "app.kubernetes.io/name":"jupyterhub"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "autohttps" with kind Deployment: Deployment.apps "autohttps" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"autohttps", "app.kubernetes.io/instance":"jupyterhub", "app.kubernetes.io/name":"jupyterhub"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "proxy" with kind Deployment: Deployment.apps "proxy" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"proxy", "app.kubernetes.io/instance":"jupyterhub", "app.kubernetes.io/name":"jupyterhub"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "user-scheduler" with kind Deployment: Deployment.apps "user-scheduler" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"user-scheduler", "app.kubernetes.io/instance":"jupyterhub", "app.kubernetes.io/name":"jupyterhub"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "user-placeholder" with kind StatefulSet: StatefulSet.apps "user-placeholder" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'ordinals', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
manics commented 5 months ago

Even with a two-stage update (over two major versions), some people won't have updated, so we'll still need some documentation for people who skip a major release.