Closed nprokopic closed 3 months ago
We want to be able to create, test and deliver releases quickly and efficiently, and we want each team to be able to own the entire workflow, from development through testing to release.
Excerpt from the RFC:
We deploy workload clusters by gluing together multiple components. Some of those are:
- Provider-independent Cluster API resources (e.g. Cluster, MachineDeployment/MachinePool, etc.),
- Provider-specific Cluster API resources (e.g. AWSCluster, AzureCluster, VSphereCluster, VCDCluster),
- CPI implementation (aka provider-specific cloud controller manager),
- CNI (e.g. Cilium),
- CSI,
- upstream apps that we package,
- our apps that we develop and package,
- provider-independent and provider-specific default configuration of apps,
- configuration of the operating system and different node components, such as systemd, containerd, etc.
Multiple teams and many people are continuously working on all of the above, so it is essential that all of them have a smooth and frictionless development, testing and release experience. That lets us increase deployment frequency, reduce lead time for changes and reduce the change failure rate. For this to work, we need to be able to develop, test and release almost every change independently of almost all other changes. The team that worked on a change should be able to release it fully independently, without any intervention from the provider-integration or provider-independent KaaS teams.
Here is what the releases look like: https://github.com/giantswarm/releases/pull/1262.
They are immutable, as in vintage. Every new version of any app or component requires a new release.
apiVersion: release.giantswarm.io/v1alpha1
kind: Release
metadata:
  name: v25.0.0-alpha.1
spec:
  apps:
  - name: aws-ebs-csi-driver
    version: 2.30.1
    dependsOn:
    - cloud-provider-aws
  - name: aws-pod-identity-webhook
    version: 1.14.2
    dependsOn:
    - cert-manager
  - name: capi-node-labeler
    version: 0.5.0
  - name: cert-exporter
    version: 2.9.0
    dependsOn:
    - kyverno
  - name: cert-manager
    version: 3.7.5
    dependsOn:
    - prometheus-operator-crd
  - name: chart-operator-extensions
    version: 1.1.2
    dependsOn:
    - prometheus-operator-crd
  - name: cilium
    version: 0.24.0
  - name: cilium-crossplane-resources
    version: 0.1.0
  - name: cilium-servicemonitors
    version: 0.1.2
    dependsOn:
    - prometheus-operator-crd
  - name: cloud-provider-aws
    version: 1.25.14-gs2
    dependsOn:
    - vertical-pod-autoscaler-crd
  - name: cluster-autoscaler
    version: 1.27.3-gs9
    dependsOn:
    - kyverno
  - name: coredns
    version: 1.21.0
  - name: external-dns
    version: 3.1.0
    dependsOn:
    - prometheus-operator-crd
  - name: metrics-server
    version: 2.4.2
    dependsOn:
    - kyverno
  - name: net-exporter
    version: 1.19.0
    dependsOn:
    - prometheus-operator-crd
  - name: network-policies
    version: 0.1.0
    catalog: cluster
  - name: node-exporter
    version: 1.19.0
    dependsOn:
    - kyverno
  - name: vertical-pod-autoscaler
    version: 5.1.0
    dependsOn:
    - prometheus-operator-crd
  - name: vertical-pod-autoscaler-crd
    version: 3.0.0
  - name: etcd-k8s-res-count-exporter
    version: 1.10.0
    dependsOn:
    - kyverno
  - name: observability-bundle
    version: 1.3.4
    dependsOn:
    - coredns
  - name: k8s-dns-node-cache
    version: 2.6.1
    dependsOn:
    - kyverno
  - name: security-bundle
    version: 1.6.5
    catalog: giantswarm
    dependsOn:
    - prometheus-operator-crd
  - name: teleport-kube-agent
    version: 0.9.0
  components:
  - name: cluster-aws
    version: 0.76.1-b76af2c26f4224ffb0d718e940e232fac05c89a0
  - name: flatcar
    version: 3815.2.2
  - name: flatcar-variant
    version: 1.0.0
  - name: kubernetes
    version: 1.25.16
  date: "2024-05-18T12:57:50Z"
  state: active
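The dependsOn entries in the manifest above form a dependency graph. The excerpt does not spell out their exact semantics, so the following is only a minimal sketch, assuming that dependsOn means "install this app after the apps it names". The `install_order` helper and the `DEPENDS_ON` table are hypothetical names used for illustration, and only a subset of the apps is copied in.

```python
from graphlib import TopologicalSorter

# Subset of spec.apps from the v25.0.0-alpha.1 manifest above:
# app name -> names listed under its dependsOn.
DEPENDS_ON = {
    "aws-ebs-csi-driver": ["cloud-provider-aws"],
    "aws-pod-identity-webhook": ["cert-manager"],
    "cert-manager": ["prometheus-operator-crd"],
    "cloud-provider-aws": ["vertical-pod-autoscaler-crd"],
    "vertical-pod-autoscaler-crd": [],
    "cilium": [],
}


def install_order(depends_on):
    """Return app names so that every app comes after its in-release dependencies."""
    # Assumption: a dependency that is not itself an entry in the table
    # (prometheus-operator-crd here) is delivered some other way, so it is skipped.
    graph = {
        app: [dep for dep in deps if dep in depends_on]
        for app, deps in depends_on.items()
    }
    # static_order() yields dependencies first and raises CycleError on a cycle.
    return list(TopologicalSorter(graph).static_order())


order = install_order(DEPENDS_ON)
print(order)
```

Several valid orders exist, so the sketch guarantees only the relative position of each app and its dependencies, not one specific sequence.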
---
apiVersion: release.giantswarm.io/v1alpha1
kind: Release
metadata:
  name: v26.0.0-alpha.1
spec:
  apps:
  - name: aws-ebs-csi-driver
    version: 2.30.1
    dependsOn:
    - cloud-provider-aws
  - name: aws-pod-identity-webhook
    version: 1.14.2
    dependsOn:
    - cert-manager
  - name: capi-node-labeler
    version: 0.5.0
  - name: cert-exporter
    version: 2.9.0
    dependsOn:
    - kyverno
  - name: cert-manager
    version: 3.7.5
    dependsOn:
    - prometheus-operator-crd
  - name: chart-operator-extensions
    version: 1.1.2
    dependsOn:
    - prometheus-operator-crd
  - name: cilium
    version: 0.24.0
  - name: cilium-crossplane-resources
    version: 0.1.0
  - name: cilium-servicemonitors
    version: 0.1.2
    dependsOn:
    - prometheus-operator-crd
  - name: cloud-provider-aws
    version: 1.26.11-gs.alpha.1
    dependsOn:
    - vertical-pod-autoscaler-crd
  - name: cluster-autoscaler
    version: 1.27.3-gs9
    dependsOn:
    - kyverno
  - name: coredns
    version: 1.21.0
  - name: external-dns
    version: 3.1.0
    dependsOn:
    - prometheus-operator-crd
  - name: metrics-server
    version: 2.4.2
    dependsOn:
    - kyverno
  - name: net-exporter
    version: 1.19.0
    dependsOn:
    - prometheus-operator-crd
  - name: network-policies
    version: 0.1.0
    catalog: cluster
  - name: node-exporter
    version: 1.19.0
    dependsOn:
    - kyverno
  - name: vertical-pod-autoscaler
    version: 5.2.2
    dependsOn:
    - prometheus-operator-crd
  - name: vertical-pod-autoscaler-crd
    version: 3.1.0
  - name: etcd-k8s-res-count-exporter
    version: 1.10.0
    dependsOn:
    - kyverno
  - name: observability-bundle
    version: 1.3.4
    dependsOn:
    - coredns
  - name: k8s-dns-node-cache
    version: 2.6.2
    dependsOn:
    - kyverno
  - name: security-bundle
    version: 1.6.5
    catalog: giantswarm
    dependsOn:
    - prometheus-operator-crd
  - name: teleport-kube-agent
    version: 0.9.0
  components:
  - name: cluster-aws
    version: 0.76.1-b76af2c26f4224ffb0d718e940e232fac05c89a0
  - name: flatcar
    version: 3815.2.2
  - name: flatcar-variant
    version: 1.0.0
  - name: kubernetes
    version: 1.26.15
  date: "2024-05-18T12:57:50Z"
  state: active
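The two manifests above differ in only five entries: cloud-provider-aws, vertical-pod-autoscaler, vertical-pod-autoscaler-crd, k8s-dns-node-cache and the kubernetes component. Because releases are immutable, such a difference is the whole content of an upgrade. A minimal sketch of how it could be computed, assuming each release has already been flattened into a name-to-version mapping (the `changed_versions` helper is a hypothetical name, and only the differing entries plus two unchanged ones are copied in):

```python
def changed_versions(old, new):
    """Map each name whose version differs to an (old, new) pair.

    A name present on only one side is reported with None on the other.
    """
    names = sorted(set(old) | set(new))
    return {
        name: (old.get(name), new.get(name))
        for name in names
        if old.get(name) != new.get(name)
    }


# Versions copied from the v25.0.0-alpha.1 and v26.0.0-alpha.1 manifests above.
V25 = {
    "cilium": "0.24.0",
    "cloud-provider-aws": "1.25.14-gs2",
    "flatcar": "3815.2.2",
    "k8s-dns-node-cache": "2.6.1",
    "kubernetes": "1.25.16",
    "vertical-pod-autoscaler": "5.1.0",
    "vertical-pod-autoscaler-crd": "3.0.0",
}
V26 = {
    "cilium": "0.24.0",
    "cloud-provider-aws": "1.26.11-gs.alpha.1",
    "flatcar": "3815.2.2",
    "k8s-dns-node-cache": "2.6.2",
    "kubernetes": "1.26.15",
    "vertical-pod-autoscaler": "5.2.2",
    "vertical-pod-autoscaler-crd": "3.1.0",
}

diff = changed_versions(V25, V26)
for name, (before, after) in diff.items():
    print(f"{name}: {before} -> {after}")
```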
Releases are added to the releases repo via PRs. After a PR is merged, a CI job pushes the releases to the provider-specific app collections, so they end up on the management clusters (MCs), where we can list them:
kubectl get release
NAME              KUBERNETES VERSION   STATE    AGE   READY   INUSE
v25.0.0-alpha.1   1.25.16              active   10d
v26.0.0-alpha.1   1.26.15              active   10d
v27.0.0-alpha.1   1.27.14              active   10d
cluster-$provider app manifest looks like this:
---
apiVersion: v1
data:
  values: |
    global:
      release:
        version: 25.0.0-alpha.1
      connectivity:
        availabilityZoneUsageLimit: 3
      metadata:
        description: Releases POC v25
        name: v25nik02
        organization: nikola
        annotations:
          alpha.giantswarm.io/ignore-cluster-deletion: "true"
      nodePools:
        nodepool0:
          instanceType: m5.xlarge
          maxSize: 10
          minSize: 3
          rootVolumeSizeGB: 8
kind: ConfigMap
metadata:
  creationTimestamp: null
  labels:
    giantswarm.io/cluster: v25nik02
  name: v25nik02-userconfig
  namespace: org-nikola
---
apiVersion: application.giantswarm.io/v1alpha1
kind: App
metadata:
  labels:
    app-operator.giantswarm.io/version: 0.0.0
  name: v25nik02
  namespace: org-nikola
spec:
  catalog: cluster-test
  kubeConfig:
    inCluster: true
  name: cluster-aws
  namespace: org-nikola
  userConfig:
    configMap:
      name: v25nik02-userconfig
      namespace: org-nikola
  version: ""
Release version is set in user Helm values, see this part in the ConfigMap above:
apiVersion: v1
data:
  values: |
    global:
      release:
        version: 25.0.0-alpha.1
      # ...
App version is left empty:
apiVersion: application.giantswarm.io/v1alpha1
kind: App
# ...
spec:
  version: ""
Cluster upgrade is done by updating release version, e.g. from this:
---
apiVersion: v1
data:
  values: |
    global:
      release:
        version: 25.0.0-alpha.1
      # ...
To this:
---
apiVersion: v1
data:
  values: |
    global:
      release:
        version: 26.0.0-alpha.1
      # ...
This triggers re-rendering of the cluster-$provider app, and the versions from the new release get applied.
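The lookup behind this upgrade step can be sketched as follows. This is an illustration only, not the actual chart logic: it assumes the Release name is the Helm value prefixed with "v" (the manifests use v25.0.0-alpha.1 while the values use 25.0.0-alpha.1), and `RELEASES` and `resolve_versions` are hypothetical names holding a small subset of the versions shown earlier.

```python
# Hypothetical in-memory view of the Release objects on the MC
# (subset of the versions from the manifests shown earlier).
RELEASES = {
    "v25.0.0-alpha.1": {
        "kubernetes": "1.25.16",
        "cloud-provider-aws": "1.25.14-gs2",
    },
    "v26.0.0-alpha.1": {
        "kubernetes": "1.26.15",
        "cloud-provider-aws": "1.26.11-gs.alpha.1",
    },
}


def resolve_versions(values, releases):
    """Return the versions pinned by the release named in the Helm values."""
    version = values["global"]["release"]["version"]
    # Assumption: Release objects are named "v" + the value from the Helm values.
    name = "v" + version
    if name not in releases:
        raise KeyError(f"release {name} not found")
    return releases[name]


values = {"global": {"release": {"version": "25.0.0-alpha.1"}}}
before = resolve_versions(values, RELEASES)

# The upgrade is a single change to the user values.
values["global"]["release"]["version"] = "26.0.0-alpha.1"
after = resolve_versions(values, RELEASES)

print(before["kubernetes"], "->", after["kubernetes"])
```

Since the only user-facing input is the release version, every other version change follows from the lookup rather than from edits to individual apps.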
For a cluster whose App still pins a version, the upgrade is done by clearing App.spec.version (setting it to an empty string) and setting the release version in the Helm values.
So change from this:
apiVersion: application.giantswarm.io/v1alpha1
kind: App
# ...
spec:
  version: 0.76.1
to this:
---
apiVersion: v1
data:
  values: |
    global:
      release:
        version: 25.0.0-alpha.1
      # ...
---
apiVersion: application.giantswarm.io/v1alpha1
kind: App
# ...
spec:
  version: ""
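The change shown above (clearing App.spec.version and setting the release version in the Helm values) can be sketched as a pure function over the two manifests. This is a minimal sketch under stated assumptions: both manifests are already parsed into dictionaries, the values are the parsed content of the ConfigMap's `values` key, and `migrate_to_release` is a hypothetical helper name.

```python
import copy


def migrate_to_release(app, values, release_version):
    """Return copies of the App and Helm values switched to a release.

    The inputs are left untouched so the old state stays available for comparison.
    """
    new_app = copy.deepcopy(app)
    new_values = copy.deepcopy(values)
    # The app version is left empty, as in the manifests above.
    new_app.setdefault("spec", {})["version"] = ""
    # The release version goes into the user Helm values instead.
    release = new_values.setdefault("global", {}).setdefault("release", {})
    release["version"] = release_version
    return new_app, new_values


old_app = {"kind": "App", "spec": {"name": "cluster-aws", "version": "0.76.1"}}
old_values = {"global": {"metadata": {"name": "v25nik02"}}}

new_app, new_values = migrate_to_release(old_app, old_values, "25.0.0-alpha.1")
print(new_app["spec"]["version"] == "", new_values["global"]["release"]["version"])
```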
Closing this in favour of https://github.com/giantswarm/roadmap/issues/3473 and https://github.com/giantswarm/roadmap/issues/3475