Closed dduportal closed 8 months ago
To keep in mind: https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
To ensure the separation between the open source version of Kubernetes and those versions that are customized by services providers [...], the open source community is requiring that all provider-specific code that currently exists in the OSS code base be removed starting with v1.26.
kubectl 1.26.9 is now used in our system (client side).
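Since the client was bumped before the clusters, a quick check that kubectl stays within the upstream Kubernetes skew policy (one minor version older or newer than the API server) can be useful. The sketch below is only an illustration: the function names `parse_minor` and `kubectl_within_skew` are hypothetical and not part of our tooling, and the version strings are passed in by hand.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from strings such as 'v1.26.9' or '1.26.9-do.0'."""
    # Drop a leading 'v' and any provider suffix such as '-do.0'.
    core = version.lstrip("v").split("-", 1)[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def kubectl_within_skew(client: str, server: str, max_skew: int = 1) -> bool:
    """True when the client minor version is within max_skew of the server's.

    Assumption: max_skew=1 mirrors the upstream kubectl skew policy.
    """
    c_major, c_minor = parse_minor(client)
    s_major, s_minor = parse_minor(server)
    return c_major == s_major and abs(c_minor - s_minor) <= max_skew
```

For example, `kubectl_within_skew("1.26.9", "1.25.14")` holds while a cluster is still on 1.25, and `kubectl_within_skew("1.26.9", "1.26.9-do.0")` holds after the upgrade.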
Update:
(doks-public and doks)
TODO:
Update on DigitalOcean upgrade:
Your cluster has successfully upgraded to version 1.26.9-do.0.
Your cluster has successfully upgraded to version 1.26.9-do.0.
main branch after the upgrades:
No changes. Your infrastructure matches the configuration.
node('doks && maven') { sh 'mvn -v' }
Post Mortem
The upgrade itself went fine but it had to be done through the DigitalOcean UI: the 2 terraform PRs did not show any change in their plans because of https://github.com/jenkins-infra/digitalocean/pull/148 which tells Terraform to ignore the version changes.
Rolling back this change makes every terraform plan fail (as explained in the PR) because of the way digitalocean_kubernetes_cluster and data.digitalocean_kubernetes_cluster are linked to the kubernetes providers in charge of managing the CSI and admin service accounts.
As such, the upgrade procedure is amended to the following workflow:
upstream/main as a sanity check to ensure no changes or cluster destruction are planned
EKS changelogs:
TL;DR:
1.12 or later before the upgrade
fsGroup option. We don't really care as only our ACP services use persistent volumes in EKS, but it is worth keeping in mind
Update on AWS EKS upgrade:
main branch after the upgrades - https://infra.ci.jenkins.io/job/terraform-jobs/job/aws/job/main/436/
node('cik8s && maven') { sh 'mvn -v' } and node('bom && maven') { sh 'mvn -v' }
Post Mortem
Update: AKS Upgrade plan
Current changelog notable elements for Kubernetes 1.26:
Some AKS labels are being deprecated with the Kubernetes 1.26 release. Update your AKS labels to the recommended substitutions. See more information on label deprecations and how to update your labels in the Use labels in an AKS cluster documentation. beta.kubernetes.io/arch= and beta.kubernetes.io/os= are still applied by kubelet in kubernetes code
HostProcess Containers will be GA
Two in-tree driver persistent volumes won't be supported in AKS : kubernetes.io/azure-disk, kubernetes.io/azure-file.
All AKS clusters on version 1.26+ will use the latest coreDNS version v1.10.1.
(see below)
During cluster upgrade to v1.26.0 or a later version, disk PV node affinity check will cause the upgrade to fail if there are disk PVs still using deprecated labels: failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region
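The disk PV check in the last changelog item can be anticipated before starting the upgrade. The sketch below is a minimal illustration, assuming the PV list has been exported with `kubectl get pv -o json` and parsed into a dict; the function name `pvs_with_deprecated_labels` is hypothetical, and it looks at both the PV labels and the node affinity expressions, which may be broader than what AKS actually checks.

```python
DEPRECATED_LABELS = {
    "failure-domain.beta.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/region",
}


def pvs_with_deprecated_labels(pv_list: dict) -> dict[str, list[str]]:
    """Map each PV name to the deprecated label keys it still references."""
    offenders: dict[str, list[str]] = {}
    for pv in pv_list.get("items", []):
        metadata = pv.get("metadata") or {}
        # Deprecated keys set directly as labels on the PV object.
        found = set(metadata.get("labels") or {}) & DEPRECATED_LABELS
        # Deprecated keys used in the PV node affinity selector terms.
        affinity = (pv.get("spec") or {}).get("nodeAffinity") or {}
        terms = (affinity.get("required") or {}).get("nodeSelectorTerms") or []
        for term in terms:
            for expr in term.get("matchExpressions") or []:
                if expr.get("key") in DEPRECATED_LABELS:
                    found.add(expr["key"])
        if found:
            offenders[metadata.get("name", "<unnamed>")] = sorted(found)
    return offenders
```

An empty result would suggest that no PV references the deprecated zone or region labels; any PV listed would need to be looked at before upgrading to 1.26.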
privatek8s
terraform plan
Post Mortem for privatek8s
MC_... resource group in https://github.com/jenkins-infra/helpdesk/issues/3582#issuecomment-1629210833 (last Kubernetes upgrade).
publick8s ...)
publick8s operation
Update:
privatek8s
MC_prod-privatek8s_privatek8s-emerging-ram_eastus2 (managed by the privatek8s) has 2 objects of type Public IP used by the cluster:
public-privatek8s
prod-public-ips
MC_prod-privatek8s_privatek8s-emerging-ram_eastus2 to this new RG
publick8s control plane 🤔
MC_prod-privatek8s_privatek8s-emerging-ram_eastus2 to this new RG
(kubectl -n public-nginx-ingress edit svc public-nginx-ingress-ingress-nginx-controller) to:
service.beta.kubernetes.io/azure-load-balancer-ipv4 annotation in favor of service.beta.kubernetes.io/azure-pip-name as recommended by the documentation (introduced way after our initial IPv6 implementation)
service.beta.kubernetes.io/azure-load-balancer-resource-group annotation
privatek8s cluster showed warning messages in its events (kubectl -n public-nginx-ingress describe svc public-nginx-ingress-ingress-nginx-controller) about missing permissions .../join on the Public IP
prod-public-ips, updated the public IP public-privatek8s + its lock resource and added the new role assignment
TODO:
publick8
publick8s which will consist of:
publick8s to ensure it works on arm64
publick8s to ensure we can validate the webhook admission bump
publick8s: MC_publick8s... to prod-public-ips resource group (and manually delete their locks)
public-publick8s-ipv4
public-publick8s-ipv6
ldap-jenkins-io-ipv4
public-publick8s-ipv4
public-publick8s-ipv6
ldap-jenkins-io-ipv4
Post Mortem
Mandatory logo
@smerle33 I'll let you close this issue with a mandatory gif or image ;)
Previous upgrade (1.25): https://github.com/jenkins-infra/helpdesk/issues/3582
Deprecation timelines for 1.25 (justifying the upgrade to 1.26):
Task list:
[x] Upgrade kubectl within docker-helmfile
[x] Upgrade DOKS (doks and doks-public) - see https://github.com/jenkins-infra/helpdesk/issues/3683#issuecomment-1780857936 below
[x] Upgrade AWS EKS (cik8s and eks-public) - see https://github.com/jenkins-infra/helpdesk/issues/3683#issuecomment-1794913444
[x] Upgrade AKS (privatek8s and publick8s) - https://github.com/jenkins-infra/helpdesk/issues/3683#issuecomment-1797928819