jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

Upgrade to Kubernetes 1.25 #3582

Closed dduportal closed 1 year ago

dduportal commented 1 year ago

Previous upgrade: https://github.com/jenkins-infra/helpdesk/issues/3387

github-actions[bot] commented 1 year ago

Take a look at these similar issues to see if there isn't already a response to your problem:

  1. 92% #3053
  2. 92% #2930
  3. 92% #2866
  4. 77% #2664
dduportal commented 1 year ago

Updating kubectl:

dduportal commented 1 year ago

Since we disabled doks following the DigitalOcean outage of 21 June 2023, we are taking the opportunity to upgrade both DigitalOcean clusters to 1.25 before putting them back into use.

Task list for both DigitalOcean clusters

dduportal commented 1 year ago

Next step: upgrade of the AWS EKS clusters (including upgrade of components)

dduportal commented 1 year ago

AKS upgrade:

Gaoithe commented 1 year ago

There is an outage: http://get.jenkins.io/ (13:29:49 UTC, Friday, 7 July 2023). Okay, I see you are aware of the outage. Good luck, I hope you can fix it and recover without extreme stress. Thank you!

dduportal commented 1 year ago

All the public services should be back. We are working on finishing the 1.25 post-upgrade steps, and we'll publish a post-mortem next week.

dduportal commented 1 year ago

Sub-tasks left before closing this issue:

timja commented 1 year ago

> which transitively removes the "automatic" resource group where the public_ip must be

not true, set this label: service.beta.kubernetes.io/azure-load-balancer-resource-group: myNetworkResourceGroup

https://learn.microsoft.com/en-us/azure/aks/static-ip#create-a-service-using-the-static-ip-address

lemeurherve commented 1 year ago

> > which transitively removes the "automatic" resource group where the public_ip must be
>
> not true, set this label: service.beta.kubernetes.io/azure-load-balancer-resource-group: myNetworkResourceGroup
>
> https://learn.microsoft.com/en-us/azure/aks/static-ip#create-a-service-using-the-static-ip-address

Thanks! We'll create a test IP in a resource group to check whether we can safely move IPs to another resource group without recreating them. Then we'll move the prod IPs to a dedicated resource group (instead of the cluster node resource group) and add the label to the affected services.
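
For illustration only, here is a rough sketch of what such a Service definition might look like once the public IP lives outside the cluster node resource group. It builds the manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. The linked Azure documentation describes the `service.beta.kubernetes.io/azure-load-balancer-resource-group` key as a Service annotation, so it is placed under `metadata.annotations` here. The service name, selector, port and IP address are hypothetical placeholders, not values from the Jenkins infrastructure. The sketch also assumes `spec.loadBalancerIP` is still used to pin the address, although that field has been deprecated since Kubernetes 1.24.

```python
import json

# Key quoted in the comment above; the Azure documentation describes it
# as a Service annotation, so it goes under metadata.annotations.
LB_RESOURCE_GROUP_ANNOTATION = (
    "service.beta.kubernetes.io/azure-load-balancer-resource-group"
)


def build_service_manifest(name, public_ip, ip_resource_group, port=80):
    """Return a LoadBalancer Service manifest as a plain dict.

    All argument values are supplied by the caller; nothing here is
    taken from the real Jenkins infrastructure.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Tells the Azure cloud provider which resource group
            # holds the pre-created public IP.
            "annotations": {LB_RESOURCE_GROUP_ANNOTATION: ip_resource_group},
        },
        "spec": {
            "type": "LoadBalancer",
            # Assumption: the static IP is pinned with loadBalancerIP
            # (deprecated since Kubernetes 1.24, but still accepted).
            "loadBalancerIP": public_ip,
            "selector": {"app": name},  # hypothetical pod label
            "ports": [{"port": port}],
        },
    }


# Hypothetical values: a documentation-range IP and the resource group
# name used in the comment above.
manifest = build_service_manifest(
    name="example-public-service",
    public_ip="203.0.113.10",
    ip_resource_group="myNetworkResourceGroup",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied against a test cluster first, which matches the plan of validating with a test IP before touching the prod IPs.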

dduportal commented 1 year ago

Last step: https://github.com/jenkins-infra/helpdesk/issues/3683

dduportal commented 1 year ago

😱 Forgot the 1.25 logo:

[image: Kubernetes 1.25 logo]

Ref. https://kubernetes.io/blog/2022/08/23/kubernetes-v1-25-release/