terraform-google-modules / terraform-google-kubernetes-engine

Configures opinionated GKE clusters
https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google
Apache License 2.0
1.15k stars 1.17k forks

Unable to upgrade node pool as part of control plane upgrade #2055

Closed: koushikgongireddy closed this 1 month ago

koushikgongireddy commented 3 months ago

TL;DR

Unable to upgrade node pool as part of control plane upgrade

Expected behavior

When we perform a control plane upgrade, the node pool upgrade should also be taken care of once the control plane is upgraded.

Observed behavior

Only the control plane is upgraded, not the node pool. The node pool is upgraded 2-3 days later rather than right away.

Terraform Configuration

We set auto upgrade to true:

variable "auto_upgrade_default_node" {
  type        = bool
  description = "Auto update default node"
  default     = true
}

The maintenance exclusions have also expired, so the upgrade should be performed right away (we did this on Aug 23rd):

variable "maintenance_exclusions" {
  type    = list(object({ name = string, start_time = string, end_time = string, exclusion_scope = string }))
  default = [{ name = "blackout-07", end_time = "2024-07-10T17:49:32Z", start_time = "2024-07-09T17:49:32Z", exclusion_scope = "NO_MINOR_OR_NODE_UPGRADES" }]
}

Here is the module we are using:

module "gke" {
  source = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster-update-variant"

  version = "30.0.0"

  project_id                  = var.project_id
  kubernetes_version          = var.kubernetes_version
  .
  .
  .
  node_pools = [
    {
      name               = "${var.network_name}-defnp01"
      version            = var.kubernetes_version
      machine_type       = var.machine_type

Terraform Version

tf - 1.5.5

Additional information

No response

koushikgongireddy commented 3 months ago

@apeabody @morgante Can you please take a look and see what's missing? Or is this the default behavior?

apeabody commented 3 months ago

Hi @koushikgongireddy - Thanks for reaching out!

I believe that with auto-upgrade the node pools are scheduled for upgrade based on criteria, not immediately upgraded. You can view/verify this schedule with gcloud container operations list. More details can be found at: https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades

koushikgongireddy commented 3 months ago

@apeabody So there is no way to automatically upgrade the node pool from Terraform once the control plane upgrade is done?

Do we just need to wait until GCP triggers the node upgrade? The problem is that we are not sure when that happens; it may take 1 day, 2 days, or 1 week.

Can you let us know if there is any alternative way to achieve this?

apeabody commented 3 months ago

Hi @koushikgongireddy - Control planes are compatible with nodes up to two minor versions older than the control plane[1], and node auto-upgrades ensure that your cluster's control plane and node versions remain in compliance during the lifecycle.

However, should you wish to manually configure the minimum node pool version, this can be done using the terraform-google-kubernetes-engine module's min_master_version parameter or the google_container_node_pool resource's version parameter. These will likely result in replacement rather than an upgrade, though, so I would suggest using gcloud[2] or the console[2] to trigger node pool upgrades.

  1. https://cloud.google.com/kubernetes-engine/docs/how-to/upgrading-a-cluster#upgrading-nodes
  2. https://cloud.google.com/kubernetes-engine/docs/how-to/upgrading-a-cluster#upgrade_nodes
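For illustration, pinning the version on a standalone google_container_node_pool resource might look roughly like the sketch below. The resource label, pool name, and the var.cluster_name and var.region variables are hypothetical placeholders, not taken from this thread. The provider documentation discourages setting version together with auto_upgrade, since the two can fight over the node version, so auto-upgrade is disabled here.

```hcl
# Minimal sketch, not a tested configuration. Names marked "hypothetical"
# are placeholders and do not come from the issue.
resource "google_container_node_pool" "pinned" { # hypothetical label
  name     = "pinned-pool"    # hypothetical
  project  = var.project_id
  cluster  = var.cluster_name # hypothetical variable
  location = var.region       # hypothetical variable

  node_count = 1

  # Node version managed from Terraform instead of by GKE.
  version = var.kubernetes_version

  management {
    auto_repair  = true
    auto_upgrade = false # avoid version and auto_upgrade fighting each other
  }

  node_config {
    machine_type = var.machine_type
  }
}
```

Running terraform plan after changing var.kubernetes_version should show whether the change is applied in place or by replacing the pool.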

koushikgongireddy commented 1 month ago

With auto upgrade set to false, we have more control over node pool upgrades from Terraform instead of handing control over to GCP.
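A minimal sketch of what that can look like with the module from the original report. It assumes the module's node_pools entries accept an auto_upgrade key alongside version, and that the cluster is not enrolled in a release channel that requires node auto-upgrade. Inputs not shown in this thread are omitted rather than guessed.

```hcl
# Sketch only: the remaining required module inputs are omitted.
module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster-update-variant"
  version = "30.0.0"

  project_id         = var.project_id
  kubernetes_version = var.kubernetes_version
  # ... other inputs omitted ...

  node_pools = [
    {
      name         = "${var.network_name}-defnp01"
      machine_type = var.machine_type

      # Assumption: per-pool auto_upgrade key. With auto-upgrade off,
      # the pool's version below is driven by Terraform.
      auto_upgrade = false
      version      = var.kubernetes_version
    }
  ]
}
```

With this shape, bumping var.kubernetes_version changes the control plane and node pool versions in the same plan. Review the plan output before applying to see how the node pool change is carried out.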

Thanks for the help