hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

GKE autoscaling does not create nodes in node pool without node_count defined #11594

Open densnoigaskogen opened 2 years ago

densnoigaskogen commented 2 years ago

Community Note

Terraform Version

Terraform version: 1.1.2

Affected Resource(s)

Terraform Configuration Files

resource "google_container_node_pool" "primary_preemptible_nodes_1" {
  provider = google-beta
  name     = var.gke_cluster_node_pool_name_1

  #Forces terraform to wait for cluster build to be complete before building this node pool
  depends_on = [google_container_cluster.primary]

  location          = var.default_region
  cluster           = var.gke_cluster_name
  max_pods_per_node = var.gke_pool_max_pods_per_node
  # node_count should not be needed when using autoscaling, but without it NO node
  # is created when autoscaling is enabled; this could be a bug.
  # Per the Terraform Registry, "node_count" should not be used alongside autoscaling.
  node_count = 1

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

  node_config {
    preemptible  = true
    machine_type = "n1-standard-8"
    tags         = [var.gke_cluster_name]

    metadata = {
      disable-legacy-endpoints = "true"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/trace.append",
      "https://www.googleapis.com/auth/bigquery",
      "https://www.googleapis.com/auth/pubsub",
      "https://www.googleapis.com/auth/servicecontrol"
    ]
  }
}

Debug Output

Panic Output

Expected Behavior

As per https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool#nested_autoscaling node_count should not be used alongside autoscaling. With the following block:

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

there should be 1 node in the node pool when the cluster is being created.

Actual Behavior

However, the cluster was created without any nodes (node count = 0). I had to add node_count = 1 alongside autoscaling, which contradicts the documentation linked above. Is this a bug, or is the documentation incorrect?
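A possible variation on the workaround described above, sketched under assumptions rather than confirmed in this thread: the node pool resource also accepts an initial_node_count argument, which the provider documentation describes as the initial number of nodes for the pool (per zone in regional or multi-zonal clusters). The resource and pool names below are hypothetical; var.default_region and var.gke_cluster_name are taken from the configuration above.

```hcl
# Hypothetical names; a sketch, not a confirmed fix for this issue.
resource "google_container_node_pool" "example_pool" {
  name     = "example-pool"
  location = var.default_region
  cluster  = var.gke_cluster_name

  # Size the pool at creation time. In regional or multi-zonal clusters
  # this is the number of nodes per zone.
  initial_node_count = 1

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

  lifecycle {
    # Changing initial_node_count forces recreation of the pool, so later
    # changes to it are ignored, as the provider documentation suggests.
    ignore_changes = [initial_node_count]
  }
}
```

Whether this avoids the empty node pool reported here would still need to be verified against the provider version in use.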

megan07 commented 2 years ago

Hi @densnoigaskogen, I'm sorry that you're running into this issue. Would you mind sharing your cluster config as well, please?

seehausen commented 2 years ago

I'm facing the same problem: the node pool was created, but no node was created. I had to set min_node = 1 manually in the console.


resource "google_container_node_pool" "node-pool" {
  provider = google-beta
  project  = module.project.project.project_id
  name     = "regular-pool"
  cluster  = google_container_cluster.cluster.id

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    spot         = false
    machine_type = var.k8s_machine_type
    image_type   = "COS_CONTAINERD"

    shielded_instance_config {
      enable_secure_boot = true
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_write",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/cloud-platform",
    ]
  }
}
resource "google_container_cluster" "cluster" {
  provider = google-beta

  name                      = var.k8s_cluster_name
  description               = "${var.env} cluster"
  project                   = module.project.project.project_id
  location                  = var.k8s_location
  network                   = module.network.network.self_link
  subnetwork                = module.network.subnetworks.subnet.self_link
  enable_kubernetes_alpha   = false
  remove_default_node_pool  = true
  initial_node_count        = 1
  default_max_pods_per_node = 110

  min_master_version = var.k8s_min_cluster_version
  release_channel {
    channel = var.k8s_release_channel
  }

  vertical_pod_autoscaling {
    enabled = true
  }

  private_cluster_config {
    enable_private_endpoint = false
    enable_private_nodes    = true
    master_ipv4_cidr_block  = var.private_cluster_master_ipv4_range
  }

  maintenance_policy {
    daily_maintenance_window {
      start_time = "01:30"
    }
  }

  networking_mode = "VPC_NATIVE"

  # block required to enable VPC-native
  ip_allocation_policy {
  }

  addons_config {
    gce_persistent_disk_csi_driver_config {
      enabled = true
    }
  }

  lifecycle {
    ignore_changes = [initial_node_count]
  }

  workload_identity_config {
    workload_pool = "${module.project.project.project_id}.svc.id.goog"
  }
}


megan07 commented 2 years ago

Hi @seehausen, would you mind sharing your cluster config as well, please?

seehausen commented 2 years ago

I've added the config to my comment above. Is that the config you are looking for?

densnoigaskogen commented 2 years ago

Hi @megan07, thanks for following up on this. My apologies for neglecting this thread since I opened it. Yes, my cluster config is similar to @seehausen's. Could initial_node_count = 1 be the cause?
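For context on that question, a minimal sketch with hypothetical resource names and an illustrative location: per the provider documentation, the cluster-level initial_node_count sets the number of nodes in the cluster's default node pool, while a separately managed google_container_node_pool has its own argument of the same name. This only illustrates the distinction; it is not a confirmed diagnosis of the behavior reported in this issue.

```hcl
# Hypothetical resource names; values are illustrative only.
resource "google_container_cluster" "example" {
  name     = "example-cluster"
  location = "us-central1"

  # Applies to the default node pool, which is deleted after cluster
  # creation because remove_default_node_pool is true.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "example_pool" {
  name     = "example-pool"
  location = google_container_cluster.example.location
  cluster  = google_container_cluster.example.name

  # Separate argument with the same name: sizes this pool at creation.
  initial_node_count = 1

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }
}
```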

kalanithiM commented 2 years ago

I am also facing the same issue: nodes are not getting attached to the GKE node pool when autoscaling is enabled. One thing I don't understand is that the nodes are being created, and I can see them on the GCP Compute Engine VM Instances console page, but they are not getting attached to the node pool. Is there anything I am missing? Can someone please help me resolve this issue?

koslib commented 1 year ago

I was scratching my head wondering why an autoscaling nodepool did not scale up to the minimum amount of nodes defined. Then I found this issue!