hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

error while applying taints to gke node pool. #16606

Open adviteey1 opened 10 months ago

adviteey1 commented 10 months ago

Hi,

I'm facing an issue with taints after upgrading the Google provider from 4.62.1 to 5.0.0.

The following is the Terraform code:

resource "google_container_node_pool" "taint_nodes" {
  for_each   = local.taint_nodes
  name       = each.key
  location   = var.cluster_location
  cluster    = google_container_cluster.insight-cluster.name
  initial_node_count = each.value["min_node_count"]
  node_locations = each.value["node_locations"]
  max_pods_per_node = each.value["max_pods_per_node"]
  autoscaling {
    min_node_count = each.value["min_node_count"]
    max_node_count = each.value["max_node_count"]
  }
  node_config {
    disk_size_gb = each.value["disk_size_gb"]
    disk_type = each.value["disk_type"]
    guest_accelerator {
      type = each.value["guest_accelerator_type"]
      count = each.value["guest_accelerator_count"]
    }
    machine_type = each.value["machine_type"]
    preemptible  = each.value["preemptible"]
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/cloud-platform",
    ]
    service_account = join(",", [
      for sa in data.google_service_account.default:
        sa.email
    ])
    tags = toset([data.google_project.project.project_id, var.project, "gke-${var.cluster_name}"])
    labels = each.value["labels"]
    taint  = each.value["taints"]
  }

  lifecycle {
    ignore_changes = [ node_config[0].taint, initial_node_count ]
  }
}

variable:

variable "node_pools" {
  type = map(object({
    node_locations = list(string)
    min_node_count = number,
    max_node_count = number,
    max_pods_per_node = number,
    disk_size_gb = number,
    disk_type = string,
    guest_accelerator_type = string,
    guest_accelerator_count = number,
    machine_type = string,
    preemptible  = bool,
    labels = map(string),
    taints = list(object({
      key = string,
      value = string,
      effect = string
    }))
  }))
  default = {
    node1 = {
      node_locations = []
      min_node_count = 1,
      max_node_count = 2,
      max_pods_per_node = 110,
      disk_size_gb = 10,
      disk_type = "pd-standard",
      guest_accelerator_type = "nvidia-tesla-k80",
      guest_accelerator_count = 0,
      machine_type = "e2-medium",
      preemptible  = true,
      labels = {},
      taints = [
         {
             "effect": "NO_SCHEDULE",
               "key": "node-purpose",
                "value": "platform-apps"
          }
      ]
    }
  }
}

Error:

│ Error: Unsupported argument
│
│   on main.tf line 219, in resource "google_container_node_pool" "taint_nodes":
│  219:     taint  = each.value["taints"]
│
│ An argument named "taint" is not expected here. Did you mean to define a
│ block of type "taint"?

It works fine with version 4.62.1 but not with 5.0.0. How can I resolve this? I appreciate the help.

b/313874078

edwardmedia commented 10 months ago

@adviteey1 Can you try to put taint in google_container_cluster instead?

adviteey1 commented 10 months ago

@edwardmedia Thank you for the response. It works if I put taint in google_container_cluster. In both versions (4.x.x and 5.x.x), the Terraform documentation has no reference to taint under google_container_node_pool, only under google_container_cluster. Am I missing something, or do I need to change anything to make it work with 5.x.x?

edwardmedia commented 9 months ago

I can repro the issue. Switching the version in the required_providers block below gives a different result:

resource "google_service_account" "default" {
  account_id   = "service-account-id"
  display_name = "Service Account"
}

resource "google_container_cluster" "primary" {
  name     = "my-gke-cluster"
  location = "us-central1"
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-medium"

    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    taint  = [
         {
             "effect": "NO_SCHEDULE",
               "key": "node-purpose",
                "value": "platform-apps"
          }
      ]
  }
}

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "5.5.0"
      # version = "4.62.1"
    }
  }
}

adviteey1 commented 9 months ago

Hi, I see the label changed to bug. Is this a bug in the 5.x.x versions?

zzorica commented 9 months ago

I tried provider versions 5.5 through 5.8 and it's the same bug. So we can't set taints on the nodes in the pool at the moment...

sanadhis commented 9 months ago

Hi, I'm facing the same issue. Basically, removing the previously set taint argument doesn't trigger any changes at all.

Terraform version: 1.5.3
Google provider version: 5.8.0

Test:

resource "google_container_node_pool" "node-pool" {
  ...

  node_config {
    ...

    #Test
    #taint = var.taints
  }
}

^ This results in zero changes.

zzorica commented 9 months ago

OK, so the syntax is a little different now. This one will work:

    taint {
      effect = "NO_SCHEDULE"
      key    = "key1"
      value  = "value1"
    }

    taint {
      effect = "NO_SCHEDULE"
      key    = "key2"
      value  = "value2"
    }

Tested in 5.8 and all good.
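
For context, here is a sketch of how the earlier repro config might look with the block syntax applied. It reuses the resource names and taint values from the repro above and trims unrelated arguments; treat it as an illustration, not a tested configuration.

```hcl
resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-medium"

    # With provider 5.x, taint is written as a nested block
    # rather than as a list-valued argument (taint = [...]).
    taint {
      key    = "node-purpose"
      value  = "platform-apps"
      effect = "NO_SCHEDULE"
    }
  }
}
```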

danninov commented 8 months ago

Any update for this?

xpicio commented 7 months ago

Hello, any news about this?

I'm using:

Terraform v1.4.2
on darwin_arm64
+ provider registry.terraform.io/hashicorp/google v5.15.0
+ provider registry.terraform.io/hashicorp/google-beta v5.15.0

and I'm facing the same issue:

❯ terraform apply        
╷
│ Error: Unsupported argument
│ 
│   on container-engine.tf line 210, in resource "google_container_node_pool" "container_node_pool_kungfu":
│  210:     taint = [
│ 
│ An argument named "taint" is not expected here. Did you mean to define a block of type "taint"?

The taint can be created on the cluster or on the node pool. In my use case I can't set it on the cluster, because the cluster has multiple node pools.

Have a nice day 👋

adviteey1 commented 7 months ago

As @zzorica mentioned, the syntax changed. The following works for me:

taint {
  effect = "NO_SCHEDULE"
  key    = "key1"
  value  = "value1"
}

If you have multiple taints, use a dynamic block as below.

dynamic "taint" {
  for_each = var.taints
  content {
    # The iterator is named after the block label ("taint").
    key    = taint.value.key
    value  = taint.value.value
    effect = taint.value.effect
  }
}
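
Applied to the configuration from the original report, where each pool carries its own taints list, the dynamic block might be wired in as sketched below. This assumes local.taint_nodes has the same shape as the node_pools variable shown earlier (a taints list of objects with key, value, and effect); other arguments are trimmed for brevity.

```hcl
resource "google_container_node_pool" "taint_nodes" {
  for_each = local.taint_nodes
  name     = each.key
  location = var.cluster_location
  cluster  = google_container_cluster.insight-cluster.name

  node_config {
    machine_type = each.value["machine_type"]

    # Generates one taint block per element of this pool's taints list.
    # An empty list produces no taint blocks.
    dynamic "taint" {
      for_each = each.value["taints"]
      content {
        key    = taint.value.key
        value  = taint.value.value
        effect = taint.value.effect
      }
    }
  }
}
```

This replaces the `taint = each.value["taints"]` line that provider 5.x rejects, while keeping the taints defined per node pool.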

ericitaquera commented 7 months ago

Hello!

I'm from the future!

I'd like to tell you all that today (19-Feb-2024) the documentation (for all provider versions I looked at) still lacks this information.

It'd be great to have it updated, so we can spend our time on more valuable things.

https://registry.terraform.io/providers/hashicorp/google-beta/latest/docs/resources/container_node_pool

Best Regards!

shivannakarthik commented 7 months ago

The one below worked for me. What is the process to update the docs?

resource "google_container_node_pool" "special-pool" {
  node_config {
    taint {
      key    = "special"
      value  = "special-value"
      effect = "special-effect"
    }
  }
}
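
Note that "special-effect" above reads as a placeholder: the provider documentation lists NO_SCHEDULE, PREFER_NO_SCHEDULE, and NO_EXECUTE as the accepted effect values. If taints are passed in through a variable, a validation block can catch a bad effect at plan time. A minimal sketch, assuming a hypothetical variable named taints and Terraform 0.14 or later (for alltrue):

```hcl
variable "taints" {
  type = list(object({
    key    = string
    value  = string
    effect = string
  }))
  default = []

  validation {
    # Every taint must use one of the effect values accepted by the provider.
    condition = alltrue([
      for t in var.taints :
      contains(["NO_SCHEDULE", "PREFER_NO_SCHEDULE", "NO_EXECUTE"], t.effect)
    ])
    error_message = "Each taint effect must be NO_SCHEDULE, PREFER_NO_SCHEDULE, or NO_EXECUTE."
  }
}
```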

MMirelli commented 3 months ago

OK, so the syntax is a little different now. This one will work:

    taint {
      effect = "NO_SCHEDULE"
      key    = "key1"
      value  = "value1"
    }

    taint {
      effect = "NO_SCHEDULE"
      key    = "key2"
      value  = "value2"
    }

Tested in 5.8 and all good.

Thank you @zzorica! This also works for v5.34.0.

The v5.34.0 docs are still confusing: https://registry.terraform.io/providers/hashicorp/google/5.34.0/docs/resources/container_cluster#taint says "A list of Kubernetes taints to apply to nodes".