terraform-google-modules / terraform-google-kubernetes-engine

Configures opinionated GKE clusters
https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google
Apache License 2.0

Error creating private-cluster: Error 400 - Request contains an invalid argument., badRequest - resource google_container_cluster.primary #2016

Closed by aeciopires 2 months ago

aeciopires commented 2 months ago

TL;DR

I'm trying to create a standard private cluster using terraform-google-modules/kubernetes-engine/google//modules/private-cluster?version=31.1.0, but I receive this error message:

Error: googleapi: Error 400: Request contains an invalid argument., badRequest
│ 
│   with google_container_cluster.primary,
│   on cluster.tf line 22, in resource "google_container_cluster" "primary":
│   22: resource "google_container_cluster" "primary" {

I'm using Terraform 1.9.2 and Terragrunt 0.63.2.

Can you help me, please?

Expected behavior

Create a standard private cluster using terraform-google-modules/kubernetes-engine/google//modules/private-cluster?version=31.1.0

Observed behavior

terragrunt plan was successful, but terragrunt apply showed these messages with the error:

Note: xpto is a placeholder name.

Error message:

random_string.cluster_service_account_suffix: Creating...
random_string.cluster_service_account_suffix: Creation complete after 0s [id=jaru]
google_service_account.cluster_service_account[0]: Creating...
google_service_account.cluster_service_account[0]: Still creating... [10s elapsed]
google_service_account.cluster_service_account[0]: Creation complete after 13s [id=projects/xpto/serviceAccounts/tf-gke-xpto-jaru@xpto.iam.gserviceaccount.com]
google_project_iam_member.cluster_service_account-nodeService_account[0]: Creating...
google_project_iam_member.cluster_service_account-resourceMetadata-writer[0]: Creating...
google_project_iam_member.cluster_service_account-artifact-registry["xpto"]: Creating...
google_project_iam_member.cluster_service_account-gcr["xpto"]: Creating...
google_project_iam_member.cluster_service_account-metric_writer[0]: Creating...
google_container_cluster.primary: Creating...
google_project_iam_member.cluster_service_account-artifact-registry["xpto"]: Creation complete after 9s [id=xpto/roles/artifactregistry.reader/serviceAccount:tf-gke-xptot-jaru@xpto.iam.gserviceaccount.com]
google_project_iam_member.cluster_service_account-gcr["xpto"]: Still creating... [10s elapsed]
google_project_iam_member.cluster_service_account-nodeService_account[0]: Still creating... [10s elapsed]
google_project_iam_member.cluster_service_account-metric_writer[0]: Still creating... [10s elapsed]
google_project_iam_member.cluster_service_account-resourceMetadata-writer[0]: Still creating... [10s elapsed]
google_project_iam_member.cluster_service_account-nodeService_account[0]: Creation complete after 10s [id=xpto/roles/container.defaultNodeServiceAccount/serviceAccount:tf-gke-xpto-jaru@xpto.iam.gserviceaccount.com]
google_project_iam_member.cluster_service_account-resourceMetadata-writer[0]: Creation complete after 10s [id=xpto/roles/stackdriver.resourceMetadata.writer/serviceAccount:tf-gke-xpto-st-jaru@xpto.iam.gserviceaccount.com]
google_project_iam_member.cluster_service_account-gcr["xpto"]: Creation complete after 10s [id=xpto/roles/storage.objectViewer/serviceAccount:tf-gke-xpto-jaru@xpto.iam.gserviceaccount.com]
google_project_iam_member.cluster_service_account-metric_writer[0]: Creation complete after 11s [id=xpto/roles/monitoring.metricWriter/serviceAccount:tf-gke-xpto-jaru@xpto.iam.gserviceaccount.com]

Error: googleapi: Error 400: Request contains an invalid argument., badRequest
│ 
│   with google_container_cluster.primary,
│   on cluster.tf line 22, in resource "google_container_cluster" "primary":
│   22: resource "google_container_cluster" "primary" {

Terraform Configuration

Terragrunt module inputs used:

Note: xpto is a placeholder name.

  #--------------------------
  # General
  #--------------------------
  name           = "xpto"
  project_id     = "xpto"
  region         = "us-central1"
  regional       = true
  node_locations = ["us-central1-a", "us-central1-b"]

  cluster_dns_provider = "PLATFORM_DEFAULT"
  dns_cache            = false

  # The Kubernetes version of the master and nodes in the cluster.
  # More info:
  # https://cloud.google.com/kubernetes-engine/docs/release-notes
  # https://cloud.google.com/kubernetes-engine/docs/release-schedule
  kubernetes_version = "1.29.6-gke.1038001"
  # The release channel of this cluster. Accepted values are `UNSPECIFIED`, `RAPID`, `REGULAR` and `STABLE`. Defaults to `REGULAR`.
  release_channel    = "REGULAR"

  #--------------------------
  # Registry
  #--------------------------
  grant_registry_access = true

  #--------------------------
  # Addons
  #--------------------------
  http_load_balancing             = true
  enable_l4_ilb_subsetting        = true
  gce_pd_csi_driver               = true
  filestore_csi_driver            = true
  enable_vertical_pod_autoscaling = false
  horizontal_pod_autoscaling      = true

  #--------------------------
  # Maintenance
  #--------------------------
  maintenance_start_time = "02:00"
  maintenance_end_time   = "" # Duration is 4 hours
  maintenance_recurrence = "" # Daily maintenance window

  #--------------------------
  # Network
  #--------------------------
  #master_ipv4_cidr_block      = "x.x.x.x/28"
  network                       = "xpto"
  network_project_id            = "xpto"
  subnetwork                    = "xpto"
  ip_range_pods                 = "xpto"
  ip_range_services             = "xpto"
  zones                         = ["us-central1-a", "us-central1-b"]
  stack_type                    = "IPV4"
  master_global_access_enabled  = true
  deploy_using_private_endpoint = false
  enable_private_endpoint       = false
  enable_private_nodes          = true
  #network_tags                = ["terraform", "xpto"]
  deletion_protection           = false
  master_authorized_networks    = [
    {
      cidr_block   = "x.x.x.x/32"
      display_name = "Home"
    }
  ]

  #--------------------------
  # Storage, log and monitoring
  #--------------------------
  logging_service    = "logging.googleapis.com/kubernetes"
  monitoring_service = "monitoring.googleapis.com/kubernetes"

  monitoring_enable_managed_prometheus    = true
  monitoring_enable_observability_metrics = true
  monitoring_enabled_components           = [
    "SYSTEM_COMPONENTS",
    "WORKLOADS",
    "STORAGE",
    "POD",
    "DEPLOYMENT",
    "STATEFULSET",
    "DAEMONSET",
    "HPA",
    "CADVISOR",
    "KUBELET"
  ]

  #--------------------------
  # Worker node
  #--------------------------
  remove_default_node_pool = true
  initial_node_count       = 0
  # Reference: https://cloud.google.com/compute/docs/general-purpose-machines?hl=pt-br
  # Node pool names must start with a lowercase letter followed by up to 39 lowercase letters, numbers, or hyphens.
  node_pools = [{
    name               = "xpto-np1"
    initial_node_count = 2  # Considering the same number of nodes per zone: a and b
    min_count          = 2  # Considering the same number of nodes per zone: a and b
    max_count          = 20 # Considering the same number of nodes per zone: a and b
    machine_type       = "n1-standard-4"
    disk_size_gb       = 50
    disk_type          = "pd-ssd"
    auto_repair        = true
    auto_upgrade       = true
    preemptible        = false
    max_pods_per_node  = 110
  }]

  cluster_autoscaling = {
    enabled             = true
    autoscaling_profile = "BALANCED"
    disk_size           = 50
    disk_type           = "pd-ssd"
    min_cpu_cores       = 1
    max_cpu_cores       = 80   # CPU cores of machine type multiplied by max_count
    min_memory_gb       = 1
    max_memory_gb       = 300  # Memory of machine type multiplied by max_count
    gpu_resources       = []
    image_type          = "COS_CONTAINERD"
    auto_repair         = true
    auto_upgrade        = true
    strategy            = "SURGE"
    max_surge           = 1
    max_unavailable     = 0
  }

  node_pools_labels = {
    all = merge(
      local.default_tags,
      {
        cluster = "xpto"
      }
    )
  }

  # Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/access-scopes?hl=pt-br
  node_pools_oauth_scopes = {
    all = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/service.management.readonly",
      "https://www.googleapis.com/auth/servicecontrol",
      "https://www.googleapis.com/auth/trace.append"
    ]
  }

  node_pools_tags = {
    all = ["terraform", "xpto"]
  }

  cluster_resource_labels = merge(
    local.default_tags,
    {
      cluster = "xpto"
    }
  )

Terraform Version

Terraform: 1.9.2
Terragrunt: 0.63.2

Additional information

I was able to create the cluster twice using the web console (ClickOps approach) with the same configuration, but I can't create the cluster using the Terraform module.

I'm using Terragrunt and Terraform together with the Terraform module "terraform-google-modules/kubernetes-engine/google//modules/private-cluster?version=31.1.0"; a rough sketch of the wiring is below.
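
For reference, the module is wired up from Terragrunt roughly like this. The terragrunt.hcl below is only an illustrative sketch (the real file is not shown in this issue); the tfr:/// registry source syntax and the sample inputs are assumptions copied from the values above.

  # terragrunt.hcl (illustrative sketch, not the actual file)
  terraform {
    # Pull the private-cluster submodule from the Terraform Registry at a pinned version
    source = "tfr:///terraform-google-modules/kubernetes-engine/google//modules/private-cluster?version=31.1.0"
  }

  inputs = {
    # The module inputs listed above go here, for example:
    name       = "xpto"
    project_id = "xpto"
    region     = "us-central1"
    regional   = true
    # ...
  }
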

aeciopires commented 2 months ago

Hi, guys!

Luiz Vinhas helped me and solved the problem.

I created the cluster using just these inputs. Compared with the failing configuration above, this one sets master_ipv4_cidr_block, reuses an existing service account (create_service_account = false), disables enable_l4_ilb_subsetting and filestore_csi_driver, and drops the DNS and extended monitoring/observability settings, among other differences.

I tested it twice.

Note: xpto is a placeholder name.

  #--------------------------
  # General
  #--------------------------
  name       = "xpto"
  project_id = "xpto"
  region     = "us-central1"
  regional   = true

  #--------------------------
  # MISCELLANEOUS
  #--------------------------
  service_account        = "xpto"
  create_service_account = false

  # The Kubernetes version of the master and nodes in the cluster.
  # More info:
  # https://cloud.google.com/kubernetes-engine/docs/release-notes
  # https://cloud.google.com/kubernetes-engine/docs/release-schedule
  kubernetes_version = "1.29.6-gke.1254000"
  # The release channel of this cluster. Accepted values are `UNSPECIFIED`, `RAPID`, `REGULAR` and `STABLE`. Defaults to `REGULAR`.
  release_channel    = "REGULAR"

  #--------------------------
  # Registry
  #--------------------------
  grant_registry_access = true

  #--------------------------
  # Addons
  #--------------------------
  http_load_balancing             = true
  enable_l4_ilb_subsetting        = false
  gce_pd_csi_driver               = true
  filestore_csi_driver            = false
  enable_vertical_pod_autoscaling = false
  horizontal_pod_autoscaling      = true

  #--------------------------
  # Maintenance
  #--------------------------
  maintenance_start_time = "02:00"
  maintenance_end_time   = "" # Duration is 4 hours
  maintenance_recurrence = "" # Daily maintenance window

  #--------------------------
  # Network
  #--------------------------
  master_ipv4_cidr_block       = "X.X.X.X/28"
  network                      = "xpto"
  network_project_id           = "xpto"
  subnetwork                   = "xpto"
  ip_range_pods                = "xpto"
  ip_range_services            = "xpto"
  zones                        = ["us-central1-a", "us-central1-b"]
  master_global_access_enabled = true
  enable_private_endpoint      = false
  enable_private_nodes         = true
  deletion_protection          = false
  master_authorized_networks   = [
    {
      cidr_block   = "X.X.X.X/32"
      display_name = "Home"
    }
  ]

  #--------------------------
  # Worker node
  #--------------------------
  remove_default_node_pool = true
  initial_node_count       = 0
  # Reference: https://cloud.google.com/compute/docs/general-purpose-machines?hl=pt-br
  # Node pool names must start with a lowercase letter followed by up to 39 lowercase letters, numbers, or hyphens.
  node_pools = [
    {
      # spot nodepool
      name               = "xpto-spt1"
      spot               = true
      service_account    = "xpto"
      initial_node_count = 2 # Considering the same number of nodes per zone: a and b
      min_count          = 1
      max_count          = 20
      machine_type       = "n1-standard-4"
      node_locations     = "us-central1-a,us-central1-b"
      image_type         = "COS_CONTAINERD"
      disk_size_gb       = 50
      disk_type          = "pd-ssd"
      auto_repair        = true
      auto_upgrade       = true
      preemptible        = false
      max_pods_per_node  = 110
    },
    {
      # on-demand nodepool
      name               = "xpto-odm1"
      spot               = false
      service_account    = "xpto"
      initial_node_count = 2  # Considering the same number of nodes per zone: a and b
      min_count          = 1
      max_count          = 20
      machine_type       = "n1-standard-4"
      node_locations     = "us-central1-a,us-central1-b"
      image_type         = "COS_CONTAINERD"
      disk_size_gb       = 50
      disk_type          = "pd-ssd"
      auto_repair        = true
      auto_upgrade       = true
      preemptible        = false
      max_pods_per_node  = 110
    },
  ]

  cluster_autoscaling = {
    enabled             = true
    autoscaling_profile = "BALANCED"
    disk_size           = 50
    disk_type           = "pd-ssd"
    min_cpu_cores       = 1
    max_cpu_cores       = 80   # CPU cores of machine type multiplied by max_count
    min_memory_gb       = 1
    max_memory_gb       = 300  # Memory of machine type multiplied by max_count
    gpu_resources       = []
    image_type          = "COS_CONTAINERD"
    auto_repair         = true
    auto_upgrade        = true
    strategy            = "SURGE"
    max_surge           = 1
    max_unavailable     = 0
  }

  node_pools_labels = {
    all = merge(
      local.default_tags,
      {
        cluster = "xpto"
      }
    )
  }

  cluster_resource_labels = merge(
    local.default_tags,
    {
      cluster = local.cluster_shortname
    }
  )