terraform-google-modules / terraform-google-kubernetes-engine

Configures opinionated GKE clusters
https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google
Apache License 2.0

`master_ipv4_cidr_block` is not optional on private autopilot cluster if `add_cluster_firewall_rules` is true #1938

Closed fpacifici closed 1 month ago

fpacifici commented 4 months ago

TL;DR

Version 30.3.0 made master_ipv4_cidr_block default to null for private autopilot clusters. If add_cluster_firewall_rules is true and master_ipv4_cidr_block is not provided, plan and apply fail with an error that is tricky to troubleshoot (null value in a list).

Expected behavior

I am not 100% sure what the expected behavior should be here. I see the reason for not defaulting to 10.0.0.0/28 (https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/1902). Though I am not sure whether it should be possible to enable add_cluster_firewall_rules without specifying a master_ipv4_cidr_block for the cluster.

Either way, I think a clearer validation error message or docs would be a good idea (see observed behavior).
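To illustrate the kind of validation meant here, the following is a minimal sketch, not the module's actual code. It assumes Terraform 1.4 or later (for `terraform_data`), and the resource name `require_master_cidr` and the error text are made up for the example. The same condition could presumably be extended to cover add_master_webhook_firewall_rules and add_shadow_firewall_rules.

```hcl
terraform {
  required_version = ">= 1.4" # terraform_data is built in from 1.4 onward
}

variable "add_cluster_firewall_rules" {
  type    = bool
  default = true
}

variable "master_ipv4_cidr_block" {
  type    = string
  default = null
}

# Hypothetical guard resource: it manages nothing and exists only to carry
# the precondition, so the plan fails with a readable message.
resource "terraform_data" "require_master_cidr" {
  lifecycle {
    precondition {
      condition     = !var.add_cluster_firewall_rules || var.master_ipv4_cidr_block != null
      error_message = "master_ipv4_cidr_block must be set when add_cluster_firewall_rules is true."
    }
  }
}
```

With the defaults above, `terraform plan` should stop at the precondition and print the custom message instead of "Null value found in list".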

Observed behavior

When creating a private autopilot cluster with add_cluster_firewall_rules enabled, if I do not set master_ipv4_cidr_block explicitly, I get this instead of a validation error:

│ Error: Null value found in list                                                          
│                                            
│   with module.gke.google_compute_firewall.intra_egress[0],                               
│   on .terraform/modules/gke/modules/beta-autopilot-private-cluster/firewall.tf line 37, in resource "google_compute_firewall" "intra_egress":
│   37:   destination_ranges = concat([                                         
│   38:     local.cluster_endpoint_for_nodes,                                                                                                                    
│   39:     local.cluster_subnet_cidr,
│   40:     ],                                                                             
│   41:     local.pod_all_ip_ranges                                                        
│   42:   )                                                                                
│                                            
│ Null values are not allowed for this attribute value.                                    
╵                                            
╷                                            
│ Error: Null value found in list            
│                                                                                          
│   with module.gke.google_compute_firewall.master_webhooks[0],                                                                                                  
│   on .terraform/modules/gke/modules/beta-autopilot-private-cluster/firewall.tf line 102, in resource "google_compute_firewall" "master_webhooks":
│  102:   source_ranges = [local.cluster_endpoint_for_nodes]                                                                                                     
│                                            
│ Null values are not allowed for this attribute value.
╵                                                                                          
╷                                                                                          
│ Error: Null value found in list                                                          
│                                                                                          
│   with module.gke.google_compute_firewall.shadow_allow_master[0],                        
│   on .terraform/modules/gke/modules/beta-autopilot-private-cluster/firewall.tf line 158, in resource "google_compute_firewall" "shadow_allow_master":
│  158:   source_ranges = [local.cluster_endpoint_for_nodes]                               
│                  
│ Null values are not allowed for this attribute value.                                                                                                          

Terraform Configuration

- Create a GCP project with a GCS bucket for the terraform state (pretty much any setting would work and you do not need to save state in GCS)
- Create a network and a subnetwork (how it is set up does not matter much to trigger this issue)

Plan this

terraform {
  backend "gcs" {
    bucket = "your bucket"
    prefix = "your prefix"
  }
}

provider "google" {
  project               = "YOUR_PROJECT"
  billing_project       = "YOUR_PROJECT"
  user_project_override = true
}
provider "google-beta" {
  project               = "YOUR_PROJECT"
  billing_project       = "YOUR_PROJECT"
  user_project_override = true
}

terraform {
  required_version = "~> 1.4"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.22"
    }
  }
}

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-autopilot-private-cluster"
  version = "~> 30.2"

  name            = "gke"
  release_channel = "RAPID"

  project_id = "YOUR_PROJECT"
  region     = "YOUR_REGION"
  zones      = ["YOUR_ZONES"]
  network    = "YOUR_NETWORK"
  subnetwork = "YOUR_SUBNET"

  # no direct access from public internet
  enable_private_endpoint = true
  enable_private_nodes    = true
  master_authorized_networks = [
    {
      display_name = "private /16"
      cidr_block   = "192.168.0.0/16"
    },
    {
      display_name = "private /12"
      cidr_block   = "172.16.0.0/12"
    },
    {
      display_name = "private /8"
      cidr_block   = "10.0.0.0/8"
    },
  ]

  add_cluster_firewall_rules        = true
  add_master_webhook_firewall_rules = true
  add_shadow_firewall_rules         = true

  deletion_protection = false

  allow_net_admin = true
}

I can provide more details if needed, but I noticed that the issue is fairly easy to repro as long as add_cluster_firewall_rules is true and master_ipv4_cidr_block is not provided. The stack trace above shows where master_ipv4_cidr_block is expected to not be null.
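For contrast, here is a sketch of the same module call with master_ipv4_cidr_block set explicitly, which should avoid the null in the firewall rules. The CIDR `172.16.0.0/28` is only an assumption for illustration (pick a /28 that does not overlap anything in your VPC), and the `var.*` inputs are hypothetical stand-ins for the placeholders used in the configuration above.

```hcl
# Hypothetical inputs standing in for the placeholders in the repro.
variable "project_id" { type = string }
variable "region" { type = string }
variable "network" { type = string }
variable "subnetwork" { type = string }

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-autopilot-private-cluster"
  version = "~> 30.2"

  name       = "gke"
  project_id = var.project_id
  region     = var.region
  network    = var.network
  subnetwork = var.subnetwork

  enable_private_endpoint = true
  enable_private_nodes    = true

  # Set explicitly so local.cluster_endpoint_for_nodes is not null.
  # Assumed value; any free /28 in the VPC should do.
  master_ipv4_cidr_block = "172.16.0.0/28"

  add_cluster_firewall_rules        = true
  add_master_webhook_firewall_rules = true
  add_shadow_firewall_rules         = true

  deletion_protection = false
}
```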


### Terraform Version

```sh
Terraform v1.8.1
on darwin_arm64
```

### Additional information

The issue is trivial to spot here
https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/master/firewall.tf#L38

`cluster_endpoint_for_nodes` is set to `var.master_ipv4_cidr_block` in main.tf:

cluster_endpoint_for_nodes = var.master_ipv4_cidr_block



So as long as we try to create that firewall rule and master_ipv4_cidr_block is null, plan fails.

DevoFalcon commented 3 months ago

Any workaround for this? Or is it just to manage the firewall rules ourselves?

edit: Looks like autopilot clusters don't support mutating webhooks (https://cloud.google.com/kubernetes-engine/enterprise/policy-controller/docs/how-to/mutation). Seems like add_cluster_firewall_rules and master_ipv4_cidr_block are not intended to be used on private autopilot clusters if the master CIDR block is not set (or because the master CIDR is not exposed?)
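If managing the firewall rules yourself is the route taken, a sketch of the module call would turn off the three flags that correspond to the failing resources in the error output (intra_egress, master_webhooks, shadow_allow_master). This assumes those resources are only created when their flag is true, which the `[0]` indices in the errors suggest but which is not confirmed here. The `var.*` inputs are hypothetical.

```hcl
# Hypothetical inputs; substitute your own values.
variable "project_id" { type = string }
variable "region" { type = string }
variable "network" { type = string }
variable "subnetwork" { type = string }

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-autopilot-private-cluster"
  version = "~> 30.2"

  name       = "gke"
  project_id = var.project_id
  region     = var.region
  network    = var.network
  subnetwork = var.subnetwork

  enable_private_endpoint = true
  enable_private_nodes    = true

  # master_ipv4_cidr_block is left unset (null since 30.3.0), so none of the
  # module-managed firewall rules that reference it are requested.
  add_cluster_firewall_rules        = false
  add_master_webhook_firewall_rules = false
  add_shadow_firewall_rules         = false

  deletion_protection = false
}
```

Any rules the cluster still needs would then have to be declared separately as `google_compute_firewall` resources in your own configuration.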

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days