Azure / terraform-azurerm-aks

Terraform Module for deploying an AKS cluster

Terraform wants to replace my cluster #557

Closed Israphel closed 1 month ago

Israphel commented 4 months ago

Is there an existing issue for this?

Greenfield/Brownfield provisioning

brownfield

Terraform Version

1.5.5

Module Version

8.0.0

AzureRM Provider Version

3.106.0

Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster

Terraform Configuration Files

# AKS clusters (EU North)
module "aks-eu-north" {
  source   = "Azure/aks/azurerm"
  version  = "8.0.0"
  for_each = local.config[local.environment]["aks"]["eu-north"]

  prefix                            = each.value.name
  resource_group_name               = module.resource-group-eu-north["default"].name
  node_resource_group               = "${each.value.name}-nodes"
  kubernetes_version                = each.value.kubernetes_version.control_plane
  orchestrator_version              = each.value.kubernetes_version.node_pool
  oidc_issuer_enabled               = true
  workload_identity_enabled         = true
  agents_pool_name                  = "default"
  agents_availability_zones         = ["1", "2", "3"]
  agents_type                       = "VirtualMachineScaleSets"
  agents_size                       = try(each.value.agents_size, "Standard_D2s_v3")
  temporary_name_for_rotation       = "tmp"
  enable_auto_scaling               = true
  agents_count                      = null
  agents_min_count                  = try(each.value.agents_min_count, 1)
  agents_max_count                  = try(each.value.agents_max_count, 3)
  azure_policy_enabled              = true
  log_analytics_workspace_enabled   = try(each.value.log_analytics_workspace_enabled, true)
  log_retention_in_days             = try(each.value.log_retention_in_days, 30)
  network_plugin                    = "azure"
  load_balancer_sku                 = "standard"
  ebpf_data_plane                   = "cilium"
  os_disk_size_gb                   = try(each.value.os_disk_size_gb, 30)
  rbac_aad                          = true
  rbac_aad_managed                  = true
  rbac_aad_azure_rbac_enabled       = true
  role_based_access_control_enabled = true
  rbac_aad_admin_group_object_ids   = [local.inputs["groups"]["infra"]]
  sku_tier                          = "Standard"
  vnet_subnet_id                    = module.virtual-network-eu-north["default"].vnet_subnets_name_id["nodes"]
  pod_subnet_id                     = module.virtual-network-eu-north["default"].vnet_subnets_name_id["pods"]
  agents_labels                     = try(each.value.agents_labels, {})
  agents_tags                       = try(each.value.agents_tags, {})

  tags = {
    environment = local.environment
    region      = module.resource-group-eu-north["default"].location
    managed_by  = "terraform"
  }

  providers = {
    azurerm = azurerm.eu-north
  }
}

tfvars variables values

    eun-1:
      name: eun-prod-1
      kubernetes_version:
        control_plane: 1.29.2
        node_pool: 1.29.2
      log_analytics_workspace_enabled: false
      agents_size: Standard_B4s_v2
      agents_min_count: 1
      agents_max_count: 8
      os_disk_size_gb: 60
      agents_labels:
        node.kubernetes.io/node-type: default

Debug Output/Panic Output

# module.aks-eu-north["eun-1"].azurerm_kubernetes_cluster.main must be replaced
+/- resource "azurerm_kubernetes_cluster" "main" {
      ~ api_server_authorized_ip_ranges     = [] -> (known after apply)
      - cost_analysis_enabled               = false -> null
      ~ current_kubernetes_version          = "1.29.2" -> (known after apply)
      - custom_ca_trust_certificates_base64 = [] -> null
      - enable_pod_security_policy          = false -> null
      ~ fqdn                                = "eun-prod-1-k2q3x6en.hcp.northeurope.azmk8s.io" -> (known after apply)
      - http_application_routing_enabled    = false -> null
      + http_application_routing_zone_name  = (known after apply)
      ~ id                                  = "/subscriptions/5181fe1e-1064-432e-8d21-5ad0d3f86e9b/resourceGroups/prod-eu-north/providers/Microsoft.ContainerService/managedClusters/eun-prod-1-aks" -> (known after apply)
      ~ kube_admin_config                   = (sensitive value)
      ~ kube_admin_config_raw               = (sensitive value)
      ~ kube_config                         = (sensitive value)
      ~ kube_config_raw                     = (sensitive value)
      - local_account_disabled              = false -> null
        name                                = "eun-prod-1-aks"
      ~ node_resource_group_id              = "/subscriptions/5181fe1e-1064-432e-8d21-5ad0d3f86e9b/resourceGroups/eun-prod-1-nodes" -> (known after apply)
      ~ oidc_issuer_url                     = "https://northeurope.oic.prod-aks.azure.com/e46dcf00-9155-4b3f-aabc-61af2e446cd1/b79048c1-3760-41d1-95c5-9000ee47978c/" -> (known after apply)
      - open_service_mesh_enabled           = false -> null
      ~ portal_fqdn                         = "eun-prod-1-k2q3x6en.portal.hcp.northeurope.azmk8s.io" -> (known after apply)
      + private_dns_zone_id                 = (known after apply)
      + private_fqdn                        = (known after apply)
        tags                                = {
            "environment" = "prod"
            "managed_by"  = "terraform"
            "region"      = "northeurope"
        }
        # (17 unchanged attributes hidden)

      - auto_scaler_profile {
          - balance_similar_node_groups      = false -> null
          - empty_bulk_delete_max            = "10" -> null
          - expander                         = "random" -> null
          - max_graceful_termination_sec     = "600" -> null
          - max_node_provisioning_time       = "15m" -> null
          - max_unready_nodes                = 3 -> null
          - max_unready_percentage           = 45 -> null
          - new_pod_scale_up_delay           = "0s" -> null
          - scale_down_delay_after_add       = "10m" -> null
          - scale_down_delay_after_delete    = "10s" -> null
          - scale_down_delay_after_failure   = "3m" -> null
          - scale_down_unneeded              = "10m" -> null
          - scale_down_unready               = "20m" -> null
          - scale_down_utilization_threshold = "0.5" -> null
          - scan_interval                    = "10s" -> null
          - skip_nodes_with_local_storage    = false -> null
          - skip_nodes_with_system_pods      = true -> null
        }

      ~ azure_active_directory_role_based_access_control {
          ~ tenant_id              = "e46dcf00-9155-4b3f-aabc-61af2e446cd1" -> (known after apply)
            # (3 unchanged attributes hidden)
        }

      ~ default_node_pool {
          - custom_ca_trust_enabled      = false -> null
          - fips_enabled                 = false -> null
          ~ kubelet_disk_type            = "OS" -> (known after apply)
          ~ max_pods                     = 250 -> (known after apply)
            name                         = "default"
          ~ node_count                   = 5 -> (known after apply)
          - node_taints                  = [] -> null
          - only_critical_addons_enabled = false -> null
          ~ os_sku                       = "Ubuntu" -> (known after apply)
            tags                         = {
                "environment" = "prod"
                "managed_by"  = "terraform"
                "region"      = "northeurope"
            }
          + workload_runtime             = (known after apply)
            # (17 unchanged attributes hidden)

          - upgrade_settings {
              - drain_timeout_in_minutes      = 30 -> null # forces replacement
              - max_surge                     = "10%" -> null
              - node_soak_duration_in_minutes = 10 -> null
            }
        }

      ~ identity {
          - identity_ids = [] -> null
          ~ principal_id = "c2a8e310-0f33-4c13-b2a7-f5003149e590" -> (known after apply)
          ~ tenant_id    = "e46dcf00-9155-4b3f-aabc-61af2e446cd1" -> (known after apply)
            # (1 unchanged attribute hidden)
        }

      - kubelet_identity {
          - client_id                 = "cffcdc3a-2c74-4f8f-9edc-6646572bb1d2" -> null
          - object_id                 = "89acc530-d5b8-405e-9e63-c791fb0ada3d" -> null
          - user_assigned_identity_id = "/subscriptions/5181fe1e-1064-432e-8d21-5ad0d3f86e9b/resourceGroups/eun-prod-1-nodes/providers/Microsoft.ManagedIdentity/userAssignedIdentities/eun-prod-1-aks-agentpool" -> null
        }

      ~ network_profile {
          ~ dns_service_ip          = "10.0.0.10" -> (known after apply)
          + docker_bridge_cidr      = (known after apply)
          ~ ip_versions             = [
              - "IPv4",
            ] -> (known after apply)
          + network_mode            = (known after apply)
          ~ network_policy          = "cilium" -> (known after apply)
          ~ outbound_ip_address_ids = [] -> (known after apply)
          ~ outbound_ip_prefix_ids  = [] -> (known after apply)
          + pod_cidr                = (known after apply)
          ~ pod_cidrs               = [] -> (known after apply)
          ~ service_cidr            = "10.0.0.0/16" -> (known after apply)
          ~ service_cidrs           = [
              - "10.0.0.0/16",
            ] -> (known after apply)
            # (4 unchanged attributes hidden)

          - load_balancer_profile {
              - effective_outbound_ips      = [
                  - "/subscriptions/5181fe1e-1064-432e-8d21-5ad0d3f86e9b/resourceGroups/eun-prod-1-nodes/providers/Microsoft.Network/publicIPAddresses/92c8f5cf-86d1-493f-8911-4d5bd3eb7205",
                ] -> null
              - idle_timeout_in_minutes     = 0 -> null
              - managed_outbound_ip_count   = 1 -> null
              - managed_outbound_ipv6_count = 0 -> null
              - outbound_ip_address_ids     = [] -> null
              - outbound_ip_prefix_ids      = [] -> null
              - outbound_ports_allocated    = 0 -> null
            }
        }

      - windows_profile {
          - admin_username = "azureuser" -> null
        }
    }

More info

The only thing I did was apply the node soak duration via the command line, since the module doesn't support it yet, but I wouldn't expect the whole cluster to be destroyed just for that.
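Roughly, the out-of-band change looked like this (the flags and values mirror the cluster name and resource group in the plan output above and the reproduction command at the end of this thread):

az aks nodepool update --cluster-name eun-prod-1-aks --resource-group prod-eu-north --name default --max-surge 10% --node-soak-duration 10 --drain-timeout 30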

The same issue does not occur with provider 3.105.0.

zioproto commented 4 months ago

This is triggered in provider v3.106.0 because of this PR: https://github.com/hashicorp/terraform-provider-azurerm/pull/26137

tagging @ms-henglu and @stephybun

Is it correct that changing drain_timeout_in_minutes forces a replacement of the cluster?

          - upgrade_settings {
              - drain_timeout_in_minutes      = 30 -> null # forces replacement
              - max_surge                     = "10%" -> null
              - node_soak_duration_in_minutes = 10 -> null
            }

I see in the docs that --drain-timeout can be used with both the nodepool add and update commands. The same should apply to the default node pool.

Is it only the "unsetting" to null that forces the resource replacement? Would any other value be accepted?

https://learn.microsoft.com/en-us/azure/aks/upgrade-aks-cluster?tabs=azure-cli#set-node-drain-timeout-value

@Israphel the module does not support this feature yet, as tracked in https://github.com/Azure/terraform-azurerm-aks/issues/530. You should not have changed settings in AKS outside of Terraform, because that caused the state drift you are facing now.

I am not sure whether you can revert the change via the CLI so that the ARM API returns drain_timeout_in_minutes = null again, which would resolve the Terraform state drift.

As a temporary workaround, I suggest pinning the Terraform provider version to v3.105.0 until this module supports the drain_timeout_in_minutes option.
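A minimal sketch of that pin, assuming the provider is declared in a standard required_providers block:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      # Temporary workaround: stay on the last version that does not
      # force cluster replacement on this upgrade_settings drift.
      version = "3.105.0"
    }
  }
}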

Israphel commented 4 months ago

Thanks, went back to 105.

We're changing the default node pool instance type and the rotation behaviour is extremely aggressive. Is there any way to make it gentler without soak time support, as of today?

zioproto commented 3 months ago

@Israphel this PR is now merged: https://github.com/Azure/terraform-azurerm-aks/pull/564

Is it possible for you to pin the module at commit 5858b260a1d6a9d2ee3687a08690e8932ca86af1?

for example:

  module "aks" {
    source = git::https://github.com/Azure/terraform-azurerm-aks.git?ref=5858b260a1d6a9d2ee3687a08690e8932ca86af1
    [..CUT..]

and then set your configuration for drain_timeout_in_minutes.
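For example (the two upgrade-settings inputs below are my assumption of what PR #564 adds; please check the PR for the exact variable names):

module "aks" {
  source = "git::https://github.com/Azure/terraform-azurerm-aks.git?ref=5858b260a1d6a9d2ee3687a08690e8932ca86af1"

  # ... your existing inputs unchanged ...

  # Assumed input names, to be verified against PR #564:
  agents_pool_drain_timeout_in_minutes      = 30
  agents_pool_node_soak_duration_in_minutes = 10
  agents_pool_max_surge                     = "10%"
}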

This should unblock you until there is a new release that includes the feature.

Please let us know if this works for you. Thanks

Israphel commented 3 months ago

Hello. I actually got unblocked by going back to 3.105.0, then applying/refreshing; after that I could continue upgrading normally and the state drift was fixed.
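In practice the recovery flow was along these lines (a sketch; the exact commands may have differed):

# re-initialize with the provider pinned back to 3.105.0
terraform init -upgrade

# refresh the state so the out-of-band upgrade_settings change is recorded
terraform apply -refresh-only

# verify the plan no longer wants to replace the cluster
terraform plan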

lonegunmanb commented 3 months ago

I've tried to reproduce this issue with the following config, @Israphel @zioproto, but I can't reproduce it:

resource "random_id" "prefix" {
  byte_length = 8
}

resource "random_id" "name" {
  byte_length = 8
}

resource "azurerm_resource_group" "main" {
  count = var.create_resource_group ? 1 : 0

  location = var.location
  name     = coalesce(var.resource_group_name, "${random_id.prefix.hex}-rg")
}

locals {
  resource_group = {
    name     = var.create_resource_group ? azurerm_resource_group.main[0].name : var.resource_group_name
    location = var.location
  }
}

resource "azurerm_virtual_network" "test" {
  address_space       = ["10.52.0.0/16"]
  location            = local.resource_group.location
  name                = "${random_id.prefix.hex}-vn"
  resource_group_name = local.resource_group.name
}

resource "azurerm_subnet" "test" {
  address_prefixes                               = ["10.52.0.0/24"]
  name                                           = "${random_id.prefix.hex}-sn"
  resource_group_name                            = local.resource_group.name
  virtual_network_name                           = azurerm_virtual_network.test.name
  enforce_private_link_endpoint_network_policies = true
}

resource "azurerm_subnet" "pod" {
  address_prefixes = ["10.52.1.0/24"]
  name                 = "${random_id.prefix.hex}-pod"
  resource_group_name  = local.resource_group.name
  virtual_network_name = azurerm_virtual_network.test.name
  enforce_private_link_endpoint_network_policies = true
}

# resource "azurerm_resource_group" "nodepool" {
#   location = local.resource_group.location
#   name     = "f557-nodepool"
# }

module "aks-eu-north" {
  source   = "Azure/aks/azurerm"
  version  = "8.0.0"

  prefix                            = "f557"
  resource_group_name               = local.resource_group.name
  node_resource_group               = "f557-nodepool${random_id.name.hex}"
  kubernetes_version                = "1.29.2"
  orchestrator_version              = "1.29.2"
  oidc_issuer_enabled               = true
  workload_identity_enabled         = true
  agents_pool_name                  = "default"
  agents_availability_zones         = ["1", "2", "3"]
  agents_type                       = "VirtualMachineScaleSets"
  agents_size                       = try("Standard_B4s_v2", "Standard_D2s_v3")
  temporary_name_for_rotation       = "tmp"
  enable_auto_scaling               = true
  agents_count                      = null
  agents_min_count                  = 1
  agents_max_count                  = 8
  azure_policy_enabled              = true
  log_analytics_workspace_enabled   = false
  log_retention_in_days             = 30
  network_plugin                    = "azure"
  load_balancer_sku                 = "standard"
  ebpf_data_plane                   = "cilium"
  os_disk_size_gb                   = 60
  rbac_aad                          = true
  rbac_aad_managed                  = true
  rbac_aad_azure_rbac_enabled       = true
  role_based_access_control_enabled = true
#   rbac_aad_admin_group_object_ids   = [local.inputs["groups"]["infra"]]
  sku_tier                          = "Standard"
  vnet_subnet_id                    = azurerm_subnet.test.id
  pod_subnet_id                     = azurerm_subnet.pod.id
  agents_labels                     = {}
  agents_tags                       = {}
}

After apply, I updated the node soak duration via the Azure CLI:

az aks nodepool update --cluster-name f557-aks --resource-group ba4d95fcea318222-rg --name default --node-soak-duration 5

Then I ran terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # azurerm_subnet.pod will be updated in-place
  ~ resource "azurerm_subnet" "pod" {
        id                                             = "/subscriptions/xxxxxxxxxxxx/resourceGroups/ba4d95fcea318222-rg/providers/Microsoft.Network/virtualNetworks/ba4d95fcea318222-vn/subnets/ba4d95fcea318222-pod"
        name                                           = "ba4d95fcea318222-pod"
        # (11 unchanged attributes hidden)

      - delegation {
          - name = "aks-delegation" -> null

          - service_delegation {
              - actions = [
                  - "Microsoft.Network/virtualNetworks/subnets/join/action",
                ] -> null
              - name    = "Microsoft.ContainerService/managedClusters" -> null
            }
        }
    }

  # module.aks-eu-north.azurerm_kubernetes_cluster.main will be updated in-place
  ~ resource "azurerm_kubernetes_cluster" "main" {
        id                                  = "/subscriptions/xxxxxxxxxxxx/resourceGroups/ba4d95fcea318222-rg/providers/Microsoft.ContainerService/managedClusters/f557-aks"
        name                                = "f557-aks"
        tags                                = {}
        # (39 unchanged attributes hidden)

      ~ default_node_pool {
            name                          = "default"
            tags                          = {}
            # (33 unchanged attributes hidden)

          - upgrade_settings {
              - drain_timeout_in_minutes      = 0 -> null
              - max_surge                     = "10%" -> null
              - node_soak_duration_in_minutes = 0 -> null
            }
        }

        # (6 unchanged blocks hidden)
    }

Plan: 0 to add, 2 to change, 0 to destroy.

We were not able to reproduce this issue on our side.

We've also consulted the service team, but we have no idea where this 30 came from. @Israphel, could you please give us a minimal example that reproduces this issue?

Israphel commented 3 months ago

Try with:

az aks nodepool update --cluster-name f557-aks --resource-group ba4d95fcea318222-rg --name default --max-surge 10% --node-soak-duration 10 --drain-timeout 30