hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

`azurerm_kubernetes_cluster` & `azurerm_kubernetes_cluster_node_pool` - `os_sku` change throws PropertyChangeNotAllowed #26730

Closed: cberge908 closed this 3 weeks ago

cberge908 commented 1 month ago

Terraform Version

OpenTofu v1.7.3

AzureRM Provider Version

v3.113.0

Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster, azurerm_kubernetes_cluster_node_pool

Terraform Configuration Files

module "tf-aks" {
  source                = ***REDACTED***
  use_naming_convention = false
  depends_on            = [module.tf-aks-routetable] # Description: The AKS will be installed after the Azure route table was created. (list)

  resource_data = {
    name                      = "aks-${local.sid}"
    resource_group_name       = data.azurerm_resource_group.this.name # Description: Specifies the Resource Group where the Managed Kubernetes Cluster should exist. Changing this forces a new resource to be created. (string)
    location                  = data.azurerm_resource_group.this.location
    dns_prefix                = "aks-${local.sid}" # Description: DNS prefix specified when creating the managed cluster. Changing this forces a new resource to be created. (string)
    kubernetes_version        = "1.30"
    sku_tier                  = "Standard"
    automatic_channel_upgrade = "patch"
    node_os_channel_upgrade   = "NodeImage"
    local_account_disabled    = false
    cost_analysis_enabled     = true

    api_server_access_profile = {
      authorized_ip_ranges = ***REDACTED***
    }

    network_profile = {
      outbound_type       = "userDefinedRouting"
      network_plugin      = "azure"
      network_plugin_mode = "overlay"
      network_mode        = "transparent"
      network_policy      = "azure"       # Description: Set this variable to "calico" in case network policies should be enabled. Leave it to <null> in case it is not needed. (string)
      dns_service_ip      = "10.0.0.10"   # Description: IP address within the Kubernetes service address range that will be used by cluster service discovery (kube-dns). Changing this forces a new resource to be created. (string)
      service_cidr        = "10.0.0.0/16" # Description: The Network Range used by the Kubernetes service. Changing this forces a new resource to be created. (string)
    }

    azure_active_directory_role_based_access_control = {
      tenant_id              = local.tenant_id
      managed                = true
      admin_group_object_ids = [***REDACTED***]
    }

    identity = {
      identity_ids = [data.azurerm_user_assigned_identity.msi.id]
      type         = "UserAssigned"
    }

    default_node_pool = {
      vnet_subnet_id       = data.azurerm_subnet.this.id # Description: The ID of a Subnet where the Kubernetes Node Pool should exist. Changing this forces a new resource to be created. (string)
      name                 = "default"                   # Action required. Description: The name of the default nodepool of this AKS, e.g. <default>. (string)
      orchestrator_version = "1.30"                      # Action required. Description: The desired Kubernetes agent version of the default nodepool. (string)
      vm_size              = "Standard_D8s_v5"           # Action required. Description: The virtual machine type used for the worker nodes, e.g. <Standard_B2s>. (string)
      node_count           = 6                           # Action required. Description: The amount of worker nodes of the AKS. Set this variable to <null> in case autoscaling is used. (number)
      enable_autoscaling   = false                       # Description: Decide if autoscaling should be activated (true) or deactivated (false). (boolean)
      min_count            = null                        # Description: The minimum amount of worker nodes of the AKS in case autoscaling is set to <true>. In case autoscaling is set to <false>, then set this variable to <null>. (number)
      max_count            = null                        # Description: The maximum amount of worker nodes of the AKS in case autoscaling is set to <true>. In case autoscaling is set to <false>, then set this variable to <null>. (number)
      max_pods             = 110                         # Description: The maximum number of pods that can run on each agent. Changing this forces a new resource to be created. (number)
      osdisk_size_gb       = 120                         # Description: The size of the OS Disk which should be used for each agent in the Node Pool. Changing this forces a new resource to be created. (number)
      zones                = ["1", "2", "3"]             # Description: A list of Availability Zones across which the Node Pool should be spread. Changing this forces a new resource to be created. (list(string))
      os_sku               = "AzureLinux"

      upgrade_settings = {
        drain_timeout_in_minutes = 30
      }

    }

    key_vault_secrets_provider = { # Description: (Optional) - Enables the secret provider CSI driver within the AKS cluster. (boolean)
      secret_rotation_enabled  = true
      secret_rotation_interval = "2m" # Description: (Optional) - Sets the rotation interval for secrets coming from the secrets provider CSI driver. Variable aks_enable_key_vaults_secret_provider needs to be set to 'true', otherwise this setting has no effect. Default is '2m'. (string)
    }

    maintenance_window_auto_upgrade = {
      frequency   = "Weekly"
      interval    = 1
      duration    = 4
      day_of_week = "Sunday"
      start_time  = "16:00"
      utc_offset  = "+01:00"
    }

    maintenance_window_node_os = {
      frequency   = "Weekly"
      interval    = 1
      duration    = 4
      day_of_week = "Sunday"
      start_time  = "16:00"
      utc_offset  = "+01:00"
    }

    tags = merge(***REDACTED***)

  }
}

Debug Output/Panic Output

│ Error: updating Default Node Pool Agent Pool (Subscription: ***REDACTED***
│ Resource Group Name: ***REDACTED***
│ Managed Cluster Name: ***REDACTED***
│ Agent Pool Name: "default") performing CreateOrUpdate: unexpected status 400 (400 Bad Request) with response: {
│   "code": "PropertyChangeNotAllowed",
│   "details": null,
│   "message": "Changing property 'agentPoolProfile.OSSKU' is not allowed.",
│   "subcode": "",
│   "target": "agentPoolProfile.OSSKU"
│  }
│
│   with module.tf-aks.azurerm_kubernetes_cluster.this,
│   on .terraform/modules/tf-aks/main.tf line 44, in resource "azurerm_kubernetes_cluster" "this":
│   44: resource "azurerm_kubernetes_cluster" "this" {
│

Expected Behaviour

We changed os_sku from "Ubuntu" (the original setting) to "AzureLinux". We expect the node pool to be updated in place with the AzureLinux image, as described in https://learn.microsoft.com/en-us/azure/azure-linux/tutorial-azure-linux-migration?tabs=terraform

Actual Behaviour

tofu apply / terraform apply fails with the error shown above.

Steps to Reproduce

  1. Change os_sku on existing AKS cluster with the current setting Ubuntu to AzureLinux
  2. Run terraform apply or tofu apply
  3. Receive the above mentioned error
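
For readers without access to the redacted module, the steps above can be sketched against the resource directly. This is a minimal, hypothetical configuration and not the reporter's actual setup: the resource names, location, node count, and identity type are placeholder assumptions. Only the `os_sku` change reflects what is described in this issue.

```hcl
# Hypothetical minimal reproduction; every name and size here is a placeholder.
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "rg-aks-example"
  location = "westeurope"
}

resource "azurerm_kubernetes_cluster" "example" {
  name                = "aks-example"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "aks-example"

  default_node_pool {
    name       = "default"
    vm_size    = "Standard_D8s_v5"
    node_count = 1

    # Step 1: create the cluster with os_sku = "Ubuntu",
    # then change the value to "AzureLinux" and apply again.
    os_sku = "AzureLinux"
  }

  identity {
    type = "SystemAssigned"
  }
}
```

Applying once with `os_sku = "Ubuntu"` and a second time with `"AzureLinux"` exercises the same in-place update of the default node pool that returned PropertyChangeNotAllowed in this report.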

Important Factoids

none

References

No response

ms-henglu commented 1 month ago

Hi @rcskosir,

Thank you for taking the time to report this issue.

This requires that the preview feature Microsoft.ContainerService/OSSKUMigrationPreview is enabled and the resource provider is re-registered. See the documentation for more information:

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster#os_sku

cberge908 commented 1 month ago

Hi @ms-henglu

The MS documentation no longer mentions this preview feature on the page. I also cannot find it in the list of preview features that can be enabled in my subscriptions.

Side note: because this preview feature no longer exists, I opened a PR yesterday to remove this info box from the provider documentation - https://github.com/hashicorp/terraform-provider-azurerm/pull/26719

Any other ideas?

Cheers, @cberge908

ptrautberg commented 1 month ago

@rcskosir, the PropertyChangeNotAllowed error isn't related to the AzureRM provider; it is the AKS API's response to the OS SKU change request. The reason is that your cluster doesn't support that functionality yet, and neither does mine. It is still being rolled out to some regions, so you have to wait until your cluster gets the 2024-07-16 release.

cberge908 commented 3 weeks ago

@ptrautberg thanks for the input. As this is related to the AKS API, we can close this issue from my point of view. We were able to change the property in the meantime.