zioproto opened 1 year ago
To reproduce with this module, this is the minimal code:
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.56"
    }
  }
  required_version = ">= 1.1.0"
}

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "testResourceGroup2"
  location = "eastus"
}
module "network" {
  source              = "Azure/network/azurerm"
  vnet_name           = azurerm_resource_group.example.name
  resource_group_name = azurerm_resource_group.example.name
  address_space       = "10.52.0.0/16"
  subnet_prefixes     = ["10.52.0.0/16"]
  subnet_names        = ["system"]
  use_for_each        = true
  depends_on          = [azurerm_resource_group.example]
}
module "aks" {
  source  = "Azure/aks/azurerm"
  version = "7.3.0"

  resource_group_name               = azurerm_resource_group.example.name
  role_based_access_control_enabled = true
  rbac_aad                          = false
  prefix                            = "aks"
  network_plugin                    = "azure"
  vnet_subnet_id                    = module.network.vnet_subnets[0]
  os_disk_size_gb                   = 50
  sku_tier                          = "Standard"
  private_cluster_enabled           = false
  enable_auto_scaling               = true
  agents_min_count                  = 1
  agents_max_count                  = 5
  agents_count                      = null # Please set `agents_count` `null` while `enable_auto_scaling` is `true` to avoid possible `agents_count` changes.
  agents_max_pods                   = 100
  agents_pool_name                  = "system"
  agents_availability_zones         = ["1", "2"]
  agents_type                       = "VirtualMachineScaleSets"
  agents_size                       = "Standard_DS3_v2"

  ingress_application_gateway_enabled = false
  network_policy                      = "azure"
  net_profile_dns_service_ip          = "10.0.0.10"
  net_profile_service_cidr            = "10.0.0.0/16"
  storage_profile_enabled             = true
  storage_profile_blob_driver_enabled = true

  network_contributor_role_assigned_subnet_ids = { "system" = module.network.vnet_subnets[0] }

  depends_on = [module.network]
}
and this is the Terraform state drift after creating the first PersistentVolumeClaim:
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply" which may have affected this plan:
# module.aks.azurerm_kubernetes_cluster.main has changed
~ resource "azurerm_kubernetes_cluster" "main" {
id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.ContainerService/managedClusters/aks-aks"
name = "aks-aks"
# (27 unchanged attributes hidden)
~ identity {
+ identity_ids = []
# (3 unchanged attributes hidden)
}
# (7 unchanged blocks hidden)
}
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to
undo or respond to these changes.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
-/+ destroy and then create replacement
+/- create replacement and then destroy
<= read (data resources)
Terraform will perform the following actions:
# module.aks.data.azurerm_resource_group.main will be read during apply
# (depends on a resource or a module with changes pending)
<= data "azurerm_resource_group" "main" {
+ id = (known after apply)
+ location = (known after apply)
+ managed_by = (known after apply)
+ name = "testResourceGroup2"
+ tags = (known after apply)
}
# module.aks.azurerm_kubernetes_cluster.main must be replaced
+/- resource "azurerm_kubernetes_cluster" "main" {
~ api_server_authorized_ip_ranges = [] -> (known after apply)
- custom_ca_trust_certificates_base64 = [] -> null
- enable_pod_security_policy = false -> null
~ fqdn = "aks-r8glnvzu.hcp.eastus.azmk8s.io" -> (known after apply)
+ http_application_routing_zone_name = (known after apply)
~ id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.ContainerService/managedClusters/aks-aks" -> (known after apply)
~ kube_admin_config = (sensitive value)
+ kube_admin_config_raw = (sensitive value)
~ kube_config = (sensitive value)
~ kube_config_raw = (sensitive value)
~ kubernetes_version = "1.26.6" -> (known after apply)
- local_account_disabled = false -> null
~ location = "eastus" # forces replacement -> (known after apply) # forces replacement
name = "aks-aks"
~ node_resource_group = "MC_testResourceGroup2_aks-aks_eastus" -> (known after apply)
~ node_resource_group_id = "/subscriptions/REDACTED/resourceGroups/MC_testResourceGroup2_aks-aks_eastus" -> (known after apply)
+ oidc_issuer_url = (known after apply)
- open_service_mesh_enabled = false -> null
~ portal_fqdn = "aks-r8glnvzu.portal.hcp.eastus.azmk8s.io" -> (known after apply)
+ private_dns_zone_id = (known after apply)
+ private_fqdn = (known after apply)
- tags = {} -> null
# (14 unchanged attributes hidden)
- auto_scaler_profile {
- balance_similar_node_groups = false -> null
- empty_bulk_delete_max = "10" -> null
- expander = "random" -> null
- max_graceful_termination_sec = "600" -> null
- max_node_provisioning_time = "15m" -> null
- max_unready_nodes = 3 -> null
- max_unready_percentage = 45 -> null
- new_pod_scale_up_delay = "0s" -> null
- scale_down_delay_after_add = "10m" -> null
- scale_down_delay_after_delete = "10s" -> null
- scale_down_delay_after_failure = "3m" -> null
- scale_down_unneeded = "10m" -> null
- scale_down_unready = "20m" -> null
- scale_down_utilization_threshold = "0.5" -> null
- scan_interval = "10s" -> null
- skip_nodes_with_local_storage = false -> null
- skip_nodes_with_system_pods = true -> null
}
~ default_node_pool {
- custom_ca_trust_enabled = false -> null
- fips_enabled = false -> null
~ kubelet_disk_type = "OS" -> (known after apply)
name = "system"
~ node_count = 1 -> (known after apply)
~ node_labels = {} -> (known after apply)
- node_taints = [] -> null
- only_critical_addons_enabled = false -> null
~ orchestrator_version = "1.26.6" -> (known after apply)
~ os_sku = "Ubuntu" -> (known after apply)
- tags = {} -> null
+ workload_runtime = (known after apply)
# (14 unchanged attributes hidden)
}
~ identity {
- identity_ids = [] -> null
~ principal_id = "53cd215f-ef23-4b11-98ab-2ff5dbd27ff7" -> (known after apply)
~ tenant_id = "72f988bf-86f1-41af-91ab-2d7cd011db47" -> (known after apply)
# (1 unchanged attribute hidden)
}
- kubelet_identity {
- client_id = "16265252-309a-461b-904b-5fcf3edebc89" -> null
- object_id = "850bb50d-e395-40c5-97be-4a6ea3913604" -> null
- user_assigned_identity_id = "/subscriptions/REDACTED/resourceGroups/MC_testResourceGroup2_aks-aks_eastus/providers/Microsoft.ManagedIdentity/userAssignedIdentities/aks-aks-agentpool" -> null
}
~ network_profile {
+ docker_bridge_cidr = (known after apply)
~ ip_versions = [
- "IPv4",
] -> (known after apply)
+ network_mode = (known after apply)
+ pod_cidr = (known after apply)
~ pod_cidrs = [] -> (known after apply)
~ service_cidrs = [
- "10.0.0.0/16",
] -> (known after apply)
# (6 unchanged attributes hidden)
- load_balancer_profile {
- effective_outbound_ips = [
- "/subscriptions/REDACTED/resourceGroups/MC_testResourceGroup2_aks-aks_eastus/providers/Microsoft.Network/publicIPAddresses/cd9fc956-b3a5-4dc9-9776-5006cab89cd6",
] -> null
- idle_timeout_in_minutes = 0 -> null
- managed_outbound_ip_count = 1 -> null
- managed_outbound_ipv6_count = 0 -> null
- outbound_ip_address_ids = [] -> null
- outbound_ip_prefix_ids = [] -> null
- outbound_ports_allocated = 0 -> null
}
}
~ oms_agent {
~ log_analytics_workspace_id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.OperationalInsights/workspaces/aks-workspace" -> (known after apply)
- msi_auth_for_monitoring_enabled = false -> null
~ oms_agent_identity = [
- {
- client_id = "1406d013-c601-47d9-ac40-7e30983f6349"
- object_id = "bd2dd411-b18a-4e1f-91bc-224c0880eb38"
- user_assigned_identity_id = "/subscriptions/REDACTED/resourcegroups/MC_testResourceGroup2_aks-aks_eastus/providers/Microsoft.ManagedIdentity/userAssignedIdentities/omsagent-aks-aks"
},
] -> (known after apply)
}
- windows_profile {
- admin_username = "azureuser" -> null
}
# (1 unchanged block hidden)
}
# module.aks.azurerm_log_analytics_solution.main[0] must be replaced
-/+ resource "azurerm_log_analytics_solution" "main" {
~ id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.OperationsManagement/solutions/ContainerInsights(aks-workspace)" -> (known after apply)
~ location = "eastus" # forces replacement -> (known after apply) # forces replacement
- tags = {} -> null
~ workspace_resource_id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.OperationalInsights/workspaces/aks-workspace" # forces replacement -> (known after apply) # forces replacement
# (3 unchanged attributes hidden)
~ plan {
~ name = "ContainerInsights(aks-workspace)" -> (known after apply)
# (2 unchanged attributes hidden)
}
}
# module.aks.azurerm_log_analytics_workspace.main[0] must be replaced
+/- resource "azurerm_log_analytics_workspace" "main" {
- cmk_for_query_forced = false -> null
~ id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.OperationalInsights/workspaces/aks-workspace" -> (known after apply)
~ location = "eastus" # forces replacement -> (known after apply) # forces replacement
name = "aks-workspace"
~ primary_shared_key = (sensitive value)
+ reservation_capacity_in_gb_per_day = (known after apply)
~ secondary_shared_key = (sensitive value)
- tags = {} -> null
~ workspace_id = "674fb72f-ff10-4eb2-8844-7b7190983d77" -> (known after apply)
# (8 unchanged attributes hidden)
}
# module.aks.azurerm_role_assignment.network_contributor_on_subnet["system"] must be replaced
-/+ resource "azurerm_role_assignment" "network_contributor_on_subnet" {
~ id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.Network/virtualNetworks/testResourceGroup2/subnets/system/providers/Microsoft.Authorization/roleAssignments/b9dc6b17-2a56-43d5-715f-ef4fa78d63c4" -> (known after apply)
~ name = "b9dc6b17-2a56-43d5-715f-ef4fa78d63c4" -> (known after apply)
~ principal_id = "53cd215f-ef23-4b11-98ab-2ff5dbd27ff7" # forces replacement -> (known after apply) # forces replacement
~ principal_type = "ServicePrincipal" -> (known after apply)
~ role_definition_id = "/subscriptions/REDACTED/providers/Microsoft.Authorization/roleDefinitions/4d97b98b-1d4f-4787-a291-c67834d212e7" -> (known after apply)
+ skip_service_principal_aad_check = (known after apply)
# (2 unchanged attributes hidden)
}
# module.network.azurerm_subnet.subnet_for_each["system"] will be updated in-place
~ resource "azurerm_subnet" "subnet_for_each" {
id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.Network/virtualNetworks/testResourceGroup2/subnets/system"
name = "system"
~ service_endpoints = [
- "Microsoft.Storage",
]
# (8 unchanged attributes hidden)
}
Plan: 4 to add, 1 to change, 4 to destroy.
@lonegunmanb I am not sure how to move forward with this.
The state drift is in a resource outside of the terraform-azurerm-aks module; we just pass the ID of the subnet to the module.
Adding the service endpoint in Terraform fixes the drift:
module "network" {
  source              = "Azure/network/azurerm"
  vnet_name           = azurerm_resource_group.example.name
  resource_group_name = azurerm_resource_group.example.name
  address_space       = "10.52.0.0/16"
  subnet_prefixes     = ["10.52.0.0/16"]
  subnet_names        = ["system"]
  use_for_each        = true
  depends_on          = [azurerm_resource_group.example]

  subnet_service_endpoints = {
    "system" = ["Microsoft.Storage"]
  }
}
However, how can we enforce this when using storage_profile_enabled = true?
I confirm that adding an explicit location in the aks module, like this:
module "aks" {
  source  = "Azure/aks/azurerm"
  version = "7.3.0"

  resource_group_name = azurerm_resource_group.example.name
  location            = "eastus"
  [..CUT..]
There is still state drift, but the impact is much smaller and does not involve destroying and recreating the cluster:
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply" which may have affected this plan:
# module.aks.azurerm_kubernetes_cluster.main has changed
~ resource "azurerm_kubernetes_cluster" "main" {
id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.ContainerService/managedClusters/aks-aks"
name = "aks-aks"
# (27 unchanged attributes hidden)
~ identity {
+ identity_ids = []
# (3 unchanged attributes hidden)
}
# (7 unchanged blocks hidden)
}
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to
undo or respond to these changes.
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
<= read (data resources)
Terraform will perform the following actions:
# module.aks.data.azurerm_resource_group.main will be read during apply
# (depends on a resource or a module with changes pending)
<= data "azurerm_resource_group" "main" {
+ id = (known after apply)
+ location = (known after apply)
+ managed_by = (known after apply)
+ name = "testResourceGroup2"
+ tags = (known after apply)
}
# module.network.azurerm_subnet.subnet_for_each["system"] will be updated in-place
~ resource "azurerm_subnet" "subnet_for_each" {
id = "/subscriptions/REDACTED/resourceGroups/testResourceGroup2/providers/Microsoft.Network/virtualNetworks/testResourceGroup2/subnets/system"
name = "system"
~ service_endpoints = [
- "Microsoft.Storage",
]
# (8 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
Next steps:
- Test with location explicit in the aks module
- Declare a data source to retrieve the subnet information in case the user is bringing their own subnet. Create a precondition to check whether the service endpoint is defined.
- Consider using check blocks: https://developer.hashicorp.com/terraform/language/checks
Thanks @zioproto for tracing this issue. A check block would force the module's callers to upgrade their Terraform core version to >= 1.5.0.
If we use a check block we can generate a warning, and I've checked the data.azurerm_subnet documentation: we can inspect service_endpoints.
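A minimal sketch of that idea (illustrative only, not the module's actual code; the subnet, VNet, and resource group names are assumed from the repro above, and check blocks require Terraform >= 1.5.0):

```hcl
# Sketch: emit a warning (without failing the plan) when the node pool
# subnet lacks the Microsoft.Storage service endpoint.
check "subnet_storage_service_endpoint" {
  # Scoped data source, visible only inside this check block.
  data "azurerm_subnet" "node_pool" {
    name                 = "system"             # assumed from the repro
    virtual_network_name = "testResourceGroup2" # assumed from the repro
    resource_group_name  = "testResourceGroup2" # assumed from the repro
  }

  assert {
    # Assumes service_endpoints is a (possibly empty) list attribute.
    condition     = contains(data.azurerm_subnet.node_pool.service_endpoints, "Microsoft.Storage")
    error_message = "The node pool subnet has no Microsoft.Storage service endpoint; the blob CSI driver will add one on the first PersistentVolumeClaim and cause state drift."
  }
}
```

A failed assert in a check block surfaces as a warning in terraform plan/apply output rather than as an error, so it nudges callers without blocking them.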
Another approach is a data source block containing a postcondition block, which would block a subnet without the proper service endpoints from being used by this AKS module, but would that be overkill?
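The stricter data-source alternative could look like this (again a hedged sketch with assumed names; custom conditions require Terraform >= 1.2):

```hcl
# Sketch: hard-fail the plan when the caller-supplied subnet is missing
# the Microsoft.Storage service endpoint.
data "azurerm_subnet" "node_pool" {
  name                 = "system"             # assumed from the repro
  virtual_network_name = "testResourceGroup2" # assumed from the repro
  resource_group_name  = "testResourceGroup2" # assumed from the repro

  lifecycle {
    postcondition {
      condition     = contains(self.service_endpoints, "Microsoft.Storage")
      error_message = "The subnet passed to the AKS module must declare the Microsoft.Storage service endpoint when the blob CSI driver is enabled."
    }
  }
}
```

Unlike the check block, a failed postcondition is a hard error, so it would also block callers that never create persistent volumes.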
I think using check and generating a warning is okay, because we should not force service_endpoints on customers that don't use persistent volumes.
Introduction
When storage_profile_blob_driver_enabled is true, the CSI driver running in the AKS cluster creates a "Microsoft.Storage" service endpoint as soon as the first PersistentVolumeClaim is created. This change made by the CSI driver is not tracked in the Terraform state and causes state drift. Because of the dependencies between the modules, this state drift causes the destruction and creation of a new cluster.
Is there an existing issue for this?
Greenfield/Brownfield provisioning
greenfield
Terraform Version
1.5.5
Module Version
7.3.0
AzureRM Provider Version
v3.68.0
Affected Resource(s)/Data Source(s)
azurerm_kubernetes_cluster
Terraform Configuration Files
tfvars variables values
Debug Output/Panic Output
Expected Behaviour
Terraform plan should show no changes
Actual Behaviour
Terraform will perform the following actions:
Steps to Reproduce
Apply the following with kubectl. Once the Pod is running the Terraform state has drifted; run terraform again to confirm.
Important Factoids
No response
References
No response