hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

Destroy AKS oms_agent block before log_analytics_workspace_id. #9334

Open phivid opened 3 years ago

phivid commented 3 years ago

Community Note

Terraform (and AzureRM Provider) Version

Terraform v0.13.4

Affected Resource(s)

Terraform Configuration Files

# We want to remove this resource and associated resources.
#resource "azurerm_log_analytics_workspace" "aks" {
#  # The WorkSpace name has to be unique across the whole of azure, not just the current subscription/tenant.
#  name                = "log-${local.cluster_name}-${random_id.log_analytics_workspace_name_suffix.dec}"
#  location            = data.azurerm_resource_group.aks_cluster_rg.location
#  resource_group_name = data.azurerm_resource_group.aks_cluster_rg.name
#  sku                 = "PerGB2018"
#  retention_in_days   = 30
#}

resource "azurerm_kubernetes_cluster" "aks_cluster" {
  name                = local.cluster_name
  location            = data.azurerm_resource_group.aks_cluster_rg.location
  resource_group_name = data.azurerm_resource_group.aks_cluster_rg.name
  dns_prefix          = local.cluster_name
  kubernetes_version  = local.master_kubernetes_version

  role_based_access_control {
    enabled = false
  }

  default_node_pool {
    name       = "default"
    node_count = local.vm_count
    vm_size    = local.vm_size
    orchestrator_version = local.pool_kubernetes_version
  }

  service_principal {
    client_id     = local.client_id
    client_secret = local.client_secret
  }

  linux_profile {
    admin_username = local.vmuser

    ssh_key {
      key_data = file(local.public_ssh_key_path)
    }
  }

  network_profile {
    network_plugin    = "kubenet"
    load_balancer_sku = local.loadbalancerSku

    load_balancer_profile {
      outbound_ip_address_ids = [local.public_ip_id]
      idle_timeout_in_minutes = 30
    }
  }

  addon_profile {
    kube_dashboard {
      enabled = true
    }

    # This block must be removed first.
    #oms_agent {
    #  enabled                    = true
    #  log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
    #}
  }
}

Debug Output

None to provide.

Panic Output

https://gist.github.com/philippevidal80/a810aead039c9205d4aec13ebf33ee0b

Expected Behaviour

The oms_agent block should be destroyed before the Log Analytics Workspace.

Actual Behaviour

The Log Analytics Workspace is removed before the AKS oms_agent block, which produces an error.

Steps to Reproduce

  1. terraform apply
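One possible two-step sequence to avoid the ordering failure (a sketch only, assuming the provider accepts `enabled = false` while the workspace still exists; resource names follow the configuration above):

```hcl
# Step 1: disable the agent but keep the workspace resource, then run
# `terraform apply` so the cluster stops referencing the workspace.
addon_profile {
  oms_agent {
    enabled                    = false
    log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
  }
}

# Step 2: only after step 1 has applied successfully, remove the oms_agent
# block and the azurerm_log_analytics_workspace resource, then apply again.
```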

Important Factoids

Resources are deployed in francecentral

References

dglynn commented 3 years ago

We are seeing this behaviour too; there doesn't seem to be an implicit dependency between removing the oms_agent config from a k8s cluster resource and removing the log_analytics_workspace & _solution resources at the same time. This happened to me about 2 days ago (11.1.2021). I had to recreate the log analytics resources before I could eventually remove the oms_agent settings in our clusters.

I hit another issue whereby I couldn't completely remove the oms_agent code from the k8s module resource, as the OP showed above.

I tried removing the oms_agent block from our k8s module as a first step and applied the change successfully. I then applied the removal of the log_analytics resources in our code, and it failed again because of the dependency. The provider may have a bug in that it cannot change an existing cluster's oms_agent setting from enabled: true/false -> null. If your cluster has had this setting previously, the only way for a plan to apply is to leave the value in the addon_profile block as below:

oms_agent {
  enabled = false
}

The docs clearly mark this setting as optional: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster#oms_agent Reading further down the page about the oms_agent, you can see enabled - (Required) Is the OMS Agent Enabled?: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster#enabled

Does this mean that once it has been set in your cluster it can never be removed, i.e. set to null via Terraform using the provider? I don't have any output to show, but an easy test is to set up a k8s cluster with the oms_agent set to true/false along with the log_analytics resources. After that, comment out or remove the oms_agent block and run plan & apply; terraform reports a successful apply. Run another plan and it should show it will set the oms_agent value from enabled: true/false -> null again.
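The drift test described above can be sketched as follows (a hypothetical minimal config illustrating the reported behaviour, not output captured from a real run):

```hcl
# Cluster originally created with the agent setting present:
addon_profile {
  oms_agent {
    enabled = false
  }
}

# After commenting out the oms_agent block, `terraform apply` reports success,
# but every subsequent `terraform plan` proposes the same change again
# (enabled: false -> null), so the setting never actually clears from state.
```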

tombuildsstuff commented 3 years ago

👋

My understanding is that this bug in Terraform Core should be fixed in v0.14.x - would you mind upgrading to the latest release (v0.14.4 at the time of writing) and letting us know if you're still seeing this?

Thanks!

dglynn commented 3 years ago

Hi @tombuildsstuff thanks for the quick reply on this.

We are currently on v0.12.29 and have 2 tickets in our backlog to upgrade our Terraform code to v0.13.x and then v0.14.x. These should hopefully happen in the next couple of months. I guess we are OK for now, as I have removed all the log_analytics resources and set that value to false in all our clusters. Could you link to the bug, thanks?

maurelio1234 commented 3 years ago

Hello,

I just want to add that we have the same problem. We are on Terraform 0.14.3. I tried first removing the oms_agent block and then the azurerm_log_analytics_workspace, and also setting it to false and then destroying the workspace, but no matter what I do the workspace seems to persist on Azure.

Besides, I cannot unset log_analytics_workspace_id on the cluster, as it is mandatory and the provider checks that the linked id actually exists on Azure, even if oms_agent is disabled.

For now, the only thing that "worked" for us was keeping the config as is (and thus keeping the analytics workspace) and setting enabled = false.
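Concretely, the state that "worked" looks like the following sketch (resource and data source names are assumed from the OP's configuration; the workspace name is a placeholder):

```hcl
# Keep the workspace resource in the configuration...
resource "azurerm_log_analytics_workspace" "aks" {
  name                = "log-example" # placeholder; must be globally unique
  location            = data.azurerm_resource_group.aks_cluster_rg.location
  resource_group_name = data.azurerm_resource_group.aks_cluster_rg.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

# ...and disable the agent while still pointing at the workspace id, since
# log_analytics_workspace_id must reference a workspace that exists on Azure.
addon_profile {
  oms_agent {
    enabled                    = false
    log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
  }
}
```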

Our providers (reconstructed as a complete required_providers block):

  terraform {
    required_providers {
      azurerm = {
        source  = "hashicorp/azurerm"
        version = "1.35.0"
      }
      kubernetes = {
        source  = "hashicorp/kubernetes"
        version = "1.13.3"
      }
      null = {
        source  = "hashicorp/null"
        version = "3.1.0"
      }
      helm = {
        source  = "hashicorp/helm"
        version = "1.3.2"
      }
      random = {
        source  = "hashicorp/random"
        version = "3.1.0"
      }
    }
  }