aliyun / terraform-provider-alicloud

Terraform AliCloud provider
https://www.terraform.io/docs/providers/alicloud/
Mozilla Public License 2.0

ACK Cluster failed to destroy when there is more than one nodepool #4923

Open Axory opened 2 years ago

Axory commented 2 years ago

Terraform Version

Terraform v1.1.9 on darwin_arm64

Affected Resource(s)

  - alicloud_cs_managed_kubernetes
  - alicloud_cs_kubernetes_node_pool

Terraform Configuration Files

resource "alicloud_vpc" "vpc" {
    cidr_block          = var.vpc_cidr
    vpc_name            = var.vpc_name
    resource_group_id   = data.alicloud_resource_manager_resource_groups.sre.groups.0.id
}

resource "alicloud_vswitch" "zone_a" {
  vpc_id        = resource.alicloud_vpc.vpc.id
  cidr_block    = var.vswitch_malaysia_zone_a_cidr
  zone_id       = var.vswitch_malaysia_zone_a_id
  vswitch_name  = "zone-A-vSwitch"
}

resource "alicloud_cs_managed_kubernetes" "k8s" {
    name                = var.kubernetes_master_name
    version             = var.kubernetes_master_version
    pod_cidr            = var.kubernetes_master_pod_cidr
    service_cidr        = var.kubernetes_master_service_cidr
    proxy_mode          = var.kubernetes_master_proxy_mode
    cluster_spec        = var.kubernetes_master_cluster_spec
    resource_group_id   = data.alicloud_resource_manager_resource_groups.sre.groups.0.id

    worker_vswitch_ids = [resource.alicloud_vswitch.zone_a.id]

    runtime = {
        name    = var.kubernetes_master_runtime_name
        version = var.kubernetes_master_runetime_version
    }

    kube_config         = "kubeconfig"
}

resource "alicloud_key_pair" "k8s_worker_key_pair" {
    key_pair_name = var.kubernete_node_key_pair_name
}

resource "alicloud_cs_kubernetes_node_pool" "monitoring" {
    name                  = var.kubernetes_node_monitoring_name
    resource_group_id     = data.alicloud_resource_manager_resource_groups.sre.groups.0.id
    cluster_id            = resource.alicloud_cs_managed_kubernetes.k8s.id
    key_name              = resource.alicloud_key_pair.k8s_worker_key_pair.key_name
    vswitch_ids           = [
                                resource.alicloud_vswitch.zone_a.id
                            ]
    instance_types        = var.kubernetes_node_monitoring_instance
    system_disk_category  = var.kubernetes_node_monitoring_disk_type
    system_disk_size      = var.kubernetes_node_monitoring_disk_size
    desired_size          = var.kubernetes_node_monitoring_size
    install_cloud_monitor = var.kubernetes_node_monitoring_install_cloud_monitor

    labels {
        key     = "node.type"
        value   = "monitoring"
    }
}

resource "alicloud_cs_kubernetes_node_pool" "frontend" {
    name                  = var.kubernetes_node_frontend_name
    resource_group_id     = data.alicloud_resource_manager_resource_groups.sre.groups.0.id
    cluster_id            = resource.alicloud_cs_managed_kubernetes.k8s.id
    key_name              = resource.alicloud_key_pair.k8s_worker_key_pair.key_name
    vswitch_ids           = [
                                resource.alicloud_vswitch.zone_a.id
                            ]
    instance_types        = var.kubernetes_node_frontend_instance
    system_disk_category  = var.kubernetes_node_frontend_disk_type
    system_disk_size      = var.kubernetes_node_frontend_disk_size
    desired_size          = var.kubernetes_node_frontend_size
    install_cloud_monitor = var.kubernetes_node_frontend_install_cloud_monitor

    labels {
        key     = "node.type"
        value   = "frontend"
    }

    taints {
        key     = "node.type"
        value   = "frontend"
        effect  = "NoSchedule"
    }
}

resource "alicloud_cs_kubernetes_node_pool" "database" {
    name                  = var.kubernetes_node_database_name
    resource_group_id     = data.alicloud_resource_manager_resource_groups.sre.groups.0.id
    cluster_id            = resource.alicloud_cs_managed_kubernetes.k8s.id
    key_name              = resource.alicloud_key_pair.k8s_worker_key_pair.key_name
    vswitch_ids           = [
                                resource.alicloud_vswitch.zone_a.id
                            ]
    instance_types        = var.kubernetes_node_database_instance
    system_disk_category  = var.kubernetes_node_database_disk_type
    system_disk_size      = var.kubernetes_node_database_disk_size
    desired_size          = var.kubernetes_node_database_size
    install_cloud_monitor = var.kubernetes_node_database_install_cloud_monitor

    labels {
        key     = "node.type"
        value   = "database"
    }

    taints {
        key     = "node.type"
        value   = "database"
        effect  = "NoSchedule"
    }
}

Expected Behavior

A single terraform destroy should remove the cluster and all of its node pools. I'm not sure whether I need additional steps, such as kubectl drain on the nodes plus kubectl delete pvc --all, in order to destroy the whole cluster completely in one try.
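For reference, the manual cleanup mentioned above would look roughly like this (a sketch only; <node-name> is a placeholder and the drain flags assume a reasonably recent kubectl):

# list the worker nodes, then drain each one
kubectl get nodes
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# remove the persistent volume claims left behind by the workloads
kubectl delete pvc --all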

Actual Behavior

If you create more than one (non-managed) node pool and then run terraform destroy, it only destroys one of the node pools. I have 3 node pools in the ACK cluster; the other 2 node pools return this error:

[screenshot: error returned while destroying the remaining node pools]

Then, when you run terraform destroy again, it destroys one more node pool.

So I have to run terraform destroy three times in order to destroy everything completely.
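
A possible workaround (just a sketch, not an official fix) is to destroy the node pool resources explicitly before the rest of the configuration; the resource addresses below are the ones from the configuration above:

# destroy the three node pools first
terraform destroy \
  -target=alicloud_cs_kubernetes_node_pool.monitoring \
  -target=alicloud_cs_kubernetes_node_pool.frontend \
  -target=alicloud_cs_kubernetes_node_pool.database

# then destroy the remaining resources, including the cluster itself
terraform destroy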

Steps to Reproduce

  1. terraform apply
  2. Deploy some helm charts
  3. terraform destroy
  4. Error returned
  5. Run terraform destroy again
  6. Error returned
  7. Run terraform destroy a third time

Pangjiping commented 2 years ago

Please scale the number of nodes in each created node pool to zero, then destroy the cluster.
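
For anyone following that suggestion, a minimal sketch, assuming the desired_size values are wired to the variables shown in the configuration above and that the remaining inputs come from defaults or a tfvars file (adjust the variable names to your setup):

# scale every node pool down to zero nodes
terraform apply \
  -var="kubernetes_node_monitoring_size=0" \
  -var="kubernetes_node_frontend_size=0" \
  -var="kubernetes_node_database_size=0"

# once the node pools are empty, tear down the whole cluster
terraform destroy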