hashicorp / terraform-provider-kubernetes

Terraform Kubernetes provider
https://www.terraform.io/docs/providers/kubernetes/
Mozilla Public License 2.0

Deleted EKS cluster results in service account 403 and terraform errors #2036

Closed: alexandrujieanu closed this issue 2 months ago

alexandrujieanu commented 1 year ago

Terraform Version, Provider Version and Kubernetes Version

Terraform version: 1.3.3
Kubernetes provider version: 2.18.1
Kubernetes version: 1.24

Affected Resource(s)

Deleted AWS EKS cluster

Terraform Configuration Files

eks.tf


module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.10.0"

  cluster_name                    = "${local.env}-${local.region}-eks-cluster"
  cluster_version                 = "1.24"
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = false

  cluster_addons = {
    coredns = {
      most_recent = true
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {
      most_recent = true
      resolve_conflicts = "OVERWRITE"
    }
    vpc-cni = {
      most_recent = true
      resolve_conflicts = "OVERWRITE"
    }
  }

  create_kms_key = false
  cluster_encryption_config = []

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  cluster_additional_security_group_ids = [aws_security_group.vpn_https_access.id]

  eks_managed_node_group_defaults = {
    ami_type               = "AL2_x86_64"
    disk_size              = 10
    instance_types         = local.eks_instance_types
    capacity_type          = local.eks_capacity_type
    vpc_security_group_ids = [aws_security_group.vpn_ssh_access.id]
  }

  eks_managed_node_groups = {
    blue = {
      min_size     = 1
      max_size     = 6
      desired_size = 3
      # Remote access cannot be specified with a launch template
      use_custom_launch_template = false

      remote_access = {
        ec2_ssh_key               = aws_key_pair.KEY-NAME-HERE.key_name
        source_security_group_ids = [aws_security_group.vpn_ssh_access.id]
      }

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 10
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 150
            encrypted             = false
            delete_on_termination = true
          }
        }
      }
    }
    green = {
      min_size     = 0
      max_size     = 6
      desired_size = 0
    }
  }

  cluster_security_group_additional_rules = {
    egress_nodes_ephemeral_ports_tcp = {
      description                = "To node 1025-65535"
      protocol                   = "tcp"
      from_port                  = 1025
      to_port                    = 65535
      type                       = "egress"
      source_node_security_group = true
    }
  }

  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = {
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = local.sso_role_arn
      username = "cluster-admin"
      groups   = ["system:masters"]
    },
  ]

}

providers.tf

data "aws_caller_identity" "current" {}

provider "aws" {
  region = "us-east-1"

  assume_role {
    role_arn     = "arn:aws:iam::ACCOUNT-ID-HERE:role/ROLE-NAME-HERE"
  }

  default_tags {
    tags = {
      Environment = local.env
      Region      = local.region
    }
  }
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name, "--role-arn", "arn:aws:iam::ACCOUNT-ID-HERE:role/ROLE-NAME-HERE"]
  }

}

locals.tf

sso_role_arn       = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/AWSReservedSSO_AdministratorAccess_ACCESS-STRING-HERE"

Debug Output

https://gist.github.com/alexandrujieanu/317f94b035d1d6a47ae463211da00e05

Panic Output

n/a

Steps to Reproduce

  1. Provision an EKS cluster using terraform-aws-modules/eks v19.10.0.
  2. Delete the EKS cluster from the AWS Console.
  3. Run terraform plan/apply/destroy.
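
A rough CLI sketch of these steps, assuming the configuration above; the cluster name is a hypothetical value of "${local.env}-${local.region}-eks-cluster" from eks.tf, and the console deletion in step 2 is shown as its CLI equivalent:

# 1. Provision the cluster from the configuration above.
terraform init
terraform apply

# 2. Delete the cluster outside of Terraform (the report used the AWS Console;
#    this is the CLI equivalent; managed node groups must be deleted first).
aws eks delete-cluster --name dev-us-east-1-eks-cluster

# 3. Any subsequent plan/apply/destroy now fails as described below.
terraform plan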

Expected Behavior

Terraform should have detected that the EKS cluster was deleted and planned to recreate it.

Actual Behavior

Terraform plan/apply/destroy fails with the following error:

╷
│ Error: configmaps "aws-auth" is forbidden: User "system:serviceaccount:SERVICE-ACCOUNT-NAME-HERE:default" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
│ 
│   with module.eks.kubernetes_config_map_v1_data.aws_auth[0],
│   on .terraform/modules/eks/main.tf line 550, in resource "kubernetes_config_map_v1_data" "aws_auth":
│  550: resource "kubernetes_config_map_v1_data" "aws_auth" {
│ 
╵

Important Factoids

n/a

References

n/a


alexandrujieanu commented 1 year ago

This helped, but I think the provider should not error like that.

terraform state rm module.eks.kubernetes_config_map_v1_data.aws_auth[0]

Removed module.eks.kubernetes_config_map_v1_data.aws_auth[0]
Successfully removed 1 resource instance(s).
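
For context, a minimal sketch of how this workaround fits into recovery, assuming the module layout above (the resource address is the one from the error message):

# See which Kubernetes resources the eks module still tracks in state.
terraform state list | grep kubernetes_

# Drop the stranded aws-auth entry, then plan again; with the stale resource
# gone, Terraform can plan the cluster recreation instead of erroring.
terraform state rm 'module.eks.kubernetes_config_map_v1_data.aws_auth[0]'
terraform plan
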
cristianpirtea commented 1 year ago

I am facing the same situation when I trigger my pipeline in order to destroy the Kubernetes objects from the EKS cluster:

Error: Kubernetes Client kubernetes client initialization failed: the server has asked for the client to provide credentials. Error: configmaps "aws-auth" is forbidden: User "system:serviceaccount:default:default" cannot delete resource "configmaps" in API group "" in the namespace "kube-system"

The strange behaviour is that when I'm creating or updating the Kubernetes objects, the kubernetes provider does not complain that it cannot authenticate to EKS. I did the same workaround as you, @alexandrujieanu, but it is ugly to delete all the k8s objects from the Terraform state.
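
A rough sketch (not from this thread) of doing that removal in bulk rather than resource by resource, assuming every stranded object has "kubernetes_" in its state address:

# Review the matches first, then remove them from state in one pass; the loop
# keeps addresses containing brackets or quotes intact.
terraform state list | grep 'kubernetes_'
terraform state list | grep 'kubernetes_' | while read -r addr; do
  terraform state rm "$addr"
done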

github-actions[bot] commented 3 months ago

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!