terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources πŸ‡ΊπŸ‡¦
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws
Apache License 2.0

Karpenter is not authorized to perform: ec2:CreateTags on existing WorkerNodes #3134

Open arseny-zinchenko opened 3 weeks ago

arseny-zinchenko commented 3 weeks ago

Description

I've upgraded the "terraform-aws-modules/eks/aws//modules/karpenter" from 20.0 to 20.24.0, and then upgraded the Helm chart version from 0.37.0 to 1.0.1.

After applying the Helm upgrade, Karpenter's logs constantly show the following error for the existing instances:

"error":"tagging nodeclaim, tagging instance, UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws:sts::***:assumed-role/KarpenterIRSA-atlas-eks-ops-1-30-cluster/1724411005093744967 is not authorized to perform: ec2:CreateTags on resource: arn:aws:ec2:us-east-1:***:instance/i-0d715a485281ffc45 because no identity-based policy allows the ec2:CreateTags action.

From the IAM Role, I can see that it has the following statement:

            "Action": "ec2:CreateTags",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/kubernetes.io/cluster/atlas-eks-ops-1-30-cluster": "owned",
                    "ec2:CreateAction": [
                        "RunInstances",
                        "CreateFleet",
                        "CreateLaunchTemplate"
                    ]
                },

If I'm understanding it correctly (and ChatGPT agreed :-) ), ec2:CreateTags is allowed only for newly created instances, not for existing ones: the ec2:CreateAction condition only matches tagging performed as part of a RunInstances, CreateFleet, or CreateLaunchTemplate call. And indeed, when I scale a Deployment so that new NodeClaims are created, they run without any errors in Karpenter's logs.
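For comparison, Karpenter v1.x also tags instances that are already running, so its controller policy scopes ec2:CreateTags by the tags already on the resource rather than by the creating API call. A sketch of what such a statement looks like (illustrative only, modeled on the upstream Karpenter v1 controller policy; the exact Sid, Resource ARN, and tag keys in the generated policy may differ):

```json
{
    "Sid": "AllowScopedResourceTagging",
    "Effect": "Allow",
    "Action": "ec2:CreateTags",
    "Resource": "arn:aws:ec2:us-east-1:*:instance/*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/kubernetes.io/cluster/atlas-eks-ops-1-30-cluster": "owned"
        },
        "StringLike": {
            "aws:ResourceTag/karpenter.sh/nodepool": "*"
        }
    }
}
```

Because this version keys on aws:ResourceTag instead of ec2:CreateAction, it permits tagging instances that already exist, which is exactly what fails with the pre-v1 policy above.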

Versions

Reproduction Code [Required]

The code to deploy Karpenter's module is:

module "karpenter" {
  source = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "20.24.0"

  cluster_name = module.eks.cluster_name

  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]

  create_node_iam_role = false

  node_iam_role_arn = module.eks.eks_managed_node_groups["${local.env_name_short}-default"].iam_role_arn

  enable_irsa             = true
  create_instance_profile = true

  # backward compatibility with 19.21.0
  # see https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-20.0.md#karpenter-diff-of-before-v1921-vs-after-v200
  iam_role_name          = "KarpenterIRSA-${module.eks.cluster_name}"
  iam_role_description   = "Karpenter IAM role for service account"
  iam_policy_name        = "KarpenterIRSA-${module.eks.cluster_name}"
  iam_policy_description = "Karpenter IAM role for service account"

  iam_role_use_name_prefix = false

  # already created during EKS 19 > 20 upgrade with 'authentication_mode = "API_AND_CONFIG_MAP"'
  create_access_entry = false
}
bryantbiggs commented 3 weeks ago

I think this is more of a question for Karpenter, or perhaps it's called out in the Karpenter upgrade guide for v1.0 - we are simply matching the policy that is provided by the project

fraenkel commented 3 weeks ago

Did you set the new var, enable_v1_permissions?
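For anyone hitting the same error: the karpenter submodule exposes a flag to generate the v1 permission set, which scopes ec2:CreateTags so that existing instances can be tagged. A minimal sketch of the change (assuming the remaining module arguments from the reproduction code above stay as they are):

```hcl
module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "20.24.0"

  cluster_name = module.eks.cluster_name

  # Generate the IAM policy matching the Karpenter v1.x controller;
  # needed after upgrading the Helm chart to >= 1.0
  enable_v1_permissions = true

  # ... remaining arguments unchanged ...
}
```

Note that toggling this will replace the controller policy on the next apply, so the running pods pick up the new permissions without a role change.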