terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources πŸ‡ΊπŸ‡¦
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws
Apache License 2.0

Karpenter pod can't use policies created by karpenter module when we have eks and vpc with aws_eks_cluster module #3146

Open sadath-12 opened 1 week ago

sadath-12 commented 1 week ago

Description

I am trying to integrate the karpenter module into my existing aws_eks_cluster setup, but the policies and roles the karpenter module creates are not getting picked up by my existing cluster, and the Karpenter pod fails with permission errors such as missing SQS permissions, being unable to list images, etc. I can see the roles created in the console, but they are not used by the pods; attaching an admin policy to the worker nodes makes it work.

So the question is: how can I tell the karpenter module to attach those roles and policies to my existing cluster? I can't find an example covering this.

βœ‹ I have searched the open/closed issues and my issue is not listed.

Versions

Reproduction Code [Required]

Steps to reproduce the behavior:

Create the cluster with the following configuration:


resource "aws_eks_cluster" "eks_cluster" {
  name     = var.cluster_name
  role_arn = aws_iam_role.eks_master_role.arn
  version  = var.kubernetes_version

  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  access_config {
    authentication_mode                         = "API_AND_CONFIG_MAP"
    bootstrap_cluster_creator_admin_permissions = true
  }

  vpc_config {
    subnet_ids = [
      var.subnet_1a,
      var.subnet_1b
    ]
    endpoint_private_access = true
    endpoint_public_access  = true
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster_cluster,
    aws_iam_role_policy_attachment.eks_cluster_service
  ]

  tags = {
    "karpenter.sh/discovery" = var.cluster_name
  }

}

resource "aws_eks_node_group" "eks_node_group" {
  cluster_name    = var.cluster_name
  node_group_name = format("%s-node-group", var.cluster_name)
  node_role_arn   = aws_iam_role.eks_node_role.arn

  subnet_ids = [
    var.subnet_2a,
    var.subnet_2b
  ]

  scaling_config {
    desired_size = var.desired_size
    max_size     = var.max_size
    min_size     = var.min_size
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.eks_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.eks_AmazonEC2ContainerRegistryReadOnly
  ]

  tags = var.tags
}
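
For reference, EKS Pod Identity only works when the eks-pod-identity-agent addon is running on the cluster (the upstream Karpenter example enables it via the eks module's cluster_addons, if I'm reading it right). A minimal sketch of the equivalent for a standalone cluster like the one above (the resource label is illustrative):

resource "aws_eks_addon" "pod_identity" {
  # EKS Pod Identity Agent addon; required for pod identity associations to take effect
  cluster_name = aws_eks_cluster.eks_cluster.name
  addon_name   = "eks-pod-identity-agent"
}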

Then pass the cluster name to the karpenter module:


module "karpenter" {
  source = "terraform-aws-modules/eks/aws//modules/karpenter"

  cluster_name = var.cluster_name

  enable_v1_permissions = true

  enable_pod_identity             = true
  create_pod_identity_association = true

  enable_irsa = true

  # Attach additional IAM policies to the Karpenter node IAM role
  node_iam_role_additional_policies = {
    AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  }

  tags = {
    Environment = "dev"
    Terraform   = "true"
  }
}
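
With pod identity, the role the karpenter module creates only reaches pods whose namespace and service account match the pod identity association the module sets up (the module defaults look like kube-system / karpenter), so the Helm release has to use that exact service account. A rough sketch of how the chart is wired up in the upstream example, adapted to the cluster above (chart version and ECR public auth omitted; service_account and queue_name are the module outputs used in that example):

resource "helm_release" "karpenter" {
  namespace  = "kube-system"
  name       = "karpenter"
  repository = "oci://public.ecr.aws/karpenter"
  chart      = "karpenter"

  values = [
    <<-EOT
    serviceAccount:
      name: ${module.karpenter.service_account} # must match the pod identity association
    settings:
      clusterName: ${var.cluster_name}
      clusterEndpoint: ${aws_eks_cluster.eks_cluster.endpoint}
      interruptionQueue: ${module.karpenter.queue_name}
    EOT
  ]
}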

Expected behavior

Karpenter pod up and running, and the NodePools and EC2NodeClasses reporting ready.

Actual behavior

The Karpenter pod crashes with permission-related errors.

Terminal Output Screenshot(s)

Additional context

ockhamlabs commented 1 week ago

We are facing the same issue as well. It would be great to get this addressed.

bryantbiggs commented 1 week ago

this doesn't seem to be a question about this module, nor does it contain a reproduction. I would suggest looking at our Karpenter example to compare and contrast with what you are trying to create

sadath-12 commented 1 week ago

@bryantbiggs, the example works for me. Is there any key parameter that has to be passed so the Karpenter controller pod assumes the roles and policies in my case? I already tried comparing and can't find the difference.

bryantbiggs commented 1 week ago

I mean, there are a number of factors that need to be considered - the best bet is to compare against what we have provided, since it's a full working solution, all the way down to the Karpenter NodePool and NodeClass.

sadath-12 commented 1 week ago

@bryantbiggs the Terraform code is quite complex to me. I am able to run terraform-aws-eks/examples/karpenter/main.tf and it works as expected, and I used the same karpenter module configuration with my existing cluster's module. I did compare a few simple things such as providing tags for subnets and node groups, which I have done. Beyond that I am not sure what to look at. The only workaround for me right now is to attach an admin policy to the node group, so I am assuming the issue is something related to pod identity.
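
If pod identity is indeed the problem, I guess what has to line up is the association the module creates: same cluster, same namespace and service account as the chart, pointing at the controller role. In other words, something roughly equivalent to this (my sketch; field names are from the aws_eks_pod_identity_association resource, and the controller role output name is my assumption):

resource "aws_eks_pod_identity_association" "karpenter" {
  cluster_name    = var.cluster_name
  namespace       = "kube-system" # must match the namespace the chart is installed into
  service_account = "karpenter"   # must match serviceAccount.name in the Helm values
  role_arn        = module.karpenter.iam_role_arn
}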

bryantbiggs commented 1 week ago

We don't provide guidance on custom implementations; we can only provide support and guidance on what we provide. I'm not sure why you are creating your own custom implementation if it's too complex for you all to handle, but perhaps just using the EKS module that we provide would suit you better, since it does work (as you have pointed out).

sadath-12 commented 1 week ago

Understood @bryantbiggs. I will inspect the implementation you have - any pointers on what specific things to look at for the case where the pod is not getting permissions?