terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws
Apache License 2.0

Karpenter controller error: message:ec2 api connectivity check failed,error:NoCredentialProviders: no valid providers in chain. #3085

Closed · asark67 closed this 1 week ago

asark67 commented 2 months ago

Description

When creating a cluster with Karpenter, the controller reports the error above. The VPC/subnets are created separately and we are using private subnets only.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which hopefully is the best practice you are following): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

Reproduction Code [Required]

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.14.0"

  cluster_name    = "test-dev-cluster"
  cluster_version = "1.30"

  # Gives Terraform identity admin access to cluster which will
  # allow deploying resources (Karpenter) into the cluster
  enable_cluster_creator_admin_permissions = true
  cluster_endpoint_public_access           = true

  # Disable OIDC provider creation
  enable_irsa = false

  iam_role_permissions_boundary = var.permissions_boundary_arn

  cluster_addons = {
    coredns                = {}
    eks-pod-identity-agent = {}
    kube-proxy             = {}
    vpc-cni                = {}
  }

  vpc_id                   = data.aws_vpc.selected.id
  control_plane_subnet_ids = data.aws_subnets.app.ids

  cluster_additional_security_group_ids = data.aws_security_groups.cluster_tiers.ids

  eks_managed_node_groups = {
    karpenter = {
      ami_type       = "AL2023_x86_64_STANDARD"
      instance_types = ["t2.large", "t3.large", "m4.large", "m5.large", "m6i.large"]
      min_size       = 1
      max_size       = 5
      desired_size   = 1
      taints         = {
        # This Taint aims to keep just EKS Addons and Karpenter running on this MNG
        # The pods that do not tolerate this taint should run on nodes created by Karpenter
        addons = {
          key    = "CriticalAddonsOnly"
          value  = "true"
          effect = "NO_SCHEDULE"
        }
      }

      subnet_ids             = data.aws_subnets.app.ids
      vpc_security_group_ids = data.aws_security_groups.app_tier.ids
    }
  }

  iam_role_additional_policies = {
    AmazonSSMManagedInstanceCore             = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
    AmazonSSMManagedEC2InstanceDefaultPolicy = "arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy"
  }

  tags = merge(var.tags, {
    # NOTE - if creating multiple security groups with this module, only tag the
    # security group that Karpenter should utilize with the following tag
    # (i.e. - at most, only one security group should have this tag in your account)
    "karpenter.sh/discovery" = "test-dev-cluster"
  })
}

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "20.14.0"

  cluster_name = "test-dev-cluster"

  enable_pod_identity             = true
  create_pod_identity_association = true

  # Used to attach additional IAM policies to the Karpenter node IAM role
  node_iam_role_additional_policies = {
    AmazonSSMManagedInstanceCore             = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
    AmazonSSMManagedEC2InstanceDefaultPolicy = "arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy"
  }

  tags = var.tags
}
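
For context, the Helm release that actually installs the Karpenter chart is not shown in the issue. A minimal sketch of how it is typically wired up to these two modules, roughly following the pattern in this repository's Karpenter example, could look like the snippet below; the chart version, OCI repository, and the wiring through module.eks.cluster_endpoint and module.karpenter.queue_name are assumptions based on that example, not taken from the reporter's code.

resource "helm_release" "karpenter" {
  namespace  = "kube-system"
  name       = "karpenter"
  repository = "oci://public.ecr.aws/karpenter"
  chart      = "karpenter"
  version    = "0.37.0"

  values = [
    <<-EOT
    settings:
      # Cluster the controller should manage
      clusterName: test-dev-cluster
      clusterEndpoint: ${module.eks.cluster_endpoint}
      # SQS queue created by the karpenter module for interruption handling
      interruptionQueue: ${module.karpenter.queue_name}
    EOT
  ]
}

With pod identity enabled, no IRSA annotation is needed on the service account; the pod identity association created by the karpenter module is what grants the controller its AWS credentials.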

Steps to reproduce the behaviour:

Points to note:

  1. The VPC/subnets are created separately and tagged; there is no public subnet. The tags used are "kubernetes.io/role/internal-elb" = 1 and "karpenter.sh/discovery" = "test-dev-cluster" (see the tagging sketch after this list).
  2. enable_irsa is set to false because we have no permission to create an OIDC provider.
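
For illustration, a minimal sketch of how such a private subnet might be tagged in Terraform so that internal load balancers and Karpenter's subnet discovery can find it; the aws_subnet resource, VPC reference, and CIDR below are hypothetical placeholders, and only the tag keys and values come from this issue.

resource "aws_subnet" "app" {
  # Hypothetical private subnet; the VPC reference and CIDR are placeholders
  vpc_id     = data.aws_vpc.selected.id
  cidr_block = "10.0.1.0/24"

  tags = {
    # Lets EKS place internal load balancers in this subnet
    "kubernetes.io/role/internal-elb" = 1

    # Matched by Karpenter's EC2NodeClass subnetSelectorTerms
    "karpenter.sh/discovery" = "test-dev-cluster"
  }
}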

Expected behavior

The EKS cluster and the Karpenter controller should come up cleanly.

Actual behavior

The EKS cluster comes up cleanly. The Karpenter controller fails with the following error:

{"level":"ERROR","time":"2024-06-30T11:45:00.909Z","logger":"controller","message":"ec2 api connectivity check failed","commit":"490ef94","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}

Terminal Output Screenshot(s)

Additional context

Running pods:

NAMESPACE     NAME                           READY   STATUS    RESTARTS         AGE
kube-system   aws-node-s5qnr                 2/2     Running   0                26h
kube-system   coredns-c7bbdfbb8-tsf8s        1/1     Running   0                27h
kube-system   coredns-c7bbdfbb8-xnmmz        1/1     Running   0                27h
kube-system   eks-pod-identity-agent-z5src   1/1     Running   0                26h
kube-system   karpenter-5588887cb-428tz      0/1     Running   354 (107s ago)   27h
kube-system   karpenter-5588887cb-g4dml      0/1     Pending   0                27h
kube-system   kube-proxy-zn7wt               1/1     Running   0                26h

ServiceAccount exists:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: karpenter
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-06-30T08:54:00Z"
  labels:
    app.kubernetes.io/instance: karpenter
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: karpenter
    app.kubernetes.io/version: 0.37.0
    helm.sh/chart: karpenter-0.37.0
  name: karpenter
  namespace: kube-system
  resourceVersion: "1065"
  uid: f0fd9105-319b-4608-8d56-5c4efb11808f

jvidalg commented 1 month ago

I had the same issue; I was able to fix it by enabling the eks-pod-identity-agent addon in the EKS cluster:

eks_addons = {
  coredns = {
    most_recent = true
  }
  vpc-cni = {
    most_recent = true
  }
  kube-proxy = {
    most_recent = true
  }
  eks-pod-identity-agent = {
    most_recent = true
  }
}

Although I see that the configuration in this thread already has it, it may be worth checking whether the pod identity association is using the correct role. You can list and inspect each one:

aws eks list-pod-identity-associations --region <region> --cluster-name <cluster-name>
aws eks describe-pod-identity-association --cluster-name <cluster-name> --association-id <associationid>
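
If the association is missing or bound to the wrong role, it can also be created explicitly rather than relying on create_pod_identity_association = true in the karpenter module. Here is a minimal sketch using the AWS provider's aws_eks_pod_identity_association resource; the namespace and service account names below assume the chart defaults shown earlier in this issue (kube-system/karpenter).

resource "aws_eks_pod_identity_association" "karpenter" {
  # Binds the Karpenter controller's IAM role to its service account
  cluster_name    = module.eks.cluster_name
  namespace       = "kube-system"
  service_account = "karpenter"
  role_arn        = module.karpenter.iam_role_arn
}

Note that the eks-pod-identity-agent must be running on the node that hosts the Karpenter pod, and the pod must be (re)started after the association exists for the credentials to be injected; otherwise the SDK falls through its credential chain and reports NoCredentialProviders.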

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove the stale label or comment, or this issue will be closed in 10 days.

github-actions[bot] commented 1 week ago

This issue was automatically closed because it had been stale for 10 days.