aws-samples / karpenter-blueprints

Karpenter Blueprints is a list of common workload scenarios following best practices. You'll find here details of why configuring the Karpenter and Kubernetes objects in such a way is important when using Karpenter on EKS.
MIT No Attribution

Can't create EKS cluster - InvalidParameterException #20

Open c13 opened 1 week ago

c13 commented 1 week ago

I'm trying to create a cluster from the cluster/terraform folder, but running "terraform apply" gives me this error:

module.eks.aws_eks_cluster.this[0]: Still creating... [1m40s elapsed]
module.eks.aws_eks_cluster.this[0]: Still creating... [1m50s elapsed]
Error: creating EKS Cluster (karpenter-blueprints): operation error EKS: CreateCluster, https response error StatusCode: 400, RequestID: xxx, InvalidParameterException: Role with arn: arn:aws:iam::xxx:role/karpenter-blueprints-cluster-20241020164557819100000001, could not be assumed because it does not exist or the trusted entity is not correct

Terraform v1.9.3 on darwin_arm64

aws-cli/2.18.10 Python/3.12.7 Darwin/23.6.0 source/arm64

I haven't changed any module version:

Kubernetes | 1.30
Karpenter | v1.0.1
Terraform | 1.9.3
AWS EKS | v20.23.0
EKS Blueprints Addons | v1.16.3

jakeskyaws commented 1 week ago

Hi @c13. I pulled the latest main branch and wasn’t able to replicate the issue you're facing.

It looks like an issue related to the IAM role that the EKS cluster is trying to assume.

We are using the eks module to build the cluster. Is it possible that an earlier terraform apply failed before you ran terraform apply for the blueprint itself?

Could we check the following:

IAM Role Exists:

aws iam get-role --role-name karpenter-blueprints-cluster-20241020164557819100000001

Verify the Trust Relationship: The EKS service must be trusted by this IAM role. You can check the trust relationship in the IAM role's configuration. It should look something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
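As a rough illustration of the trust-relationship check above, here is a minimal Python sketch that inspects a trust policy document and reports whether some Allow statement lets eks.amazonaws.com call sts:AssumeRole. The function name `trusts_service` is hypothetical, and the check is deliberately simplified: it ignores `Condition` blocks and wildcard actions, so treat it as a quick sanity check rather than a full policy evaluation.

```python
import json


def trusts_service(policy, service="eks.amazonaws.com"):
    """Return True if an Allow statement lets `service` call sts:AssumeRole.

    Simplified on purpose: Condition blocks and wildcards are not evaluated.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal", {})
        # Principal can also be the string "*"; only the dict form names services.
        services = principal.get("Service", []) if isinstance(principal, dict) else []
        if isinstance(services, str):
            services = [services]
        if "sts:AssumeRole" in actions and service in services:
            return True
    return False


# The expected trust policy shown above.
policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "eks.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
""")

print(trusts_service(policy))  # True
```

The policy text could come from the `AssumeRolePolicyDocument` field in the output of the aws iam get-role command above.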

c13 commented 1 week ago

Hello Jake,

Thank you for your answer!

I have destroyed and recreated the cluster. The error now says: "Error: creating EKS Cluster (karpenter-blueprints): operation error EKS: CreateCluster, https response error StatusCode: 400, RequestID: xxx, InvalidParameterException: Role with arn: arn:aws:iam::xxx:role/karpenter-blueprints-cluster-20241022103201571200000002, could not be assumed because it does not exist or the trusted entity is not correct"

Role karpenter-blueprints-cluster-20241022103201571200000002 exists

There are two policies attached and a customer managed policy

The trust relationship exists:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EKSClusterAssumeRole",
            "Effect": "Allow",
            "Principal": {
                "Service": "eks.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

jakeskyaws commented 1 week ago

This all seems correct. Are there any useful entries in CloudTrail that could help us identify why the role can't be assumed?
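One way to look for such entries, sketched in Python under some assumptions: the CloudTrail events have already been exported as a list of dicts (for example, the `Records` array of a CloudTrail log file), and a failed assume-role attempt, if it was logged at all, appears as an `AssumeRole` event from `sts.amazonaws.com` carrying an `errorCode`. The function name `failed_assume_role_events` and the sample records are hypothetical and only illustrate the filtering.

```python
def failed_assume_role_events(records, role_name):
    """Return (time, errorCode, errorMessage) for failed AssumeRole calls on role_name."""
    hits = []
    for rec in records:
        if rec.get("eventSource") != "sts.amazonaws.com":
            continue
        if rec.get("eventName") != "AssumeRole":
            continue
        if "errorCode" not in rec:  # successful calls carry no errorCode
            continue
        # requestParameters may be missing or null on some records.
        role_arn = (rec.get("requestParameters") or {}).get("roleArn", "")
        if role_arn.endswith("/" + role_name):
            hits.append((rec.get("eventTime"), rec["errorCode"], rec.get("errorMessage")))
    return hits


# Made-up sample records, shaped like CloudTrail entries, for illustration only.
role = "karpenter-blueprints-cluster-20241022103201571200000002"
sample_records = [
    {
        "eventTime": "2024-10-22T10:33:00Z",
        "eventSource": "sts.amazonaws.com",
        "eventName": "AssumeRole",
        "errorCode": "AccessDenied",
        "errorMessage": "example message",
        "requestParameters": {"roleArn": "arn:aws:iam::xxx:role/" + role},
    },
    {
        "eventTime": "2024-10-22T10:32:01Z",
        "eventSource": "iam.amazonaws.com",
        "eventName": "CreateRole",
        "requestParameters": {"roleName": role},
    },
]

for hit in failed_assume_role_events(sample_records, role):
    print(hit)
```

If nothing matches, it may also be worth looking at the CreateRole and CreateCluster events around the same timestamps, since the time between them could matter.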