aws-samples / cdk-eks-karpenter

CDK construct for installing and configuring Karpenter on EKS clusters
Apache License 2.0

Missing permission to create tags on spot-instances-request resources #150

Closed jonathanbeber closed 6 months ago

jonathanbeber commented 6 months ago

The project is missing the ec2:CreateTags permission here. Without it, Karpenter v0.33.0 fails with the following error when it tries to launch spot instances:

{
    "level": "ERROR",
    "time": "2023-12-18T21:32:27.781Z",
    "logger": "controller",
    "message": "Reconciler error",
    "commit": "2dd7fdc",
    "controller": "nodeclaim.lifecycle",
    "controllerGroup": "karpenter.sh",
    "controllerKind": "NodeClaim",
    "NodeClaim": {
        "name": "nodepool-bfmb6"
    },
    "namespace": "",
    "name": "nodepool-bfmb6",
    "reconcileID": "e9ccbe9d-9393-4c8d-8734-b8bbd1bd398f",
    "error": "launching nodeclaim, creating instance, with fleet error(s), UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws:sts::AWS_ACCOUNT_ID:assumed-role/karpenterRole92C02710-LdrlCLZ4hvsL is not authorized to perform: ec2:CreateTags on resource: arn:aws:ec2:us-west-2:AWS_ACCOUNT_ID:spot-instances-request/* because no identity-based policy allows the ec2:CreateTags action."
}

I was able to work around it by manually adding ec2:CreateTags at line 481.
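As a rough illustration of the workaround described above, the missing permission can be written as a plain-JSON IAM statement. This is a minimal sketch, not the project's actual fix: the `IamStatement` type, the `spotRequestTaggingStatement` helper, and the `Sid` value are hypothetical names, and the statement carries no conditions, whereas the thread suggests the upstream policy uses conditional actions.

```typescript
// Hypothetical shape of a plain-JSON IAM statement (not a CDK class).
interface IamStatement {
  Sid?: string;
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

// Build a statement that allows ec2:CreateTags on Spot Instance requests,
// the resource type named in the UnauthorizedOperation error.
// Assumption: no Condition block; a real policy would likely scope this further.
function spotRequestTaggingStatement(partition: string, region: string): IamStatement {
  return {
    Sid: "AllowSpotRequestTagging", // hypothetical Sid, chosen for illustration
    Effect: "Allow",
    Action: ["ec2:CreateTags"],
    Resource: [`arn:${partition}:ec2:${region}:*:spot-instances-request/*`],
  };
}

const statement = spotRequestTaggingStatement("aws", "us-west-2");
console.log(JSON.stringify(statement, null, 2));
```

In a CDK app, a JSON statement like this could presumably be converted with `iam.PolicyStatement.fromJson(statement)` from `aws-cdk-lib/aws-iam` and attached to the Karpenter controller role. How the construct exposes that role is not shown in this thread.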

This seems similar to https://github.com/aws/karpenter-provider-aws/issues/5270, and I don't fully understand what the solution was there, since it seems like @jls-appfire had to patch the role manually.

andskli commented 6 months ago

@jonathanbeber thanks for raising this. As in the linked issue, I based the IAM policy on the CloudFormation example from the Karpenter docs. Looking at https://github.com/aws/karpenter-provider-aws/pull/5290, it seems we need to add a few conditional actions. I will try to get this sorted shortly.

jls-appfire commented 6 months ago

@jonathanbeber in the issue I raised, I found that there are two copies of the IAM policy for version 0.33.0. One is a standalone file that is part of the v1beta update process.

However, the policy is also provided in the CloudFormation code. When I compared the standalone 0.33.0 policy with the policy in the 0.33.0 version of the CloudFormation code, they were not the same.
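A comparison like this can be automated by diffing the actions that each policy document grants. The sketch below is illustrative only: `PolicyDocument`, `collectActions`, and `missingActions` are hypothetical names, and the two sample documents are toy data, not the real contents of either Karpenter policy.

```typescript
// Hypothetical minimal shapes for an IAM policy document.
interface PolicyStatementJson {
  Action: string | string[];
}
interface PolicyDocument {
  Statement: PolicyStatementJson[];
}

// Collect the distinct actions named anywhere in a policy document.
function collectActions(doc: PolicyDocument): string[] {
  const actions: string[] = [];
  for (const st of doc.Statement) {
    const list = Array.isArray(st.Action) ? st.Action : [st.Action];
    for (const action of list) {
      if (actions.indexOf(action) === -1) actions.push(action);
    }
  }
  return actions;
}

// Return the actions present in `reference` but absent from `candidate`, sorted.
function missingActions(reference: PolicyDocument, candidate: PolicyDocument): string[] {
  const have = collectActions(candidate);
  return collectActions(reference)
    .filter((action) => have.indexOf(action) === -1)
    .sort();
}

// Toy data for illustration; NOT the actual Karpenter policies.
const cloudFormationPolicy: PolicyDocument = {
  Statement: [
    { Action: ["ec2:RunInstances", "ec2:CreateFleet"] },
    { Action: "ec2:CreateTags" },
  ],
};
const standalonePolicy: PolicyDocument = {
  Statement: [{ Action: ["ec2:RunInstances", "ec2:CreateFleet"] }],
};

console.log(missingActions(cloudFormationPolicy, standalonePolicy));
```

This only compares action names. Differences in resources or conditions between the two copies would need a separate check.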

I don't use CloudFormation; I use Terraform. So the fix on my part was to take the policy from the CloudFormation code, convert it to Terraform, and drop the standalone file I had found.

In https://github.com/aws/karpenter-provider-aws/issues/5270 I tried to call out that having the policy in two places, with different content but the same version number, is detrimental. However, I'm not sure I was able to effect any change there.

Hopefully this helps.