hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: aws_eks_addon cannot remove value for service_account_role_arn #30645

Open jgoldschrafe opened 1 year ago

jgoldschrafe commented 1 year ago

Terraform Core Version

1.3.8

AWS Provider Version

4.61.0

Affected Resource(s)

aws_eks_addon

Expected Behavior

The EKS addon should be reconfigured to use the node's credentials instead of the previously-configured IRSA role ARN.

Actual Behavior

The provider emits a fatal validation error that appears to originate from the AWS SDK for Go.

Relevant Error/Panic Output Snippet

╷
│ Error: error updating EKS Add-On (my-cluster:vpc-cni): InvalidParameter: 1 validation error(s) found.
│ - minimum field size of 1, UpdateAddonInput.ServiceAccountRoleArn.
│ 
│ 
│   with module.eks.module.eks.aws_eks_addon.this["vpc-cni"],
│   on .terraform/modules/eks.eks/main.tf line 382, in resource "aws_eks_addon" "this":
│  382: resource "aws_eks_addon" "this" {
│ 
╵
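
For readers unfamiliar with this kind of message: it is consistent with a client-side check that rejects a field which is present but empty, before any request reaches EKS. The Python sketch below only illustrates that idea. `validate_update_addon_input` is a hypothetical helper, not the AWS SDK for Go's actual code, and the only rule it models is the "minimum field size of 1" constraint quoted in the error above.

```python
from typing import List, Optional


def validate_update_addon_input(service_account_role_arn: Optional[str]) -> List[str]:
    """Return validation errors for a hypothetical UpdateAddon input.

    Models only the rule quoted in the error output: a field that is
    present must be at least 1 character long. None means "field omitted".
    """
    errors: List[str] = []
    if service_account_role_arn is not None and len(service_account_role_arn) < 1:
        errors.append(
            "minimum field size of 1, UpdateAddonInput.ServiceAccountRoleArn."
        )
    return errors


# An omitted field passes the check; an empty string does not.
print(validate_update_addon_input(None))
print(validate_update_addon_input(""))
```

Under this model, sending an empty string fails while leaving the field out passes, which matches the symptom reported here.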

Terraform Configuration Files

resource "aws_eks_addon" "vpc_cni" {
  cluster_name      = aws_eks_cluster.cluster.name
  addon_name        = "vpc-cni"
  addon_version     = "v1.12.6-eksbuild.1"
  resolve_conflicts = "PRESERVE"
}

Steps to Reproduce

  1. Apply the above Terraform against an EKS cluster.
  2. Assign a service account to the addon configuration using something like the following:
     aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni --addon-version v1.12.6-eksbuild.1 --service-account-role-arn arn:aws:iam::111122223333:role/MyVPCCNIRole --resolve-conflicts PRESERVE
  3. Attempt to apply the above Terraform again to remove the service account configuration.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

mattburgess commented 1 year ago

@jgoldschrafe - I took a look at fixing this. The good news is I can reproduce the problem in a test case so it doesn't actually matter if the IRSA is added behind Terraform's back or not.

The bad news is that I haven't yet figured out a way to remove the IRSA from the add-on. It looks like you can't even do it from the AWS console; once I add an IRSA, the "Edit" screen for the addon lets me change the role to another IAM-provided one, but I can't ask it to revert to a non-IAM role.

If you've found a way to do this in the console or via the aws or eksctl CLI tools then I might be able to map that to the underlying API calls we need to make. But I suspect you might have to take this issue up with AWS directly to see if this is, or can be, supported. Raising it at https://github.com/aws/containers-roadmap/issues might be a good first step; I had a quick search through the existing issues, both open and closed, and couldn't immediately see anything relevant.

scott2449 commented 1 year ago

This happens for us even when using service_account_role_arn. The first time we add or change service_account_role_arn on the resource, it applies the annotation. If you then run it again, it removes the ARN annotation from the service account.

venkatamutyala commented 6 months ago

Having this issue with coredns. We mistakenly added a service_account_role_arn to our coredns add-on resource and now we cannot remove it.

dfroberg commented 3 weeks ago

Noticed that if you assign a role that is not really a valid IRSA role to service_account_role_arn using TF, it's now possible to re-save the addon in the console, and it will then show IRSA role: Not Set. This came up during the migration to EKS Pod Identity, where the role names were reused.

Still can't un-set the service_account_role_arn as it explodes with:

╷
│ Error: updating EKS Add-On (reliability-dev:aws-ebs-csi-driver): operation error EKS: UpdateAddon, https response error StatusCode: 403, RequestID: 5bfc50dc-8c8d-47f2-818d-77ac95da2045, api error AccessDeniedException: Cross-account pass role is not allowed.
│ 
│   with module.terraform_aws_eks_module.module.eks.aws_eks_addon.this["aws-ebs-csi-driver"],
│   on .terraform/modules/terraform_aws_eks_module.eks/main.tf line 484, in resource "aws_eks_addon" "this":
│  484: resource "aws_eks_addon" "this" {
│ 
╵
╷
│ Error: updating EKS Add-On (reliability-dev:vpc-cni): operation error EKS: UpdateAddon, https response error StatusCode: 403, RequestID: 086571c4-7d94-4dcd-8f1a-5e942863807f, api error AccessDeniedException: Cross-account pass role is not allowed.
│ 
│   with module.terraform_aws_eks_module.module.eks.aws_eks_addon.before_compute["vpc-cni"],
│   on .terraform/modules/terraform_aws_eks_module.eks/main.tf line 513, in resource "aws_eks_addon" "before_compute":
│  513: resource "aws_eks_addon" "before_compute" {

But after the re-save the error goes away, so the pass role error is bogus.

dfroberg commented 3 weeks ago

Some more information for @mattburgess

The conclusion seems to be: do not send serviceAccountRoleArn at all, not even as null, if it is to be unset.

Here is the response from AWS support:

Firstly, I would like to mention that I have already reviewed the API Calls performed for the UpdateAddon action, and found several events that in fact show the "Cross-account pass role is not allowed." error [1].

The previous events seem to be sending the following request parameters:

    "requestParameters": {
        "resolveConflicts": "OVERWRITE",
        "addonName": "aws-ebs-csi-driver",
        "clientRequestToken": "terraform-20241015135717535200000002",
        "name": "reliability-dev",
        "serviceAccountRoleArn": ""
    },

From the previous parameters, we can observe that the "serviceAccountRoleArn" field has an empty value associated with it. This seems to be the most likely cause of the problem: per our documentation, this parameter has a length constraint of minimum 1 and maximum 255, so when an empty value is sent, the API call is not performed.

Additionally, I have tried to replicate this behaviour by performing the following aws cli commands on my environment, from which we can observe the following information:

  1. When running the command without specifying a role, we get an error as follows:
     Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver --service-account-role-arn
     Output: aws: error: argument --service-account-role-arn: expected one argument

  2. But when running the command without the --service-account-role-arn parameter, the call is successful:
     Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver
     Output:
     {
         "update": {
             "id": "ae7791ff-7249-3e2f-91cd-2d77841837ab",
             "status": "InProgress",
             "type": "AddonUpdate",
             "params": [],
             "createdAt": 1729003738.502,
             "errors": []
         }
     }

  3. Lastly, if a role that does not exist is specified, I get the same error as the one you shared initially:
     Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver --service-account-role-arn rolethatdoesnotexist
     Output: An error occurred (AccessDeniedException) when calling the UpdateAddon operation: Cross-account pass role is not allowed

Given the above, my recommendation would be to remove the "service_account_role_arn" parameter [3] from your Terraform script if you want to remove the IAM role when updating the addon. Otherwise, you would need to explicitly set this parameter to an IAM role that exists in your account, since no role is being sent at the moment, as shown in the API calls in link [1].
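
One way to read the support response is that the request should be built so that an unset role ARN is left out entirely rather than sent as an empty string. Here is a minimal Python sketch of that idea. `build_update_addon_params` is a hypothetical helper written for illustration. The key names are assumed to be the ones boto3's EKS `update_addon` call accepts, so the cluster key is `clusterName` rather than the `name` seen in the logged request parameters. Whether omitting the field actually clears a role that is already attached is not confirmed in this thread beyond the support engineer's CLI test.

```python
from typing import Dict, Optional


def build_update_addon_params(
    cluster_name: str,
    addon_name: str,
    service_account_role_arn: Optional[str] = None,
    resolve_conflicts: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for an UpdateAddon call (hypothetical helper)."""
    params = {"clusterName": cluster_name, "addonName": addon_name}
    if resolve_conflicts:
        params["resolveConflicts"] = resolve_conflicts
    # Treat None and "" the same way: leave the key out instead of
    # sending an empty value that violates the minimum length of 1.
    if service_account_role_arn:
        params["serviceAccountRoleArn"] = service_account_role_arn
    return params


# Same inputs as the logged request, but the empty ARN is dropped.
params = build_update_addon_params(
    "reliability-dev",
    "aws-ebs-csi-driver",
    service_account_role_arn="",
    resolve_conflicts="OVERWRITE",
)
print(params)
```

With boto3 installed and credentials configured, the result could presumably be passed along as `boto3.client("eks").update_addon(**params)`; that call is not exercised here.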