Open jgoldschrafe opened 1 year ago
@jgoldschrafe - I took a look at fixing this. The good news is I can reproduce the problem in a test case so it doesn't actually matter if the IRSA is added behind Terraform's back or not.
The bad news is that I haven't yet figured out a way to remove the IRSA from the add-on. It looks like you can't even do it from the AWS console: once I add an IRSA role, the "Edit" screen for the add-on lets me change the role to another IAM-provided one, but I can't ask it to revert to a non-IAM role.
If you've found a way to do this in the console or via the aws or eksctl CLI tools then I might be able to map that to the underlying API calls we need to make. But I suspect you might have to take this issue up with AWS directly to see if this is, or can be, supported. Raising it at https://github.com/aws/containers-roadmap/issues might be a good first step; I had a quick search through the existing issues, both open and closed, and couldn't immediately see anything relevant.
This happens for us even when using service_account_role_arn. The first time we add or change service_account_role_arn on the resource, it applies the annotation. If you then run it again, it removes the ARN annotation from the service account.
Having this issue with coredns. We mistakenly added a service_account_role_arn to our coredns add-on resource and now we cannot remove it.
Noticed that if you use TF to assign a role that is not really a valid IRSA role to service_account_role_arn, it becomes possible to re-save the addon in the console, and it will then show IRSA role: Not Set. This came up during the migration to EKS Pod Identity, where the role names were reused.
Still can't un-set the service_account_role_arn as it explodes with:
╷
│ Error: updating EKS Add-On (reliability-dev:aws-ebs-csi-driver): operation error EKS: UpdateAddon, https response error StatusCode: 403, RequestID: 5bfc50dc-8c8d-47f2-818d-77ac95da2045, api error AccessDeniedException: Cross-account pass role is not allowed.
│
│ with module.terraform_aws_eks_module.module.eks.aws_eks_addon.this["aws-ebs-csi-driver"],
│ on .terraform/modules/terraform_aws_eks_module.eks/main.tf line 484, in resource "aws_eks_addon" "this":
│ 484: resource "aws_eks_addon" "this" {
│
╵
╷
│ Error: updating EKS Add-On (reliability-dev:vpc-cni): operation error EKS: UpdateAddon, https response error StatusCode: 403, RequestID: 086571c4-7d94-4dcd-8f1a-5e942863807f, api error AccessDeniedException: Cross-account pass role is not allowed.
│
│ with module.terraform_aws_eks_module.module.eks.aws_eks_addon.before_compute["vpc-cni"],
│ on .terraform/modules/terraform_aws_eks_module.eks/main.tf line 513, in resource "aws_eks_addon" "before_compute":
│ 513: resource "aws_eks_addon" "before_compute" {
But after the re-save the error goes away, so the pass role error is bogus.
Some more information for @mattburgess
The conclusion seems to be: do not send serviceAccountRoleArn at all, not even null, if it is to be unset.
Here is the response from AWS support:
Firstly, I would like to mention that I have already reviewed the API Calls performed for the UpdateAddon action, and found several events that in fact show the "Cross-account pass role is not allowed." error [1].
The previous events seem to be sending the following request parameters:
"requestParameters": {
"resolveConflicts": "OVERWRITE",
"addonName": "aws-ebs-csi-driver",
"clientRequestToken": "terraform-20241015135717535200000002",
"name": "reliability-dev",
"serviceAccountRoleArn": ""
},
From the previous parameters, we can observe that the "serviceAccountRoleArn" field has an empty value associated with it. This seems to be the most likely cause for this problem, as per our documentation, this parameter has a Length Constraint of a Minimum length of 1 and Maximum length of 255, so by sending an empty parameter, this api call would not be performed.
Additionally, I have tried to replicate this behaviour by performing the following aws cli commands on my environment, from which we can observe the following information:
When running the command without specifying a role, we get an error as follows:
Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver --service-account-role-arn
Output: aws: error: argument --service-account-role-arn: expected one argument
But when running the command without the --service-account-role-arn parameter, the call is successful:
Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver
Output: {
"update": {
"id": "ae7791ff-7249-3e2f-91cd-2d77841837ab",
"status": "InProgress",
"type": "AddonUpdate",
"params": [],
"createdAt": 1729003738.502,
"errors": []
}
}
Lastly, if a role that does not exist is specified, I get the same error as the one you shared initially:
Command: $ aws eks update-addon --cluster-name my-cluster --addon-name aws-ebs-csi-driver --service-account-role-arn rolethatdoesnotexist
Output: An error occurred (AccessDeniedException) when calling the UpdateAddon operation: Cross-account pass role is not allowed
Given the above, my recommendation would be to remove the "service_account_role_arn" parameter [3] from your Terraform script if you want to remove the IAM role when updating the addon. Otherwise, you would need to explicitly specify an IAM role that exists in your account for this parameter, since no role is currently being sent, as shown in the API calls in link [1].
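To illustrate the conclusion above (leave serviceAccountRoleArn out of the request entirely rather than sending an empty string), here is a minimal Go sketch. It is not the provider's actual code and does not use the AWS SDK: the `updateAddonRequest` struct and the `newUpdateAddonRequest` and `encode` helpers are hypothetical stand-ins that only mirror the request parameters quoted by AWS support. The 255-character check reflects the length constraint mentioned in the support reply.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// updateAddonRequest is a hypothetical stand-in for the UpdateAddon request
// parameters quoted above. It is NOT the AWS SDK type.
type updateAddonRequest struct {
	AddonName        string `json:"addonName"`
	ClusterName      string `json:"name"`
	ResolveConflicts string `json:"resolveConflicts,omitempty"`
	// A nil pointer combined with omitempty drops the key from the JSON body,
	// so an unset role is never serialized as "".
	ServiceAccountRoleArn *string `json:"serviceAccountRoleArn,omitempty"`
}

// newUpdateAddonRequest (hypothetical helper) sets the role ARN only when one
// is configured. An empty string means "unset" and leaves the field nil.
func newUpdateAddonRequest(cluster, addon, roleArn string) (updateAddonRequest, error) {
	req := updateAddonRequest{
		AddonName:        addon,
		ClusterName:      cluster,
		ResolveConflicts: "OVERWRITE",
	}
	if roleArn == "" {
		return req, nil // omit the field instead of sending ""
	}
	if len(roleArn) > 255 {
		return req, fmt.Errorf("role ARN is %d characters, maximum is 255", len(roleArn))
	}
	req.ServiceAccountRoleArn = &roleArn
	return req, nil
}

// encode serializes the request so the presence or absence of the key is visible.
func encode(req updateAddonRequest) string {
	b, err := json.Marshal(req)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	unset, _ := newUpdateAddonRequest("reliability-dev", "aws-ebs-csi-driver", "")
	fmt.Println(encode(unset))

	set, _ := newUpdateAddonRequest("reliability-dev", "aws-ebs-csi-driver",
		"arn:aws:iam::111122223333:role/MyVPCCNIRole")
	fmt.Println(encode(set))
}
```

With an empty role ARN the serialized body contains no serviceAccountRoleArn key at all, which matches the CLI call that AWS support reported as successful. Whether omitting the field actually detaches an existing IRSA role on the EKS side is not confirmed anywhere in this thread.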
Terraform Core Version
1.3.8
AWS Provider Version
4.61.0
Affected Resource(s)
Expected Behavior
The EKS addon should be reconfigured to use the node's credentials instead of the previously-configured IRSA role ARN.
Actual Behavior
The provider emits a fatal validation error that appears to originate from the AWS SDK for Go.
Relevant Error/Panic Output Snippet
Terraform Configuration Files
Steps to Reproduce
aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni --addon-version v1.12.6-eksbuild.1 --service-account-role-arn arn:aws:iam::111122223333:role/MyVPCCNIRole --resolve-conflicts PRESERVE
Would you like to implement a fix?
None