Closed YesYouKenSpace closed 5 years ago
Fixed. I have no idea how but changing the username helped.
Hi @kennethtxytqw, could you please share which change helped? Thanks.
I am experiencing a similar problem. I have added the atlantis task role arn to the EKS aws-auth configmap, but when the atlantis launched terraform task tries to operate on the EKS cluster, it fails:
Error: Unauthorized
on .terraform/modules/prometheus_operator/modules/prometheus-operator/main.tf line 36, in resource "kubernetes_namespace" "this":
36: resource "kubernetes_namespace" "this" {
And looking at the EKS authorization logs I see this:
time="2020-05-06T05:17:59Z" level=warning msg="access denied" client="127.0.0.1:55512" error="input token was not properly formatted: X-Amz-Date parameter is expired (15 minute expiration) 2020-05-06 01:09:00 +0000 UTC" method=POST path=/authenticate
It appears that atlantis, or terraform via atlantis, is trying to use a several-hour-old token to auth to EKS?
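As a rough check of that reading, the two timestamps in the log line above can be compared directly. This is only a sketch: the helper name `token_age` is made up for illustration, and the 15-minute limit is taken from the log message itself.

```python
from datetime import datetime, timedelta, timezone

# The 15-minute limit is quoted from the authenticator log message above.
TOKEN_LIFETIME = timedelta(minutes=15)


def token_age(signed_at: str, seen_at: str) -> timedelta:
    """Return how old a token was when the authenticator saw it.

    Both arguments are UTC timestamps in "YYYY-MM-DD HH:MM:SS" form.
    (Hypothetical helper, not part of atlantis or terraform.)
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    signed = datetime.strptime(signed_at, fmt).replace(tzinfo=timezone.utc)
    seen = datetime.strptime(seen_at, fmt).replace(tzinfo=timezone.utc)
    return seen - signed


# X-Amz-Date from the error message, then the time of the log entry.
age = token_age("2020-05-06 01:09:00", "2020-05-06 05:17:59")
print(age)                   # 4:08:59
print(age > TOKEN_LIFETIME)  # True
```

So the token in the log was a little over four hours old when it was presented, far past the 15-minute window.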
@llamahunter did you find a solution for this?
Well, not really. The problem seems to be that the terraform plan caches the EKS auth token, so that when you go to apply it later, the token has expired. We have to re-plan right before apply, and even then, with complex terraform it's possible to hit EKS timeouts midway through the apply. We then need to re-plan and re-apply to finish applying the terraform. See https://github.com/terraform-providers/terraform-provider-aws/issues/13189 and https://github.com/hashicorp/terraform/issues/24886
I think @llamahunter is right. We (team at my workplace) have an internal rule that states if
Always re-plan and apply.
@kennethtxytqw, so performing a plan does recreate the token if the saved one has expired?
In my experience, yes. However, you can still run into problems if you have a LONG running operation and the token expires in the middle of it. You will need to re-plan and re-apply to pick up from where you left off.
Hello,
as a workaround, we are using an extra plan step inside the apply command:
workflows:
  myworkflow:
    plan:
      steps:
      - init
      - plan
    apply:
      steps:
      # We have an extra plan here because the aws_eks_cluster_auth.token expires within 15min
      # https://github.com/runatlantis/atlantis/issues/800
      - plan
      - apply
This is a bit suboptimal because there might be some unintended/unapproved plan-changes sneaking in. Still need to see if this causes problems in practice.
@flixx couldn't you use a data source to retrieve that information so when you apply the terraform it creates a new token with a new expiration? or is that not correct?
edit: nvm, I see the relevant issue https://github.com/hashicorp/terraform/issues/24886
For now, until that issue is resolved, perhaps you could check the time of when the plan file is generated, if it's been more than X minutes, then run the plan+apply step. If it's less than X minutes, then run only the apply step.
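A minimal sketch of that check, assuming the plan file's modification time is a good enough proxy for when the plan ran. The function names, the plan file path, and the 10-minute default (picked to stay under the 15-minute token lifetime mentioned earlier in the thread) are all assumptions, not anything atlantis provides:

```python
import os
import time
from typing import List, Optional


def plan_is_stale(planfile: str, max_age_seconds: float,
                  now: Optional[float] = None) -> bool:
    """Return True if the plan file is missing or older than max_age_seconds."""
    if now is None:
        now = time.time()
    try:
        modified = os.path.getmtime(planfile)
    except OSError:
        # No readable plan file: treat it as stale so a fresh plan is made.
        return True
    return (now - modified) > max_age_seconds


def steps_to_run(planfile: str, max_age_seconds: float = 600,
                 now: Optional[float] = None) -> List[str]:
    """Pick plan+apply for an old plan, or apply alone for a recent one."""
    if plan_is_stale(planfile, max_age_seconds, now):
        return ["plan", "apply"]
    return ["apply"]


if __name__ == "__main__":
    # "default.tfplan" is a placeholder path for illustration only.
    print(steps_to_run("default.tfplan"))
```

This only decides which steps to run. It does not help with the case described above where the token expires in the middle of a long apply.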
I noticed in the upstream issue that the kubernetes provider doesn't use an exec block:
provider "kubernetes" {
  host                   = data.aws_eks_cluster.example.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.example.certificate_authority.0.data)
  token                  = null
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args = [
      "eks", "get-token", "--cluster-name", local.eks_cluster_id
    ]
  }
}
Excerpt taken from https://github.com/cloudposse/terraform-aws-components/blob/master/modules/eks/efs-controller/provider-helm.tf
@flixx have you tried this method?
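To see what the exec block above would hand to the provider, here is a rough sketch that runs the same command and reads the result. It assumes `aws eks get-token` prints ExecCredential JSON with `status.token` and `status.expirationTimestamp` (in `YYYY-MM-DDTHH:MM:SSZ` form); the function names are hypothetical.

```python
import json
import subprocess
from datetime import datetime, timezone
from typing import Tuple


def parse_exec_credential(raw: str) -> Tuple[str, datetime]:
    """Extract (token, expiry) from ExecCredential JSON text.

    Assumes the status block carries "token" and "expirationTimestamp".
    """
    status = json.loads(raw)["status"]
    expiry = datetime.strptime(
        status["expirationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return status["token"], expiry


def fetch_token(cluster_name: str) -> Tuple[str, datetime]:
    """Run the same command as the exec block (needs the aws CLI and credentials)."""
    result = subprocess.run(
        ["aws", "eks", "get-token", "--cluster-name", cluster_name],
        check=True, capture_output=True, text=True,
    )
    return parse_exec_credential(result.stdout)
```

Because the command runs each time the provider needs credentials, the token is minted at apply time instead of being read back from a saved plan.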
@nitrocode Yes, this might work as well - however it would require us to build a custom atlantis docker image with the aws-cli binary included. Something I'd like to avoid at the moment.
You're correct. However, we highly encourage users to customize the container.
Here's mine for reference. It contains awscli v2 and a number of other tools.
https://github.com/nitrocode/atlantis-terraform-module/blob/main/Dockerfile
I am trying to get atlantis to manage our EKS cluster. Following the instructions here https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
I added the following code to the configmap under mapRoles
I still get this error
Does anyone know of any solution?