Closed: paulalex closed this issue 4 years ago.
Yes, as per the changelog instructions you must either import the existing aws-auth
configmap into Terraform state, or delete the ConfigMap so it can be created from scratch. Only delete the configmap if you know for certain which IAM user or role created the EKS cluster, as that identity will need to be the one recreating it.
Unfortunately the kubernetes provider does not support force overwriting existing resources.
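For reference, the two options boil down to something like this (the module.eks address is an assumption and depends on what you named your module block):
# option 1: import the existing configmap into Terraform state
terraform import module.eks.kubernetes_config_map.aws_auth[0] kube-system/aws-auth
# option 2: delete it so the module can recreate it -- only if you are running
# Terraform as the same IAM user/role that created the cluster
kubectl delete configmap aws-auth -n kube-system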
I struggled to get the import command to work as the k8s provider seems to ignore environment variables. I ended up pointing the config at the generated kubeconfig file just for this operation:
provider "kubernetes" {
config_path = "kubeconfig_${var.cluster_name}"
}
@dpiddockcmp Yes I think this is where I am also struggling. I had a little more success when I set load_config_file = true and exported KUBECONFIG from the command line, but once I can actually upgrade my test cluster I need to get this to work from a Jenkins pipeline, and doing the above won't be an option.
I am going to try your suggestion above and see if this gives me any more success. I don't know if it's a coincidence, but I destroyed my cluster again and then ran the apply from fresh, and this time it worked without timing out. I don't know if that is a red herring.
Also, should this:
terraform import module.cluster1.kubernetes_config_map.aws_auth[0] kube-system/aws-auth
Actually be:
terraform import module.eks.kubernetes_config_map.aws_auth[0] kube-system/aws-auth
If you create the cluster from scratch, you do not need to import a previously created aws-auth configmap as it should not exist. The module should create it properly for you.
The import path depends on what you've called your module definition in your configuration. It might not be at the top level if you have the definition in a sub-module. There is no single correct answer here. Maybe the changelog could be better aligned with the examples, which usually use "eks"?
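One way to confirm the address is to run a plan and read the address Terraform reports for the configmap (a sketch; <your_module_name> is a placeholder):
terraform plan
# the plan output contains the full resource address, e.g.
#   # module.<your_module_name>.kubernetes_config_map.aws_auth[0] will be created
terraform import module.<your_module_name>.kubernetes_config_map.aws_auth[0] kube-system/aws-auth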
Not sure how you would pipeline this action. It only needs doing once per cluster for the pre-8 to 8 upgrade.
Not sure how you would pipeline this action. It only needs doing once per cluster for the pre-8 to 8 upgrade.
Maybe for this reason I could do it manually and then upgrade the cluster using the pipeline and the latest version of the module afterwards.
I rolled back to my original version and rebuilt my cluster as it is on prod right now. Next I added the kubernetes provider to my main.tf
and then updated the terraform state using the following command:
terraform import -var-file=../../tfvars/dev.tfvars module.eks.kubernetes_config_map.aws_auth[0] kube-system/aws-auth
This results in the following output (all looks good?):
Import successful!
The resources that were imported are shown above. These resources are now in
your Terraform state and will henceforth be managed by Terraform.
I then ran an apply with v8.0.0 and I still get the same error:
module.eks.aws_launch_configuration.workers[0]: Destruction complete after 1s
Error: Failed to update Config Map: Unauthorized
on ../../modules/eks/aws_auth.tf line 52, in resource "kubernetes_config_map" "aws_auth":
52: resource "kubernetes_config_map" "aws_auth" {
@dpiddockcmp Is this the same issue you got when trying to use the import command? My provider currently looks like this:
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
load_config_file = false
# config_path = "kubeconfig_${var.eks_cluster_name}"
version = "~> 1.10"
}
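For completeness, the two data sources that provider block refers to follow the usual pattern from the module's examples (a sketch; it assumes the module block is named eks):
data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}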
My error was on the import command. It was ignoring all settings and picking up the default kube config file in ~/.kube/config. Trying to read from minikube which didn't have the configmap.
Seems weird that you get permission denied on update if you have just created the cluster.
Seems weird that you get permission denied on update if you have just created the cluster.
Yes I know, I am pretty much out of ideas.
Seems weird that you get permission denied on update if you have just created the cluster.
To add to this... I thought I would humour myself, press up on my keyboard and run the exact same apply again, and it worked in literally a second.
How is it possible that I get permission denied, and then it's OK?
@dpiddockcmp I think I am going to run through the entire set of steps again but I wondered if you could correct my thinking if it sounds completely incorrect.
Could the reason that it fails and then works the second time be that the kubernetes provider has picked up the old values for the token and cluster CA certificate, tries to use these to update the config map, and the build fails with permission denied?
On the second apply it actually gets the new values, and so the apply works this time. It's just a thought.
@dpiddockcmp I did a little bit more testing today. I output the cluster token in my develop branch build, and then again whilst trying to upgrade to v8.0.0 of the eks module, after it fails and then succeeds on the second run.
So it looks like the cluster auth token that is retrieved by the kubernetes provider when I start the apply of the 8.0.0 version of the module is actually changed midway through upgrading.
Could this be a side effect of the fact that I am on 1.13 and when I upgrade to v8.0.0 it also upgrades my eks version to 1.14?
Here are the tokens from two consecutive runs of terraform apply: first to build the cluster from scratch using my develop branch, and then to upgrade to version 8.0.0 of the module (which fails on the initial apply even though the cluster upgrades, and then succeeds on the second run).
First build cluster from scratch:
aws_eks_cluster_token = k8s-aws-v1.aHR0cHM6Ly9zdHMuYW1hem9uYXdzLmNvbS8_QWN0aW9uPUdldENhbGxlcklkZW50aXR5JlZlcnNpb249MjAxMS0wNi0xNSZYLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFTSUEzTVNLU1FKUkUyREs2N0kyJTJGMjAyMDAxMjElMkZ1cy1lYXN0LTElMkZzdHMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDIwMDEyMVQxNTA2MTVaJlgtQW16LUV4cGlyZXM9MCZYLUFtei1TZWN1cml0eS1Ub2tlbj1Gd29HWlhJdllYZHpFSXolMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkZ3RWFEQWlYOXF5OXdrcnJWNm5ncXlLcEFlWjJIaDRpTUl0UCUyRmNCV1hMMEtPSWFFSklGcXhzJTJCY3Z0cnFCZUV2VU1DSG1jNk13dyUyRmVPclo2QjRhdWRHYTBHZFVSVUY1SDQyWHNQTlJpZ1g4ZHRsM2R4MmVjRXFsdkNxaUdGMjIzRlQzV1ozTGVnZDJ3cFhlUzRFYVNsTzZCeVNHajd3RGJmTmMzQU1sNVczelZ2TThHdXZmZ3ZocURQYUUxWGZjJTJCRWRtMWFleTJYWmY1YklFZUJxdk9KdlFvUVFpSzdXQSUyQm5QRHdmNjFwTUN2bG1TcHJSSXlFdEhUSXBzMG93NmFiOFFVeUxmNEVHc0hLRElxMU1TM2dRcUFab0ZQeHk2TUolMkJBa3pnSkhobUFDWkFkZmRESlFqcXZJYWh3VHg3a1I3c1ElM0QlM0QmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JTNCeC1rOHMtYXdzLWlkJlgtQW16LVNpZ25hdHVyZT1hNTVhMjQ0NzNlMzA4Y2Y4YTlhYzQwNzI3OTY4YTk3MmNiMjA3ZDhjOTVlYzcyZTk2ZTc4NmEwNGY3NGNlNDY1
Second upgrade to version 8.0.0:
aws_eks_cluster_token = k8s-aws-v1.aHR0cHM6Ly9zdHMuYW1hem9uYXdzLmNvbS8_QWN0aW9uPUdldENhbGxlcklkZW50aXR5JlZlcnNpb249MjAxMS0wNi0xNSZYLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFTSUEzTVNLU1FKUkUyREs2N0kyJTJGMjAyMDAxMjElMkZ1cy1lYXN0LTElMkZzdHMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDIwMDEyMVQxNTM4MTRaJlgtQW16LUV4cGlyZXM9MCZYLUFtei1TZWN1cml0eS1Ub2tlbj1Gd29HWlhJdllYZHpFSXolMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkYlMkZ3RWFEQWlYOXF5OXdrcnJWNm5ncXlLcEFlWjJIaDRpTUl0UCUyRmNCV1hMMEtPSWFFSklGcXhzJTJCY3Z0cnFCZUV2VU1DSG1jNk13dyUyRmVPclo2QjRhdWRHYTBHZFVSVUY1SDQyWHNQTlJpZ1g4ZHRsM2R4MmVjRXFsdkNxaUdGMjIzRlQzV1ozTGVnZDJ3cFhlUzRFYVNsTzZCeVNHajd3RGJmTmMzQU1sNVczelZ2TThHdXZmZ3ZocURQYUUxWGZjJTJCRWRtMWFleTJYWmY1YklFZUJxdk9KdlFvUVFpSzdXQSUyQm5QRHdmNjFwTUN2bG1TcHJSSXlFdEhUSXBzMG93NmFiOFFVeUxmNEVHc0hLRElxMU1TM2dRcUFab0ZQeHk2TUolMkJBa3pnSkhobUFDWkFkZmRESlFqcXZJYWh3VHg3a1I3c1ElM0QlM0QmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JTNCeC1rOHMtYXdzLWlkJlgtQW16LVNpZ25hdHVyZT1jZjY0ZjdjYTA5MTlkY2QwZTJlNzc5ZjA3YTFlNGQ2MzFmODQ4Yjg0ZThlMzI5NzFkOTIyMTUwNDdjODZlN2Rk
Is this the expected behaviour or is this a bug with the kubernetes provider?
The IAM EKS tokens are only valid for 15 minutes. I would expect the token to change on every run of Terraform.
I guess if the window between generating the token and trying to use it to update aws-auth
is too long then you will receive an access denied error.
Does the full output of the apply command give you any hints on when the data source is refreshed?
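One way to check, without changing the config (illustrative commands):
# capture the full apply output and see where the token data source gets refreshed
terraform apply 2>&1 | tee apply.log
grep -n "aws_eks_cluster_auth" apply.log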
The IAM EKS tokens are only valid for 15 minutes. I would expect the token to change on every run of Terraform.
This would be the reason then, as the apply takes around 19-20 minutes to finish. If I can defer the retrieval of the data items until the cluster is ready then this issue would probably not appear.
The documentation for data sources suggests depends_on can be used, but it is not recommended.
I noticed this issue while trying to debug my problem, and it seems like it's very similar to the issue I was having.
When creating the eks cluster from scratch, I noticed that the k8s provider wasn't referencing the correct endpoint string. More specifically, using this provider declaration:
provider "kubernetes" {
alias = "kubernetes-utility"
host = data.aws_eks_cluster.eks-utility.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks-utility.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.eks-utility.token
load_config_file = false
}
Caused this issue after apply:
...
aws_subnet.subnet-50add535-subnet-50add535: Modifications complete after 0s [id=subnet-50add535]
module.eks-utility.kubernetes_config_map.aws_auth[0]: Creating...
Error: Post http://localhost/api/v1/namespaces/kube-system/configmaps: dial tcp [::1]:80: connect: connection refused
on .terraform/modules/eks-utility/terraform-aws-modules-terraform-aws-eks-c9b9c96/aws_auth.tf line 52, in resource "kubernetes_config_map" "aws_auth":
52: resource "kubernetes_config_map" "aws_auth" {
However, when I move the generated kubeconfig_<eks_name> to ~/.kube/config it seems to have a different error (Error: Unauthorized):
...
aws_iam_policy_attachment.AmazonEKS_CNI_Policy-policy-attachment: Modifications complete after 1s [id=AmazonEKS_CNI_Policy-policy-attachment]
aws_iam_policy_attachment.AmazonEKSClusterPolicy-policy-attachment: Modifications complete after 1s [id=AmazonEKSClusterPolicy-policy-attachment]
aws_iam_policy_attachment.AmazonEC2ContainerRegistryReadOnly-policy-attachment: Modifications complete after 1s [id=AmazonEC2ContainerRegistryReadOnly-policy-attachment]
Error: Unauthorized
on .terraform/modules/eks-utility/terraform-aws-modules-terraform-aws-eks-c9b9c96/aws_auth.tf line 52, in resource "kubernetes_config_map" "aws_auth":
52: resource "kubernetes_config_map" "aws_auth" {
This implies that the kubernetes terraform provider is still trying to read the config, instead of referencing the newly created eks cluster which was declared as a tf resource.
Do you think this is a terraform bug?
@dpiddockcmp To get around the issue you had:
My error was on the import command. It was ignoring all settings and picking up the default kube config file in ~/.kube/config. Trying to read from minikube which didn't have the configmap.
I set load_config_file = true and then exported KUBECONFIG from the command line, and this seemed to get around it; I then changed it back to false to run the apply.
@paulalex can we close this issue? It seems that you solved your problem.
@barryib sure, I will close it now. I was not able to defer the loading of the data until later on as the module errored, so right now the solution for me is to run it twice and manually, which isn't ideal, but it's unrelated to this issue.
Cheers
@dotCipher did you manage to work out your issue? I have terraform apply working fine when running terraform from my laptop, even using the assumed role credentials output into the Jenkins log file, and running kubectl commands such as kubectl get cm aws-auth -n kube-system outputs the config map.
I think there is a bug in the provider, because when I run terraform apply on Jenkins using the same credentials I get the same error as you get when you export your config to ~/.kube/config:
module.eks.kubernetes_config_map.aws_auth[0]: Refreshing state... [id=kube-system/aws-auth]
2020/02/07 10:24:11 [ERROR] module.eks: eval: *terraform.EvalRefresh, err: Unauthorized
2020/02/07 10:24:11 [ERROR] module.eks: eval: *terraform.EvalSequence, err: Unauthorized
And in my Jenkins file log: Error: Unauthorized
My Jenkins server is running inside another eks cluster used for management tools, and this terraform build is running inside a pod and is building/managing another eks cluster in a different aws account, so I don't know if this is in some way related.
I have got this fixed on Jenkins now... finally! So if you have this issue and are looking for answers, see this issue:
https://github.com/terraform-providers/terraform-provider-kubernetes/issues/716
In short, if the pod running your terraform build is on a kubernetes cluster, removing this environment variable should fix the issue:
KUBERNETES_SERVICE_HOST
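In a Jenkins shell step this boils down to something like the following sketch (the stage wiring is whatever your pipeline already uses; per the linked issue, this in-cluster env var makes the provider target the pod's own API server):
# unset the in-cluster service discovery variable injected into every pod,
# then run terraform against the remote EKS cluster as usual
unset KUBERNETES_SERVICE_HOST
terraform apply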
Thanks, that helped
Just in case you end up here (like me) while upgrading the terraform-aws-eks module from a version which was using local-exec kubectl to a new version calling the kubernetes provider directly, and you see:
Error: configmaps "aws-auth" already exists
Error: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused
Fix:
load_config_file = true
and config_path = "your_kubeconf"
@rastakajakwanna, regarding your comment https://github.com/terraform-aws-modules/terraform-aws-eks/issues/699#issuecomment-600729809, I think we share the same situation (1, 2, 3). However, I was not able to work out how you fixed it. Could you please elaborate more? For example, post the snippet of TF code before the fix, and the new snippet of TF code after the fix?
Appreciated!
@abdennour This is the same issue I initially had, and the same process fixed it for me. In the provider set load_config_file = true, then in your terminal session export KUBECONFIG=<your_config_path>, and then run your terraform and see if this helps.
@paulalex Everybody suggests exporting the KUBECONFIG variable, while one can define it in provider.tf instead. That is the difference between my answer and the other replies.
@abdennour Dynamic config in provider.tf (fails with errors):
provider "kubernetes" {
host = module.eks.cluster_endpoint
#cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks.certificate_authority.0.data)
# or use module output
cluster_ca_certificate = base64decode(module.eks.cluster_ca)
token = data.aws_eks_cluster_auth.eks.token
load_config_file = false
version = "~> 1.9"
}
Change temporarily to static config:
provider "kubernetes" {
host = module.eks-dev-infra.cluster_endpoint
load_config_file = true
# kubeconfig file relative to path where you execute tf, in my case it is the same dir
config_path = "kubeconfig_eks.yaml"
version = "~> 1.9"
}
Then change back to dynamic.
However, I've found today that it's not working for my GitLab pipeline (due to lack of privileges to call the EKS API). But that's outside the scope of this comment.
@rastakajakwanna thank you so much. @paulalex BTW, I am running the terraform process inside a container and I've already passed the KUBECONFIG env variable. But it seems the kubernetes provider looks into ~/.kube/config and ignores KUBECONFIG. So what I did: I created my container image based on Terraform but with a customized entrypoint:
#!/bin/bash
# if a kubeconfig was passed in via KUBECONFIG, copy it to the default location
if [ -f "${KUBECONFIG}" ]; then
  mkdir -p "${HOME}/.kube"
  cat "${KUBECONFIG}" > "${HOME}/.kube/config"
fi
exec "$@"
Did you change the property load_config_file of the provider to true?
@paulalex You are right. My mistake that I didn't read the documentation of this provider. Now things work without my custom image, reverting back to the official hashicorp/terraform images.
@rastakajakwanna while upgrading from 7.0.0 to 10.0.0, all issues are gone except this one: Error: configmaps "aws-auth" already exists.
I started with the static config:
provider "kubernetes" {
host = module.eks.cluster_endpoint
load_config_file = true
# kubeconfig file relative to path where you execute tf, in my case it is the same dir
config_path = "kubeconfig_${local.cluster_name}"
version = "~> 1.9"
}
Should I delete the aws-auth configmap (kubectl -n kube-system delete cm aws-auth ...) before upgrading the cluster?
@paulalex any thoughts?
I was able to finally figure this out by importing my actual aws-auth configmap, but then it got overwritten. Is there a way to prevent the terraform module from applying it?
You can stop the module from managing the aws-auth configmap by setting manage_aws_auth = false
in your module block.
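In module-block terms that would look roughly like this (a sketch; the other arguments are whatever you already pass):
module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # ... existing cluster configuration ...

  # stop the module from creating/updating the aws-auth configmap
  manage_aws_auth = false
}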
Warning: It will delete your configmap if you have already imported it to the terraform state. Remove it with e.g. terraform state rm module.eks.kubernetes_config_map.aws_auth
before applying!
What is the best option if I manage two clusters in one codebase? I don't use environments or anything like that.
What is the best option if I manage two clusters in one codebase? I don't use environments or anything like that.
I never tested it, but I think you can use providers with aliases: one alias per cluster, and pass those aliases to the terraform-aws-eks module.
More info https://www.terraform.io/docs/configuration/providers.html#selecting-alternate-providers
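An untested sketch of what that could look like (names and the second provider's details are placeholders):
provider "kubernetes" {
  alias                  = "cluster1"
  host                   = data.aws_eks_cluster.cluster1.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster1.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster1.token
  load_config_file       = false
}

provider "kubernetes" {
  alias = "cluster2"
  # ... same attributes, pointing at the second cluster's data sources ...
}

module "cluster1" {
  source = "terraform-aws-modules/eks/aws"
  # ... cluster 1 configuration ...
  providers = {
    kubernetes = kubernetes.cluster1
  }
}

module "cluster2" {
  source = "terraform-aws-modules/eks/aws"
  # ... cluster 2 configuration ...
  providers = {
    kubernetes = kubernetes.cluster2
  }
}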
@barryib this seems to work! thx
I am having the same issue. Created an eks cluster and everything was created successfully except for the aws_auth config map.
But I cannot connect to the eks cluster at all.
I run the aws eks update-kubeconfig command and it successfully updates my .kube/config. But when executing any kubectl command, it fails with:
error: You must be logged in to the server (Unauthorized)
So I cannot connect to it at all, hence any tampering with the provider to pass it details won't work in my case.
Any ideas why this would be the case? Is something else going wrong with the creation for this to fail?
Are you accessing the cluster with the same user that created the cluster originally?
Clusters must be created with an IAM user or role. Do not use the root account; you will not be able to log in.
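Two quick checks (illustrative; the cluster name and role ARN are placeholders):
# see which IAM identity your current credentials resolve to
aws sts get-caller-identity

# regenerate the kubeconfig, optionally assuming the role that created the cluster
aws eks update-kubeconfig --name <cluster-name> --role-arn arn:aws:iam::<account-id>:role/<creator-role>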
Yup, that was the issue, not using the same role.
I had an Unauthorized config map error as well. I deleted a stale ~/.kube/config file and ran apply again, which worked.
I got into the same issue and resolved it after some observations. The cluster was created using https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v6.0.1/aws_auth.tf and I was applying changes through the updated eks terraform, https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v12.2.0/aws_auth.tf. I observed that the resource is created completely differently. What I did: I deleted the old aws-auth configmap that existed and ran apply, and it worked perfectly.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
I have issues
I am unable to upgrade to v8.0.0 of the module due to issues with the aws-auth config map; prior to this upgrade the module worked fine for me.
I'm submitting a...
What is the current behavior?
Trying to upgrade to version 8.0.0 of module and terraform apply results in a permissions issue for aws-auth config map.
I cannot successfully run a terraform apply and move to eks 1.14 because I get errors from terraform regarding the config map (see below). I am accessing the cluster from my MacBook with the kubeconfig for the cluster admin user, so I should have admin permissions on the cluster.
If this is a bug, how to reproduce? Please include a code sample if relevant.
Upgrade following the upgrade steps and then run a terraform apply.
Question - Is the following import command from the important notes section of the upgrade documentation correct?
terraform import module.cluster1.kubernetes_config_map.aws_auth[0] kube-system/aws-auth
What's the expected behavior?
The module is upgraded to 8.0.0 and the cluster upgrades, with no errors and terraform apply works without permission issues regarding the aws-auth config map.
Are you able to fix this problem and submit a PR? Link here if you have already.
No
Environment details
eks v 1.13
Any other relevant info
If I delete my cluster and run version 8.0.0, the apply times out after 15 minutes and I cannot run terraform apply again because the eks control plane already exists.
The first time I try to run apply using the latest version 8.0.0 of the module, I get a permissions error with the aws-auth config map.
Any subsequent attempts to run apply after this result in an error that permission was denied to the config map.