terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws
Apache License 2.0

configmap/aws-auth doesn't update when we use workers_role_name #610

Closed Ventals closed 4 years ago

Ventals commented 4 years ago

I have issues

When we set the workers_role_name parameter on an existing cluster, configmap/aws-auth isn't modified.

I'm submitting a...

What is the current behavior?

The module creates a new IAM role and assigns it to the workers, but doesn't change configmap/aws-auth.

Nodes get stuck in the NotReady state:

NAME                           STATUS     ROLES    AGE   VERSION
ip-xx-xx-xxx-xx.ec2.internal   NotReady   <none>   63d   v1.14.6-eks-5047ed
ip-xxx-xx-xx-xx.ec2.internal   NotReady   <none>   63d   v1.14.6-eks-5047ed

And for any pod we get the following error:

Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)

When we check configmap/aws-auth with kubectl -n kube-system get configmap aws-auth -o yaml, we can still see the old role name.

Part of configmap:

apiVersion: v1
data:
  mapRoles: |
    - rolearn: arn:aws:iam::<account-id>:role/<past_role_arn>

To work around the problem, manually replace it with the new role ARN.
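For illustration, this is roughly what a corrected worker entry in aws-auth looks like when written as Terraform. It is a minimal sketch, not the module's own code: it assumes the module is not managing the ConfigMap itself (manage_aws_auth = false), that an already existing aws-auth has been imported into state first, and that the new role is named any-workers-iam-role as in the reproduction steps. The resource and data source labels are hypothetical.

```hcl
# Look up the new worker role by name (the name is taken from the report).
data "aws_iam_role" "workers" {
  name = "any-workers-iam-role"
}

# Sketch only: an aws-auth ConfigMap that maps the new worker role.
# Assumes nothing else (module or EKS itself) owns this ConfigMap in state.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        rolearn  = data.aws_iam_role.workers.arn
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ])
  }
}
```

The username and groups values are the standard mapping EKS expects for worker nodes; only the rolearn has to change when the worker role changes.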

If this is a bug, how to reproduce? Please include a code sample if relevant.

  1. Set the workers_role_name = "any-workers-iam-role" parameter on an existing cluster
  2. Run terraform apply
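The steps above can be sketched as a module call like the following. Only workers_role_name comes from the report; the module label, cluster name, VPC and subnet IDs, and the worker group settings are hypothetical placeholders, and the version pin is an assumption based on the 7.0.0 release mentioned later in the thread.

```hcl
# Hypothetical values throughout; only workers_role_name is from the report.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "7.0.0"

  cluster_name = "example-cluster"
  vpc_id       = "vpc-0123456789abcdef0"
  subnets      = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]

  worker_groups = [
    {
      name                 = "workers"
      instance_type        = "t3.medium"
      asg_desired_capacity = 2
    }
  ]

  # Added after the cluster already exists; this is the change that
  # creates a new IAM role without updating configmap/aws-auth.
  workers_role_name = "any-workers-iam-role"
}
```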

What's the expected behavior?

configmap/aws-auth should be regenerated when we change the IAM role for workers.

Are you able to fix this problem and submit a PR? Link here if you have already.

No :(

Environment details

Any other relevant info

max-rocket-internet commented 4 years ago

Can you retry with current master?

legkovalex commented 4 years ago

You need to add the following to your template:


# Look up the endpoint and CA certificate of the cluster created by the module.
data "aws_eks_cluster" "cluster" {
  name = module.your-cluster.cluster_id
}

# Fetch an authentication token for the same cluster.
data "aws_eks_cluster_auth" "cluster" {
  name = module.your-cluster.cluster_id
}

# Configure the kubernetes provider explicitly instead of loading a local kubeconfig.
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.9"
}

Ventals commented 4 years ago

@max-rocket-internet @legkovalex I haven't tested on master because it has some stability issues, but the problem still exists on the 7.0.0 release.

I mean this one: #355

mmack commented 4 years ago

Same here.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dusansusic commented 4 years ago

This exists in 9.0.0, too. It's a really nasty bug that can cause a serious outage. I'm using MNGs (managed node groups) and the problem still occurs.

max-rocket-internet commented 4 years ago

PRs welcome 😄

nikitacr7 commented 4 years ago

I've also faced the same issue in 9.0.0

barryib commented 4 years ago

@nikitacr7 can you please provide more details? Was your existing cluster deployed with this module?

Has your configmap/aws-auth already been imported with https://www.terraform.io/docs/providers/kubernetes/r/config_map.html#import ?

Here are some important notes about configmap/aws-auth: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v9.0.0/CHANGELOG.md#important-notes-1

dpiddockcmp commented 4 years ago

It looks like under v9.0.0 you need to apply twice for the changes to be fully applied to the aws-auth ConfigMap. The first apply updates the roles, the second updates aws-auth.

Generation of aws-auth was re-worked in v11.0.0. Some basic testing suggests it has unintentionally solved this issue.

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.