hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: aws_eks_cluster: adding `access_config` block triggers cluster recreate in absence of `bootstrap_cluster_creator_admin_permissions` #38967

Open ei-grad opened 1 month ago

ei-grad commented 1 month ago

Terraform Core Version

1.9.4

AWS Provider Version

5.63.1

Affected Resource(s)

aws_eks_cluster

Expected Behavior

Cluster access config updated, bootstrap_cluster_creator_admin_permissions remains unspecified

Actual Behavior

Unspecified field bootstrap_cluster_creator_admin_permissions value becomes true, triggering cluster recreation.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

data "aws_subnets" "private" {
  filter {
    name   = "map-public-ip-on-launch"
    values = ["false"]
  }
}

data "aws_iam_policy_document" "assume_role" {
  statement {
    principals {
      type        = "Service"
      identifiers = ["eks.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

resource "aws_iam_role" "cluster" {
  name = "eks-cluster"
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
  managed_policy_arns = [
    "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy",
  ]
}

resource "aws_eks_cluster" "this" {
  name     = "test"
  role_arn = aws_iam_role.cluster.arn
  vpc_config {
    subnet_ids = data.aws_subnets.private.ids
  }
  /*access_config {
    authentication_mode = "API_AND_CONFIG_MAP"
  }*/
}

Steps to Reproduce

  1. Create EKS cluster without access_config
  2. Uncomment access_config with authentication_mode = "API_AND_CONFIG_MAP" set only
  3. See that terraform plan wants to destroy and recreate the cluster:
    # aws_eks_cluster.this must be replaced
    -/+ resource "aws_eks_cluster" "this" {
      ...
      ~ access_config {
          ~ authentication_mode                         = "CONFIG_MAP" -> "API_AND_CONFIG_MAP"
          - bootstrap_cluster_creator_admin_permissions = true -> null # forces replacement
        }
      ...
    }

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

Probably caused by #38295

Probably duplicate - #38950

Would you like to implement a fix?

I’m definitely interested in working on a fix and feel confident I understand how to address the problem, but I’m not sure when I’ll have the time, as it’s currently outside the scope of my tasks.

ei-grad commented 1 month ago

cc @sasidhar-aws

bryantbiggs commented 2 weeks ago

there isn't anything that Terraform can do for this - it is dictated by the EKS API - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-eks-cluster-accessconfig.html#cfn-eks-cluster-accessconfig-bootstrapclustercreatoradminpermissions

Update requires: Replacement

ei-grad commented 2 weeks ago

@bryantbiggs Thank you for your attention to this issue. I understand that the need for cluster recreation on intentional changes to bootstrap_cluster_creator_admin_permissions is dictated by the EKS API rather than by Terraform. However, the BootstrapClusterCreatorAdminPermissions field is optional, and having the introduction of the access_config block trigger cluster recreation when bootstrap_cluster_creator_admin_permissions is not explicitly set seems unexpected and could lead to accidental disasters. How an unset value is handled is entirely within the provider's responsibility, and this behavior should be changed to avoid such cases.

ei-grad commented 2 weeks ago

@bryantbiggs Actually, your comment may be more relevant to issue #38950, which is similar but specifically concerns an explicit change to the access_config.bootstrap_cluster_creator_admin_permissions value (that difference is the reason I created this one).

ei-grad commented 2 weeks ago

Oops, editing the description triggered some extra labeling based on the resources in the provided configuration, which are unrelated to the service that is actually affected :-/

bryantbiggs commented 2 weeks ago

However, the BootstrapClusterCreatorAdminPermissions field is optional, and having the introduction of the access_config block trigger cluster recreation when bootstrap_cluster_creator_admin_permissions is not explicitly set seems unexpected and could lead to accidental disasters. How an unset value is handled is entirely within the provider's responsibility, and this behavior should be changed to avoid such cases.

I don't follow - could you walk me through the steps you have performed and the outcome you are seeking?

ei-grad commented 2 weeks ago

could you walk me through the steps you have performed and the outcome you are seeking?

  1. I have an EKS cluster that was created via terraform using the aws_eks_cluster resource without specifying the access_config block.
  2. I later realized that the access entries feature for EKS is not enabled by default, so I added the access_config block and set only the authentication_mode field to "API_AND_CONFIG_MAP" to enable it.
  3. I noticed that terraform apply was planning to recreate my EKS cluster instead of just enabling access entries.

I described these steps in the "Steps to Reproduce" section.

Afterward, I added the "missing" value for the bootstrap_cluster_creator_admin_permissions parameter to satisfy Terraform, and I was able to successfully enable access entries authentication on my EKS cluster with the next terraform apply run.
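A minimal sketch of that workaround, assuming the configuration from the issue description. The explicit `true` is taken from the plan output above (`true -> null`), so it is an assumption about what the state holds rather than something the provider requires in general:

```hcl
resource "aws_eks_cluster" "this" {
  name     = "test"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = data.aws_subnets.private.ids
  }

  access_config {
    authentication_mode = "API_AND_CONFIG_MAP"

    # Assumption: the cluster was created with the EKS default (true),
    # which is what the state in the plan output shows. Stating it
    # explicitly keeps the planned value equal to the stored one, so
    # only authentication_mode is reported as changed.
    bootstrap_cluster_creator_admin_permissions = true
  }
}
```

If the state records a different value for a given cluster (for example, one created with `false`), the explicit value would need to match that instead; checking the `terraform plan` output before applying is the safest way to confirm that no replacement is scheduled.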

To sum up the current behavior: Terraform triggers a recreation of the aws_eks_cluster created without the access_config block when you add access_config with only the authentication_mode field specified.

However, changing the authentication_mode field does not require the cluster to be recreated. The expected behavior would be for Terraform to update the access_config without forcing a cluster replacement, omitting the BootstrapClusterCreatorAdminPermissions parameter from the API call if it is not specified in the resource configuration.

An alternative solution could be making bootstrap_cluster_creator_admin_permissions a required parameter with the default value set to the EKS default (true), though I’m not sure if this would be ideal in terms of backwards compatibility. Another option could be simply setting its default to true, as suggested in #38950, but this could also trigger unexpected, accidental cluster recreation when adding the access_config block without an explicit bootstrap_cluster_creator_admin_permissions value for a cluster that was externally configured with bootstrap_cluster_creator_admin_permissions = false.

bryantbiggs commented 2 weeks ago

ah ok - understood and thank you for that detailed info. I will take a look at this now

bryantbiggs commented 2 weeks ago

this one might be outside of my wheelhouse - I have a test that shows this behavior is not working as intended, but I'm not sure how to resolve it. It is somewhat tricky since the value of bootstrap_cluster_creator_admin_permissions is never returned in a cluster describe API call, nor in the response of the cluster create API call - it's a value that can only be set on create, but somehow we need Terraform to treat null and true as the same (and a no-op)

I'll leave my PR in case someone has time/bandwidth to dig in further - it at least should provide a good starting point
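Until the provider handles this itself, one configuration-level way to approximate "treat null and true as the same" may be Terraform's `ignore_changes` lifecycle argument. The sketch below is not verified against provider 5.63.1, and it assumes the state already contains an access_config block, as the plan output in the issue description suggests:

```hcl
resource "aws_eks_cluster" "this" {
  name     = "test"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids = data.aws_subnets.private.ids
  }

  access_config {
    authentication_mode = "API_AND_CONFIG_MAP"
  }

  lifecycle {
    # Assumption: ignoring this create-only attribute makes Terraform reuse
    # the value already in state instead of planning `true -> null`.
    ignore_changes = [
      access_config[0].bootstrap_cluster_creator_admin_permissions,
    ]
  }
}
```

The trade-off is that an intentional change to this attribute would also be ignored, so this only suits clusters where the value is never meant to change after creation.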

afischer-opentext-com commented 1 week ago

This is a blocker for updating existing clusters with newer provider versions: setting the bootstrap_self_managed_addons parameter to true or null forces recreation of existing EKS clusters, and setting it to false renders the cluster unusable because the managed addons are no longer present.

Is anyone aware of workarounds or hacks that allow using newer provider versions without cluster recreation?