hashicorp / terraform-provider-kubernetes

Terraform Kubernetes provider
https://www.terraform.io/docs/providers/kubernetes/
Mozilla Public License 2.0

Provider configuration is ignored and wrong cluster is modified on apply #2295

Closed. oferca closed this issue 2 days ago.

oferca commented 1 year ago

Terraform Version, Provider Version and Kubernetes Version

Terraform version: 1.3.9, 1.5.7
Kubernetes provider version: v1.13.4
Kubernetes version:
  Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GoVersion:"go1.19.4", Platform:"darwin/arm64"}
  Kustomize Version: v4.5.7
  Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.17-eks-2d98532", GoVersion:"go1.19.6", Platform:"linux/amd64"}

Affected Resource(s)

Terraform Configuration Files

terraform {
  required_version = ">= 1.1.9"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "4.48.0"
    }
  }
}
// Note that NEITHER local.eks_cluster_name nor data.aws_eks_cluster exists in the root module; they exist in a completely unrelated folder with a different cluster config. **This should have resulted in a validation error.**
provider "kubernetes" {
  alias                  = "v1beta1"
  host                   = data.aws_eks_cluster.staging.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.staging.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.9"
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", local.eks_cluster_name]
  }
}
resource "kubernetes_config_map" "aws_auth" {
    binary_data = {}
    data        = {
        "mapRoles" = var.map_roles
        "mapUsers" = var.map_users
    }
    metadata {
        name             = "aws-auth"
        namespace        = "kube-system"
    }
  }
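
For context, a minimal sketch of the declarations the provider block above refers to; the cluster name is illustrative, and none of these existed in the root module, which is why a validation error was expected:

locals {
  eks_cluster_name = "staging-cluster" // illustrative value; this local did not exist
}

data "aws_eks_cluster" "staging" {
  name = local.eks_cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = local.eks_cluster_name
}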

Debug Output

# module.karpenter_infra.module.config_map_aws_auth[0].kubernetes_config_map.aws_auth will be updated in-place
  ~ resource "kubernetes_config_map" "aws_auth" {
      ~ data = {
          ~ "mapRoles" = <<-EOT

Steps to Reproduce

  1. kubectl config use-context WRONG_CLUSTER

  2. Run terraform apply in the folder with the TARGET_CLUSTER provider configuration --> the WRONG_CLUSTER configuration gets modified

Expected Behavior

A validation error, as neither local.eks_cluster_name nor data.aws_eks_cluster exists.

Actual Behavior

The config map aws-auth in an unrelated (cached?) cluster was modified.

BBBmau commented 1 year ago

Hello @oferca, thank you for opening this issue. Could you provide more context by including a trace of the run that produces this output? This can be done by running `TF_LOG=trace terraform apply`.

oferca commented 1 year ago

@BBBmau Sure, below is the output of `TF_LOG=trace terraform plan` (naturally, I cannot run apply, as it affects production clusters). As you can see, the EKS cluster being modified is not the one configured in the provider definition, but whichever one happens to be in the kubectl context at that moment.

Strangely enough, the issue is prevented by the following steps (see the sketch after this list):

  1. Add the data source aws_eks_cluster.staging that is referenced in the code above. It was completely missing, yet Terraform still produced no errors on plan or apply, although it should have.
  2. Add a providers block to the module call. This shouldn't have mattered either, as the Kubernetes provider should have defaulted to the top-level provider.
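
A minimal sketch of step 2, assuming a child module that contains the kubernetes_config_map resource (module name and source path are illustrative); step 1 is simply adding the data.aws_eks_cluster.staging declaration referenced by the provider block:

module "config_map_aws_auth" {
  source = "../modules/config-map-aws-auth" // illustrative path

  // Map the aliased provider to the module's default "kubernetes" provider
  providers = {
    kubernetes = kubernetes.v1beta1
  }
}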

Terraform plan output:

Acquiring state lock. This may take a few moments...
module.module_a.kubernetes_namespace.ns["karpenter"]: Refreshing state... [id=karpenter]
module.module_a.module.karpenter.data.aws_caller_identity.current: Reading...
module.module_a.data.aws_security_groups.eks-security-groups: Reading...
module.module_a.data.aws_eks_cluster_auth.cluster: Reading...
module.module_a.data.aws_eks_cluster_auth.cluster: Read complete after 0s [id=CORRECT_EKS_CLUSTER_ID]
module.module_a.data.aws_iam_policy_document.node_trust_policy: Reading...
module.module_a.data.aws_iam_policy_document.node_trust_policy: Read complete after 0s [id=2560088296]
module.module_a.data.aws_subnets.eks-subnets: Reading...
module.module_a.data.aws_eks_cluster.staging: Reading...
module.module_a.module.karpenter.data.aws_partition.current: Reading...
module.module_a.module.karpenter.aws_sqs_queue.this[0]: Refreshing state... [id=https://sqs.us-east-1.amazonaws.com/ACCOUNT_ID/Karpenter-CORRECT_EKS_CLUSTER_ID]
module.module_a.module.karpenter.data.aws_partition.current: Read complete after 0s [id=aws]
module.module_a.module.karpenter.aws_cloudwatch_event_rule.this["spot_interupt"]: Refreshing state... [id=KarpenterSpotInterrupt-20230615115315062200000003]
module.module_a.module.karpenter.aws_cloudwatch_event_rule.this["health_event"]: Refreshing state... [id=KarpenterHealthEvent-20230615115315062100000001]
module.module_a.module.karpenter.aws_cloudwatch_event_rule.this["instance_rebalance"]: Refreshing state... [id=KarpenterInstanceRebalance-20230615115315062100000002]
module.module_a.module.karpenter.aws_cloudwatch_event_rule.this["instance_state_change"]: Refreshing state... [id=KarpenterInstanceStateChange-20230615115315062400000005]
module.module_a.aws_iam_role.karpenter_node_role[0]: Refreshing state... [id=KarpenterNodeRole-CORRECT_EKS_CLUSTER_ID]
module.module_a.module.karpenter.data.aws_caller_identity.current: Read complete after 0s [id=ACCOUNT_ID]
module.module_a.module.karpenter.data.aws_iam_policy_document.assume_role[0]: Reading...
module.module_a.module.karpenter.data.aws_iam_policy_document.assume_role[0]: Read complete after 0s [id=2560088296]
module.module_a.module.karpenter.aws_iam_role.this[0]: Refreshing state... [id=Karpenter-CORRECT_EKS_CLUSTER_ID-20230615115315062300000004]
module.module_a.data.aws_eks_cluster.staging: Read complete after 1s [id=CORRECT_EKS_CLUSTER_ID]
module.module_a.data.aws_subnets.eks-subnets: Read complete after 1s [id=us-east-1]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-1"]: Refreshing state... [id=subnet-id-1,karpenter.sh/discovery]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-2"]: Refreshing state... [id=subnet-id-2,karpenter.sh/discovery]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-3"]: Refreshing state... [id=subnet-id-3,karpenter.sh/discovery]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-4"]: Refreshing state... [id=subnet-id-4,karpenter.sh/discovery]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-5"]: Refreshing state... [id=subnet-id-5,karpenter.sh/discovery]
module.module_a.aws_ec2_tag.subnet_tags["subnet-id-6"]: Refreshing state... [id=subnet-id-6,karpenter.sh/discovery]
module.module_a.data.aws_security_groups.eks-security-groups: Read complete after 1s [id=us-east-1]
module.module_a.aws_ec2_tag.sg_tags["sg-id-1"]: Refreshing state... [id=sg-id-1,karpenter.sh/discovery]
module.module_a.module.karpenter.data.aws_iam_policy_document.queue[0]: Reading...
module.module_a.module.karpenter.data.aws_iam_policy_document.queue[0]: Read complete after 0s [id=325117857]
module.module_a.module.karpenter.aws_cloudwatch_event_target.this["instance_rebalance"]: Refreshing state... [id=KarpenterInstanceRebalance-20230615115315062100000002-KarpenterInterruptionQueueTarget]
module.module_a.module.karpenter.aws_cloudwatch_event_target.this["spot_interupt"]: Refreshing state... [id=KarpenterSpotInterrupt-20230615115315062200000003-KarpenterInterruptionQueueTarget]
module.module_a.module.karpenter.aws_cloudwatch_event_target.this["instance_state_change"]: Refreshing state... [id=KarpenterInstanceStateChange-20230615115315062400000005-KarpenterInterruptionQueueTarget]
module.module_a.module.karpenter.aws_cloudwatch_event_target.this["health_event"]: Refreshing state... [id=KarpenterHealthEvent-20230615115315062100000001-KarpenterInterruptionQueueTarget]
module.module_a.module.karpenter.aws_sqs_queue_policy.this[0]: Refreshing state... [id=https://sqs.us-east-1.amazonaws.com/ACCOUNT_ID/Karpenter-CORRECT_EKS_CLUSTER_ID]
module.module_a.aws_iam_instance_profile.karpenter_node[0]: Refreshing state... [id=KarpenterNodeInstanceProfile-CORRECT_EKS_CLUSTER_ID]
module.module_a.module.config_map_aws_auth[0].kubernetes_config_map.aws_auth: Refreshing state... [id=kube-system/aws-auth]
module.module_a.module.karpenter.aws_iam_instance_profile.this[0]: Refreshing state... [id=Karpenter-CORRECT_EKS_CLUSTER_ID-20230615115316864800000006]
module.module_a.module.karpenter.aws_iam_role_policy_attachment.this["arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"]: Refreshing state... [id=Karpenter-CORRECT_EKS_CLUSTER_ID-20230615115315062300000004-20230615115317098700000007]
module.module_a.module.karpenter.aws_iam_role_policy_attachment.this["arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"]: Refreshing state... [id=Karpenter-CORRECT_EKS_CLUSTER_ID-20230615115315062300000004-20230615115317339800000008]
module.module_a.module.karpenter.aws_iam_role_policy_attachment.this["arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"]: Refreshing state... [id=Karpenter-CORRECT_EKS_CLUSTER_ID-20230615115315062300000004-20230615115317549400000009]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: ~ update in-place

Terraform will perform the following actions:

# module.module_a.module.config_map_aws_auth[0].kubernetes_config_map.aws_auth will be updated in-place
  ~ resource "kubernetes_config_map" "aws_auth" {
      ~ data = {
          ~ "mapRoles" = <<-EOT

oferca commented 1 year ago

@BBBmau I have gained additional insight into the issue. It appears that the bug is that modules must use the providers meta-argument when using Kubernetes resources. Without a providers block, the Kubernetes context is taken from kubectl, in contradiction with Terraform's provider inheritance pattern, which specifies it should be taken from the root module. (In our case the provider also had invalid arguments that for some reason were not detected by Terraform, so there is probably more to the bug, perhaps in Terraform itself.)

module "some-k8s-module" {
    source = "../modules/k8s-module"

     // The bug: without below provider meta argument k8s context is wrongly taken from kubectl and not from root provider
     providers = {
        kubernetes = kubernetes.v1beta1
    }
}
alexsomesan commented 1 year ago

@oferca There is a small difference between your use-case and what is described in Terraform's provider inheritance pattern. That documentation states that only default providers (ones without an alias) are inherited when no providers block is present on the module call. In your case, the configuration you shared shows a provider block WITH an alias, and it is thus not subject to that rule.
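
For illustration, a minimal sketch of that rule (endpoints and module paths are illustrative): the default, un-aliased configuration is inherited by child modules automatically, while the aliased one must be mapped explicitly:

// Default configuration: inherited by child modules that declare no providers block
provider "kubernetes" {
  host = "https://default-cluster.example.com"
}

// Aliased configuration: never inherited implicitly
provider "kubernetes" {
  alias = "v1beta1"
  host  = "https://staging-cluster.example.com"
}

module "uses_default" {
  source = "./modules/k8s-module" // resources here use the default configuration
}

module "uses_alias" {
  source = "./modules/k8s-module"

  providers = {
    kubernetes = kubernetes.v1beta1 // explicit mapping is required for the alias
  }
}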

Not related to the above: the Kubernetes provider will not automatically pick up any configuration present in a kubeconfig file unless explicitly instructed to do so, either by setting the config_path attribute or the KUBE_CONFIG_PATH environment variable. Can you please confirm that your environment does not include any environment variables specific to the Kubernetes provider?
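
For reference, a minimal sketch of that explicit opt-in (path and context name are illustrative):

provider "kubernetes" {
  // The provider reads a kubeconfig only when told to, either via config_path
  // here or via the KUBE_CONFIG_PATH environment variable.
  config_path    = "~/.kube/config"
  config_context = "my-target-context"
}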

oferca commented 1 year ago

@alexsomesan Confirmed. KUBE_CONFIG_PATH was empty, and the config_path attribute does not exist in the configuration (see the initial description). I also double-checked: the host attribute pointed to an EKS cluster that was not the one the config map was applied to. So even if, for example, config_path had been set, there should have been a mismatch error with the host attribute.

There was also an invalid reference: local.eks_cluster_name simply did not exist, and neither did data.aws_eks_cluster.staging.endpoint. terraform plan should have failed with an error like "local.eks_cluster_name is not defined", but it didn't. It completed the plan and apply steps successfully, ignoring the provider configuration.

github-actions[bot] commented 1 month ago

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!