hashicorp / terraform-provider-helm

Terraform Helm provider
https://www.terraform.io/docs/providers/helm/
Mozilla Public License 2.0

Failed helm_release parameters do not get applied but also cannot be replanned #1366

Open davisowb opened 1 month ago

davisowb commented 1 month ago

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 1.4.6
Provider version: 2.13.2
Kubernetes version: 1.29

Affected Resource(s)

helm_release

Terraform Configuration Files

...
resource helm_release datadog {
  for_each = var.enable_datadog_agent ? { create = true } : {}

  name       = "datadog"
  repository = "https://helm.datadoghq.com"
  chart      = "datadog"
  version    = var.datadog_agent_chart_version  # 3.62.0
  namespace  = kubernetes_namespace.this.metadata[0].name

  set {
    name  = "registry"
    value = "public.ecr.aws/datadog"
  }

  set {
    name  = "datadog.apiKeyExistingSecret"
    value = kubernetes_secret.datadog_keys["create"].metadata[0].name
  }

  set {
    name  = "datadog.appKeyExistingSecret"
    value = kubernetes_secret.datadog_keys["create"].metadata[0].name
  }

  set {
    name  = "clusterAgent.enabled"
    value = true
  }

  set {
    name  = "clusterAgent.metricsProvider.enabled"
    value = true
  }

  set {
    name  = "datadog.logs.enabled"
    value = var.datadog_agent_log_collection_enabled
  }

  set {
    name  = "datadog.logs.containerCollectAll"
    value = true
  }

  set {
    name  = "datadog.processAgent.enabled"
    value = var.datadog_agent_process_collection_enabled
  }

  set {
    name  = "datadog.processAgent.processCollection"
    value = true
  }

  set {
    name  = "datadog.collectEvents"
    value = var.datadog_agent_event_collection_enabled
  }

  set {
    name  = "datadog.networkMonitoring.enabled"
    value = var.datadog_agent_network_collection_enabled
  }

  set {
    name  = "datadog.apm.socketEnabled"
    value = var.datadog_agent_apm_enabled
  }

  set {
    name  = "datadog.apm.portEnabled"
    value = true
  }

  set {
    name  = "datadog.kubeStateMetricsCore.enabled"
    value = true
  }

  dynamic set {
    for_each = local.datadog_pod_tag_mapping

    content {
      name  = "datadog.kubeStateMetricsCore.labelsAsTags.${set.key}"
      value = set.value
    }
  }

  dynamic set {
    for_each = local.datadog_pod_tag_mapping

    content {
      name  = "datadog.podLabelsAsTags.${set.key}"
      value = set.value
    }
  }

  dynamic set {
    for_each = local.datadog_cluster_tags

    content {
      name  = "datadog.tags[${set.key}]"
      value = set.value
    }
  }
}

Steps to Reproduce

  1. Change or add one of the set parameters
  2. terraform plan -out=terraform.tfplan
  3. Wait long enough for the cluster auth token to expire (2h30 is enough).
  4. Apply the saved plan (terraform apply --auto-approve terraform.tfplan).
  5. The apply fails with Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
  6. Run the plan again: terraform plan -out=terraform.tfplan
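
As an illustration of step 1, the change can be as small as one extra set block inside the helm_release resource shown above. The parameter below is only an example and is not part of the original configuration; it assumes that datadog.logLevel is a value supported by the chart version in use.

```hcl
  # Hypothetical change for step 1: add one more chart value to the
  # existing helm_release "datadog" resource (placed alongside the
  # other set blocks, before the closing brace of the resource).
  set {
    name  = "datadog.logLevel" # assumed chart value, not from the report
    value = "DEBUG"
  }
```

Any other added or modified set block should serve equally well, since the report does not tie the behavior to a specific parameter.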

Expected Behavior

The second plan should show the same parameter changes to the helm_release resource as the first plan did.

Actual Behavior

  1. Applying the first plan fails with:

    Acquiring state lock. This may take a few moments...
    module.eks_kubernetes_config.helm_release.datadog["create"]: Modifying... [id=datadog]
    ╷
    │ Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
    │ 
    │   with module.eks_kubernetes_config.helm_release.datadog["create"],
    │   on .terraform/modules/eks_kubernetes_config/terraform/compute/eks_kubernetes_config/main.datadog.tf line 39, in resource "helm_release" "datadog":
    │   39: resource helm_release datadog {
    │ 
    ╵
    ##[error]Terraform command 'apply' failed with exit code '1'.
  2. Checking the cluster shows that the planned changes were not applied (as expected, given the auth failure).

  3. The second plan shows no changes. It is as if the changes were written to state even though they were never actually applied to the cluster.
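
For context on the token expiry that triggers the failure: a common way to avoid relying on a short-lived token between plan and apply is to let the provider obtain credentials through an exec plugin. The sketch below assumes the cluster is an EKS cluster (suggested by the module name, but not stated in the report) and that the AWS CLI is available wherever Terraform runs. var.cluster_name and data.aws_eks_cluster.this are hypothetical names introduced for illustration. This only targets the expired-credentials error and does not address the state behavior described above.

```hcl
# Hypothetical names: var.cluster_name and data.aws_eks_cluster.this are
# illustrative and do not appear in the original report.
variable "cluster_name" {
  type = string
}

data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

# Helm provider 2.x syntax (the report uses 2.13.2).
provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)

    # Ask the AWS CLI for a token when the provider connects, instead of
    # passing in a token that was read earlier and may have expired.
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
    }
  }
}
```

Because the resource in this report lives inside a module, a provider block like this would normally sit in the root configuration that calls the module.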

arybolovlev commented 1 month ago

Hi @davisowb,

I was able to reproduce this issue. It looks like this is how Terraform behaves when a plan is available and the update operation fails for whatever reason. Please allow me to discuss this with my team, and I will come back to you with more details if I have any.

Thanks!