hashicorp / terraform-provider-kubernetes

Terraform Kubernetes provider
https://www.terraform.io/docs/providers/kubernetes/
Mozilla Public License 2.0

lifecycle ignore_changes do not work with "kubernetes_manifest" #1378

Open EcaterinaGr opened 3 years ago

EcaterinaGr commented 3 years ago

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v1.0.4
Kubernetes provider version: 2.4.1
Kubernetes version: 1.20

Affected Resource(s)

kubernetes_manifest

Steps to Reproduce

  1. terraform apply - to create the new resources (a hypothetical minimal configuration is sketched below)
  2. terraform apply - to verify that the provider plans no further changes
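
The original report does not include a configuration. A hypothetical minimal example of the pattern being described (an object whose controller adds finalizers after creation; the IstioOperator kind is borrowed from the comments below, and all names are illustrative):

resource "kubernetes_manifest" "istio_operator" {
  manifest = {
    apiVersion = "install.istio.io/v1alpha1"
    kind       = "IstioOperator"
    metadata = {
      name      = "istio-operator"
      namespace = "istio-system"
    }
    spec = {
      profile = "default"
    }
  }

  # After the first apply, the Istio controller adds
  # metadata.finalizers to the live object; the second plan then
  # wants to remove them, and lifecycle.ignore_changes does not
  # suppress the diff.
}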

Expected Behavior

The provider should ignore the fields added by k8s controllers, like finalizers

Actual Behavior

The provider detects changes on the resources and tries to update them (deleting the finalizers). Adding lifecycle.ignore_changes does not work for this resource.

Additional Notes

This bug is the same as the one reported in the kubernetes-alpha provider here

bittelc commented 3 years ago

To add a little more detail to the issue reported here:

The diff of resulting manifest shows something like this:

terraform plan:

-/+ resource "kubernetes_manifest" "istio_operator" {
    ~ object = {
        ~ metadata = {
            - finalizers = [
                - "istio-finalizer.install.istio.io",
              ] -> null
          }
      } # forces replacement

If the ignore_changes lifecycle field is added, like so:

 lifecycle {
   ignore_changes = [
     object
   ]
 }

the result is that the plan panics with the following output:

panic output:

Error: Provider produced invalid plan
│
│ Provider "registry.terraform.io/hashicorp/kubernetes" planned an invalid value for kubernetes_manifest.istio_operator.object:
│ ...... lots of stuff
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
villesau commented 3 years ago

I think this is actually the same issue I was describing elsewhere: https://github.com/hashicorp/terraform/issues/29443

It prevents using manifests that get finalizers added.

mplewis commented 3 years ago

I have this issue when trying to deploy a Cloud Run Anthos (Knative) app with Terraform, using the example in the docs: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloud_run_service#example-usage---cloud-run-anthos

ahmet2mir commented 3 years ago

This PR https://github.com/hashicorp/terraform-provider-kubernetes/pull/1333 adds a managed-fields option.

The field causing my issue is not important, so I could simply remove its content.

Terraform will then erase the content on each run (which is not really idempotent), but at least it "solves" the run error.

gabeio commented 2 years ago

It also seems that for kubernetes_manifest, create_before_destroy = false (which should destroy then create, something that is sometimes necessary) fails, which forces the user to manually delete, then run plan and apply.

example:

resource "kubernetes_manifest" "rate-limiting" {
  count = var.create ? 1 : 0
  manifest = {
    "apiVersion" = "configuration.konghq.com/v1"
    "kind"       = "KongPlugin"
    "metadata" = {
      "name"      = "rate-limiting"
      "namespace" = "default"
    }
    "plugin" = "rate-limiting"
  }
  lifecycle {
    create_before_destroy = false
  }
}

Edits to this resource that require it to be replaced will fail, because Terraform tries to create before destroying, which is not possible within Kubernetes (the new object would collide with the existing one by name).

jbg commented 2 years ago

@gabeio create_before_destroy = false is Terraform's standard behaviour, so specifying lifecycle { create_before_destroy = false } has no effect on any Terraform resource.

gabeio commented 2 years ago

@jbg In certain cases, like resources depending on one another, create_before_destroy set to true elsewhere does cause more resources down the line to be treated as true; but reading up on some issues, apparently create_before_destroy set to false will not override that, which explains why it didn't work. A sketch of that shape is below.
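
For illustration, Terraform propagates create_before_destroy onto the dependencies of any resource that enables it, and an explicit false does not opt back out. A minimal sketch of that shape (both resources are hypothetical):

resource "kubernetes_manifest" "dependency" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "dep"
      namespace = "default"
    }
    data = {
      example = "value"
    }
  }
  # No lifecycle block here, but because "consumer" below enables
  # create_before_destroy and depends on this resource, Terraform
  # applies create-before-destroy here too; per the behavior
  # described above, an explicit create_before_destroy = false
  # would not override that.
}

resource "kubernetes_manifest" "consumer" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "consumer"
      namespace = "default"
    }
    data = {
      example = "value"
    }
  }
  depends_on = [kubernetes_manifest.dependency]
  lifecycle {
    create_before_destroy = true
  }
}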

JamesPeiris commented 2 years ago

> This PR #1333 adds a managed-fields option.
>
> The field causing my issue is not important, so I could simply remove its content.
>
> Terraform will then erase the content on each run (which is not really idempotent), but at least it "solves" the run error.

Unfortunately for us, the field we wish to ignore is an ArgoCD application target revision SHA.

It is initially set in terraform when the ArgoCD application is created, but the field is then managed by argocd-server from that point onwards.

This allows the SHA to be updated independently of terraform (something that is a hard requirement in our scenario as we want our customers to self-serve their own application configuration changes).

Without the ignore_changes feature working, we can run the Terraform stack as many times as we want, right up until a change is made outside of Terraform to the target revision field.

After that, whenever Terraform is re-run, we always end up with the following error:

╷
│ Error: There was a field manager conflict when trying to apply the manifest for "argocd/an-argocd-application"
│ 
│   with module.argo.kubernetes_manifest.application["an-argocd-application"],
│   on ../xxx/modules/argocd/main.tf line XX, in resource "kubernetes_manifest" "application":
│   XX: resource "kubernetes_manifest" "application" {
│ 
│ The API returned the following conflict: "Apply failed with 1 conflict:
│ conflict with \"argocd-server\" using argoproj.io/v1alpha1:
│ .spec.source.targetRevision"
│ 
│ You can override this conflict by setting "force_conflicts" to true in the
│ "field_manager" block.
╵

The solution it suggests at the bottom of the error output:

│ You can override this conflict by setting "force_conflicts" to true in the
│ "field_manager" block.

Will not work for us, because the Terraform run would then reset the application's target revision back to whatever Terraform was initially configured with; that could cause production issues and disrupt our users if they have made changes to their target revision(s).

maikelvl commented 1 year ago

I got this issue as well and fixed it using the computed_fields property:

  computed_fields = [
    "spec.source.targetRevision",
  ]
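
For context, computed_fields sits directly on the resource. A minimal sketch of the shape described above (the Application manifest here is hypothetical and heavily elided):

resource "kubernetes_manifest" "application" {
  manifest = {
    apiVersion = "argoproj.io/v1alpha1"
    kind       = "Application"
    metadata = {
      name      = "an-argocd-application"
      namespace = "argocd"
    }
    spec = {
      # project, destination, etc. omitted for brevity
      source = {
        repoURL        = "https://example.com/repo.git"
        targetRevision = "HEAD"
      }
    }
  }

  # Treat the revision as computed, so changes made by
  # argocd-server are not planned as drift.
  computed_fields = [
    "spec.source.targetRevision",
  ]
}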

Not tested but this may also work with the finalizers:

  computed_fields = [
    "metadata.finalizers",
  ]

alyssaruth commented 1 year ago

We were seeing this with spec.replicas, because we had a manifest that was affected by an autoscaler.

Can confirm that @maikelvl 's workaround worked for us too, in our case:

computed_fields = ["spec.replicas"]

Kostanos commented 1 year ago

Hey, I'm trying:

  computed_fields = [
    "metadata.creationTimestamp"
  ]

It is not helping me. Any other recommendations?

AFriemann commented 8 months ago

Unfortunately, I'm seeing the same issue with a few CRDs:

data "http" "crds" {
  url = "https://raw.githubusercontent.com/external-secrets/external-secrets/helm-chart-0.9.11/deploy/crds/bundle.yaml"
}

data "kubectl_file_documents" "crds" {
  content = data.http.crds.response_body
}

resource "kubernetes_manifest" "crds" {
  for_each = data.kubectl_file_documents.crds.manifests

  manifest = yamldecode(each.value)

  computed_fields = [
    "metadata.annotations",
    "metadata.finalizers",
    "metadata.labels",
    "spec.conversion.webhook.clientConfig",
  ]
}

causes a diff on the spec.conversion.webhook.clientConfig fields, which are overwritten after creation:

          ~ spec       = {
              ~ conversion            = {
                  ~ webhook  = {
                      ~ clientConfig             = {
                          - caBundle = "..."
                          - service  = {
                              - name      = "external-secrets-webhook"
                              - namespace = "kube-secrets"
                              - path      = "/convert"
                              - port      = 443
                            }
                          - url      = null
                        } -> (known after apply)
                        # (1 unchanged element hidden)
                    }
                    # (1 unchanged element hidden)
                }
                # (5 unchanged elements hidden)
            }

causing this error on apply:

╷
│ Error: There was a field manager conflict when trying to apply the manifest for "/secretstores.external-secrets.io"
│ 
│   with module.external_secrets[0].kubernetes_manifest.crds["/apis/apiextensions.k8s.io/v1/customresourcedefinitions/secretstores.external-secrets.io"],
│   on ../../../modules/k8s/external-secrets/crd.tf line 9, in resource "kubernetes_manifest" "crds":
│    9: resource "kubernetes_manifest" "crds" {
│ 
│ The API returned the following conflict: "Apply failed with 2 conflicts:
│ conflicts with \"external-secrets\" using apiextensions.k8s.io/v1:\n-
│ .spec.conversion.webhook.clientConfig.service.name\n-
│ .spec.conversion.webhook.clientConfig.service.namespace"
│ 
│ You can override this conflict by setting "force_conflicts" to true in the
│ "field_manager" block.
╵

AFriemann commented 8 months ago

With computed_fields narrowed down to the individual field names:

  computed_fields = [
    "metadata.annotations",
    "metadata.finalizers",
    "metadata.labels",
    "spec.conversion.webhook.clientConfig.caBundle",
    "spec.conversion.webhook.clientConfig.service.name",
    "spec.conversion.webhook.clientConfig.service.namespace",
    "spec.conversion.webhook.clientConfig.service.path",
    "spec.conversion.webhook.clientConfig.service.port",
    "spec.conversion.webhook.clientConfig.url",
  ]

the plan looks like this:

          ~ spec       = {
              ~ conversion            = {
                  ~ webhook  = {
                      ~ clientConfig             = {
                          ~ service  = {
                              ~ name      = "external-secrets-webhook" -> (known after apply)
                              ~ namespace = "kube-secrets" -> (known after apply)
                              ~ path      = "/convert" -> (known after apply)
                                # (1 unchanged element hidden)
                            }
                            # (2 unchanged elements hidden)
                        }
                        # (1 unchanged element hidden)
                    }
                    # (1 unchanged element hidden)
                }
                # (5 unchanged elements hidden)
            }

still the same error:

╷
│ Error: There was a field manager conflict when trying to apply the manifest for "/clustersecretstores.external-secrets.io"
│ 
│   with module.external_secrets[0].kubernetes_manifest.crds["/apis/apiextensions.k8s.io/v1/customresourcedefinitions/clustersecretstores.external-secrets.io"],
│   on ../../../modules/k8s/external-secrets/crd.tf line 9, in resource "kubernetes_manifest" "crds":
│    9: resource "kubernetes_manifest" "crds" {
│ 
│ The API returned the following conflict: "Apply failed with 2 conflicts:
│ conflicts with \"external-secrets\" using apiextensions.k8s.io/v1:\n-
│ .spec.conversion.webhook.clientConfig.service.name\n-
│ .spec.conversion.webhook.clientConfig.service.namespace"
│ 
│ You can override this conflict by setting "force_conflicts" to true in the
│ "field_manager" block.
╵

So I guess computed_fields doesn't work whenever the field is actually defined in the manifest. I'll try ignoring changes for those next :shrug:

edit: no success on ignoring changes either, tried the following without success:

  lifecycle {
    ignore_changes = [
      object.spec.conversion.webhook.clientConfig.service.name,
      object.spec.conversion.webhook.clientConfig.service.namespace,
      object.spec.conversion.webhook.clientConfig.service.path,
    ]
  }

MagicRB commented 7 months ago

Has there been any progress on this from anyone? I'm hitting

module.kubernetes.module.generated.kubernetes_manifest.default_ValidatingWebhookConfiguration_istiod-default-validator: Modifying...
╷
│ Error: There was a field manager conflict when trying to apply the manifest for "/istiod-default-validator"
│ 
│   with module.kubernetes.module.generated.kubernetes_manifest.default_ValidatingWebhookConfiguration_istiod-default-validator,
│   on .terraform/modules/kubernetes.generated/main.tf.json line 113, in resource.kubernetes_manifest.default_ValidatingWebhookConfiguration_istiod-default-validator:
│  113:       },
│ 
│ The API returned the following conflict: "Apply failed with 1 conflict: conflict with \"pilot-discovery\" using admissionregistration.k8s.io/v1: .webhooks[name=\"validation.istio.io\"].failurePolicy"
│ 
│ You can override this conflict by setting "force_conflicts" to true in the "field_manager" block.
╵

and neither

computed_fields = [
  "webhooks[\"validation.istio.io\"].failurePolicy"
];

nor

lifecycle = {
  ignore_changes = [
    "object.webhooks[\"validation.istio.io\"].failurePolicy"
  ];
};

helps with this problem (Nix syntax, which ends up as JSON, but the content is the same).

jbg commented 7 months ago

webhooks in ValidatingWebhookConfiguration is a list, not a map, so neither of those field specs will match. You have to index lists with a number (e.g. [0]). That does make it awkward if you don't know at which position in the list the element you want to ignore sits...
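
To illustrate, assuming the "validation.istio.io" webhook happens to be the first element (the index is a guess and depends on the actual list order):

computed_fields = [
  # Lists are indexed numerically; [0] assumes the
  # "validation.istio.io" webhook is the first element.
  "webhooks[0].failurePolicy",
]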

MagicRB commented 7 months ago

Oh yeah, I forgot to test that. I did try [*], which didn't make Terraform happy... Interesting that in the error it seems to know which key each element corresponds to by the name attribute.

jbg commented 7 months ago

That error message is passed through from the k8s API server, which knows to call out the element by its name field because of the patch strategy for the list (which is "merge on key name").

alexsomesan commented 7 months ago

The output snippets I'm seeing reported here point to an API-side error:

The API returned the following conflict: "Apply failed with [X] conflicts:
...

The API is signalling that there was an ownership conflict on some attributes of the resource: both Terraform and some other actor on the cluster are trying to change the value of an attribute, leading to ambiguity about which is the authoritative source of truth for that value.

This type of error isn't resolvable using either the computed_fields or the lifecycle / ignore_changes mechanisms, as the conflict occurs on the API server and not within the provider or Terraform. The error message does offer a suggestion at the very bottom: use the force_conflicts option (described here).

It helps to keep in mind that the API server tracks "ownership" of attributes as different clients attempt to make changes to them. The mechanism is described in detail here.

Without knowing exactly what other process is trying to make changes to that resource, it's hard to offer a more pinpointed explanation, but I do hope this gets everyone on the right track.
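
For reference, that override is set on the resource itself. A minimal sketch (the ConfigMap manifest is a hypothetical stand-in; whether forcing is appropriate depends on which actor should be authoritative, per the caveats above):

resource "kubernetes_manifest" "example" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "example"
      namespace = "default"
    }
    data = {
      key = "value"
    }
  }

  field_manager {
    # Let Terraform's apply win ownership conflicts; this re-asserts
    # Terraform's configured values over those set by the other
    # field manager.
    force_conflicts = true
  }
}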

MagicRB commented 7 months ago

I had two errors related to this. One was something in websockets[*].failuredMode; that one went away when I used computed_fields. The other was metadata.createTimestamp or something like that; that was solved by filtering out that specific attribute prior to passing it to Terraform, using Nix. (Sorry for the slightly vague attribute paths, I'm on my phone.)

DavidGamba commented 7 months ago

@MagicRB the splat syntax (websockets[*]) didn't work for me using computed_fields. Do you mind providing some example code?

MagicRB commented 7 months ago

Splat won't work; you need to hard-code the list index where the offending attribute occurs. I'll share the exact code once I get to my laptop. (Again, it'll be Nix syntax, but should be readable nonetheless.)

llanza7 commented 6 months ago

Hey @MagicRB, could you share that Nix code to filter metadata.createTimestamp? I'm facing the same issue and have never used Nix before, so it really might help.

MagicRB commented 5 months ago

@llanza7 I fear that if you've never used Nix then it's not the right tool for this. I'd use jq to delete just that one attribute, but still:

  # Assumes filterAttrs, const and optionalAttrs from the nixpkgs lib are in scope.
  sanitizeKubernetesManifest = manifest:
    # Drop the server-managed top-level "status" attribute...
    (filterAttrs (n: const (n != "status")) manifest)
    # ...and, if the manifest has metadata, strip creationTimestamp from it.
    // (optionalAttrs (manifest ? "metadata") {
      metadata = filterAttrs (n: _: n != "creationTimestamp") manifest.metadata;
    });

This assumes you have the k8s manifest as a Nix expression; if not, then

del(.metadata.creationTimestamp) | del(.status)

does the same thing using jq.
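
If you would rather do the same filtering in Terraform itself, a for-expression sketch along these lines should be equivalent (assuming the manifest is decoded from a local YAML file and has a metadata block; all names here are hypothetical):

locals {
  raw = yamldecode(file("${path.module}/manifest.yaml"))

  # Mirror the jq filter above: drop top-level "status" and strip
  # "creationTimestamp" out of metadata.
  sanitized = merge(
    { for k, v in local.raw : k => v if k != "status" && k != "metadata" },
    {
      metadata = { for k, v in local.raw.metadata : k => v if k != "creationTimestamp" }
    }
  )
}

resource "kubernetes_manifest" "example" {
  manifest = local.sanitized
}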

MagicRB commented 5 months ago

While that code works, I haven't figured out a way to fix this for metadata.finalizers, sadly.