kbst / terraform-provider-kustomization

Terraform provider for Kustomize
https://www.kubestack.com
Apache License 2.0
265 stars 53 forks source link

json error during kustomizationResourceDiff #219

Open wpbeckwith opened 1 year ago

wpbeckwith commented 1 year ago

We have a terraform config like

# Other modules and terraform
... 
module "foo" {
  helm_version = "1.2.0"
...
}

module "bar" {
  # this module uses the kbst provider
  depends_on [
    module.foo.release_metdata
  ]
}

The bar module, which installs fluxcd.io, does not use the any output of the foo module, it just has a dependency on 1 output to ensure that the foo module completes 1st. Now when the terraform config is 1st created all is well. And all subsequent plans are fine too, until I update the version # in the foo module to say 1.2.1. Then the terraform plan will always fails with errors like

╷
│ Error: github.com/kbst/terraform-provider-kustomize/kustomize.kustomizationResourceDiff: github.com/kbst/terraform-provider-kustomize/kustomize.(*kManifest).load: json error: unexpected end of JSON input
│ 
│   with module.cpu_dev2_eks.module.eks_cluster_init.module.fluxcd.kustomization_resource.p0["_/Namespace/_/flux-system"],
│   on .terraform/modules/cpu_dev2_eks.eks_cluster_init.fluxcd/fluxcd/main.tf line 59, in resource "kustomization_resource" "p0":
│   59: resource "kustomization_resource" "p0" {
│ 
╵
╷
│ Error: github.com/kbst/terraform-provider-kustomize/kustomize.kustomizationResourceDiff: github.com/kbst/terraform-provider-kustomize/kustomize.(*kManifest).load: json error: unexpected end of JSON input
│ 
│   with module.cpu_dev2_eks.module.eks_cluster_init.module.fluxcd.kustomization_resource.p0["apiextensions.k8s.io/CustomResourceDefinition/_/providers.notification.toolkit.fluxcd.io"],
│   on .terraform/modules/cpu_dev2_eks.eks_cluster_init.fluxcd/fluxcd/main.tf line 59, in resource "kustomization_resource" "p0":
│   59: resource "kustomization_resource" "p0" {
│ 
╵
╷
│ Error: github.com/kbst/terraform-provider-kustomize/kustomize.kustomizationResourceDiff: github.com/kbst/terraform-provider-kustomize/kustomize.(*kManifest).load: json error: unexpected end of JSON input
│ 
│   with module.cpu_dev2_eks.module.eks_cluster_init.module.fluxcd.kustomization_resource.p0["apiextensions.k8s.io/CustomResourceDefinition/_/alerts.notification.toolkit.fluxcd.io"],
│   on .terraform/modules/cpu_dev2_eks.eks_cluster_init.fluxcd/fluxcd/main.tf line 59, in resource "kustomization_resource" "p0":
│   59: resource "kustomization_resource" "p0" {
│ 
╵
...

This continues until I set the version in the foo module back to the original 1.2.0 and then the plans will successfully complete.

pst commented 1 year ago

I need a config to reproduce the issue to look into it.

wpbeckwith commented 1 year ago

Sure, I'll see if I can slim down our current config to a reproducible case.

nazarewk commented 1 year ago

Also getting blocked by this error, I've narrowed it down to empty dm (new value for manifest) string in here: https://github.com/kbst/terraform-provider-kustomization/blob/1957b1eccaadadec5709675648a1860d67f6e56b/kustomize/resource_kustomization.go#L185-L191

nazarewk commented 1 year ago

not sure if it's way to go, have no time to dig into the root cause, but code changes in my PR fixed the issue without breaking anything (getting plan as expected)

wpbeckwith commented 1 year ago

I haven't narrowed this done to a smaller case ATM, been OOO for bereavement. However, I did discover by accident that when I tried to run a terraform plan several days later it worked. Makes me think there is some type of caching being done and after a while the cache expires and a full set of data for the diff is sent.

pst commented 1 year ago

@nazarewk what we would need to figure out is if modified being empty is an expected state in Terraform or if there is some error causing this. If there is an error that causes modified to be empty, either in the manifests themselves or somewhere else along the way, just returning nil on the diff seems like a dangerous approach.

nazarewk commented 1 year ago

I am quite sure it is caused by -> (known after apply), but need to somehow confirm it

Note I'm having a little exotic workflow:

  1. I render the kustomize without arguments to discover the object IDs
  2. I render the kustomize with patches based on resources
  3. I use pre-rendered object IDs and fully rendered manifests in a resource
code ```terraform module "prerender" { source = "../../../modules/k8s/kustomization-overlay" # can prerender, because patches won't change set of resource identifiers resources = [ "${path.module}/k8s", ] } module "source" { source = "../../../modules/k8s/kustomization-overlay" resources = [ "${path.module}/k8s", ] patches = concat(values(module.irsa.kustomize_patches), [ { target : { "kind" : "ClusterIssuer" }, patch : jsonencode([ { op = "add" path = "/spec/acme/solvers" value = [ # see https://cert-manager.io/docs/configuration/acme/dns01/route53/#creating-an-issuer-or-clusterissuer { selector = { dnsZones = [for zone_id, zone in data.aws_route53_zone.these : zone.name] } dns01 = { route53 = { region = var.aws_region } } } ] } ]) } ]) } module "kustomization-apply" { source = "../../../modules/k8s/kustomization-apply" ids_prio = module.prerender.ids_prio manifests = module.source.manifests } ```
nazarewk commented 1 year ago

actually with my "fix" it did not update the values until the next run, so while it might be caused by (known after apply) there seems to be more to it

nazarewk commented 1 year ago

I've also noticed that current version does not notice manual changes done to resources showing no diff at all.

edit: looks like it was already reported at #94

pst commented 1 year ago

I've also noticed that current version does not notice manual changes done to resources showing no diff at all.

If the change did not update the annotation yes, that's a known limitation of the provider.

nazarewk commented 1 year ago

I can confirm with 90% certainty, that the error occurs when values are -> (known after apply) - cross-checked plans for my "fixed" version and upstream

CraigStuart commented 9 months ago

I am running into this same issue. Is there a workaround I can look at?

nazarewk commented 9 months ago

not really, I used my locally built patch to work around it https://github.com/kbst/terraform-provider-kustomization/pull/221

pascal-hofmann commented 4 months ago

A workaround is to do targeted applies to resources that the broken resource depends upon. We frequently run into this issue and fixing this manually all the time is tedious. :(

pascal-hofmann commented 4 months ago

I created a repo for reproducing this issue: https://github.com/pascal-hofmann/terraform-provider-kustomization-reproduction-219

Other terraform providers treat this case like this:

    if !d.NewValueKnown("manifest") {
        // value is not known yet, so we can't diff anything.
        return nil
    }

I'm not sure how to live with the missing validation in this case, but I also wonder if CustomizeDiff is the right place for this in the first place.

Edit: Seems like it is:

    // CustomizeDiff is called after a difference (plan) has been generated
    // for the Resource and allows for customizations, such as setting values
    // not controlled by configuration, conditionally triggering resource
    // recreation, or implementing additional validation logic to abort a plan.

Source: https://pkg.go.dev/github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema