Provider produced an invalid new value for .diff_output

mnatan commented 4 years ago

terraform-provider-helmfile version: v0.3.14 helmfile version: helmfile-0.125.5.high_sierra.bottle.tar.gz

Overview

Quite often we can't create larger helm charts due to inconsistent diff_output being generated by plan and apply.

When comparing the generated diffs, it usually comes down to the order of manifests being presented. When deploying a chart to a fresh environment with more than 15 manifests, it becomes impossible to deploy, as it fails every time.

Additionally, the issue gets worse when helmfile config contains multiple releases. The order of these releases tends to vary more often in the diff output.

Example output

When expanding the plan for module.jenkins-v2.helmfile_release_set.helmfile_release to include 
new values learned so far during apply, provider "registry.terraform.io/mumoshu/helmfile" produced 
an invalid new value for .diff_output: was cty.StringVal(...) but now cty.StringVal(...).

Already discussed here

resource_release_set.go#L518
Similar issue: https://github.com/mumoshu/terraform-provider-helmfile/issues/22

Possible solutions

Maybe it is possible to ignore the diff generated during apply, and just trick terraform into thinking the two stages produce the same output?
It seems like you have an issue with terraform apply running the diff part twice. How about caching the first run in the module's global variable, and then later returning the same result?

mumoshu commented 4 years ago

@mnatan Thanks for reporting!

I thought this has been fixed by https://github.com/mumoshu/terraform-provider-helmfile/commit/2460c9affc90b84b82c532be2700ad87691f32ce#diff-769700fba47e9257c615acae835ad9d3 but apparently I need to take a deeper look.

Regarding the possible solution proposed by you - thanks anyway - but I think that's what the provider does today. In nutshell it uses a sha256sum computed from the output of helmfile template as the cache key so that as long as the template output doesn't change, the same helmfile diff output is used for diff_output.

It was originally using helmfile build output for computing the cache key. But we moved to helmfile template because helmfile build was unable to detect changes to values.yaml files referenced from the helmfile.yaml.

mumoshu commented 4 years ago

Just curious, but is your helmfile template output stable? Probably running it a few times and taking diff --unified between outputs could help.

mnatan commented 4 years ago

Many thanks @mumoshu for good explanation. Because of that, we were able to locate why the cache you implemented was not working for us.

We are using Jenkins helm chart, and it randomly generates a pod name for testing. See here.

As you expected, it causes the helmfile template to always give a different result.

We disabled this test for now and everything works perfectly.

Before we close this issue, shall we handle this case somehow? I am thinking about running the helmfile template twice and comparing the outputs. If the user provides an unstable template maybe we should log a reasonable error message?

mumoshu commented 4 years ago

@mnatan Thank you so much for confirming! Glad to hear it worked.

Before we close this issue, shall we handle this case somehow?

I believe so - Random pod names in hooks sounds perfectly valid use-cases to me.

I have two options in my mind, but not sure which one is best:

Enhance helmfile build so that it reads, merges, and embeds all the values.yaml files referenced from the helmfile.yaml into the helmfile build output, so that taking the hash value should be as unique ashelmfile template`.
Enhance helmfile template to optionally skip hooks and use that, so that the hash value won't change due to random pod names in hooks.

I slightly prefer the latter. But I'm not exactly if that's enough. Can we safely say that only K8s resources created by hooks contain random values?

mnatan commented 4 years ago

Enhance helmfile build so that it reads, merges, and embeds all the values.yaml files referenced from the helmfile.yaml into the helmfile build output, so that taking the hash value should be as unique as helmfile template`.

How does that fix the issue? If we generate a random pod name, enhanced helmfile build will still be different every time.

Enhance helmfile template to optionally skip hooks and use that, so that the hash value won't change due to random pod names in hooks.

Can we safely say that only K8s resources created by hooks contain random values?

I do not think so. Any random value will cause this issue and I think people could easily use it outside hooks as well.

I think we can either:

Admit that the current implementation does not work with random values and warn the user when we spot such situation
Try to support randomized charts by somehow comparing two different helmfile template runs and trying to ignore the randomized part for the purpose of the hash generation.

mumoshu commented 4 years ago

@mnatan Thanks for confirming!

If we generate a random pod name, enhanced helmfile build will still be different every time.

No. helmfile build is for generating a "flattened"` helmfile.yaml that after rendering go template. After the enhancement proposed above it would also contain values.yaml contents, too. The output is not K8s manifests but can be thought of a collection of release names, chart names, and values to produce K8s manifests. So they don't contain random values generated at the chart rendering time(=helm template). That's why I thought it would solve the issue.

mnatan commented 4 years ago

Oh, I see.

This is the best idea in my opinion then. 👍

mumoshu commented 4 years ago

The progress: https://github.com/roboll/helmfile/pull/1436

mumoshu commented 4 years ago

@mnatan Thanks again for reviewing the helmfile pr!

I just released v0.4.0 for this. With helmfile v0.126.0 or greater installed, the provider now uses helmfile build --embed-values as the stable and reliable desired state for unique id calculation.

Would you mind trying it?

mnatan commented 4 years ago

Hi @mumoshu, I'll try this out this week.

I assume v0.4.0 breaks with older versions of helmfile? Maybe the provider should check the version of helmfile and warn about this incompatibility?

mumoshu commented 4 years ago

@mnatan Thanks! The provider does check the helmfile version to determine if it should use helmfile build --embed-values or not.

mumoshu commented 3 years ago

https://github.com/mumoshu/terraform-provider-helmfile/commit/389eee7d34b0bd57b8f5b361d98ef7e78822ca33 is the final solution for this.

Before this commit, the provider was still emitting this error when you used hemfile's kustomize integration or values.yaml.tmpl, which uses a random directory to store temporary charts and values files generated by helmfile, whose random files paths are shown in diff_output which breaks terraform.

mumoshu commented 3 years ago

Note that the above enhancement requires the provider v0.12.0 or greater and helmfile v0.136.0 or greater that includes https://github.com/roboll/helmfile/pull/1622

yashbhutwala commented 3 years ago

Similar to #56

pwillie commented 2 years ago

Hi @mumoshu, firstly thanks for the great work. We are running 0.14.1 provider with v0.142.0 but still seeing the same issue ie .diff_output: was cty.StringVal(""), but now cty.StringVal("..."). Is there a workaround by any chance?

mumoshu / terraform-provider-helmfile