hashicorp / terraform-provider-helm

Terraform Helm provider
https://www.terraform.io/docs/providers/helm/
Mozilla Public License 2.0
998 stars 368 forks source link

manifest experiment breaks the provider with sensitive values unknown at plan time: #829

Open thomas-riccardi opened 2 years ago

thomas-riccardi commented 2 years ago

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: v1.1.4
Provider version: v2.4.1
Kubernetes version: v1.21.6-gke.1500

Affected Resource(s)

Terraform Configuration Files

provider "helm" {
  experiments {
    manifest = true
  }
  ...
}
provider "datadog" {}

resource "datadog_api_key" "gke" {
  name = "tf-gke"
}
resource "datadog_application_key" "gke" {
  name = "tf-gke"
}

resource "helm_release" "datadog_agent" {
  name       = "datadog-agent"
  repository = "https://helm.datadoghq.com"
  chart      = "datadog"
  version    = "2.30.5"

  namespace = "kube-system"

  set {
    name  = "datadog.apiKey"
    value = datadog_api_key.gke.key
  }

  set {
    name  = "datadog.appKey"
    value = datadog_application_key.gke.key
  }
}

Steps to Reproduce

  1. terraform apply

Expected Behavior

helm_release values are unknown at plan time, then are generated during apply, then the helm release is created.

Actual Behavior

When expanding the plan for helm_release.datadog_agent to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/helm" produced an invalid new value for .manifest: was cty.StringVal(...) but now cty.StringVal(...) This is a bug in the provider, which should be reported in the provider's own issue tracker.


  - In the `StringVal` there is the full manifest serialized in json. 
  - The `but now` json still contains ~`(sensitive value 54b32d658494ded3)`, so it's still not the real value that is correctly sent to the kubernetes cluster
  - The diff between `was` and `but now` is `(sensitive value xxx)` (and derived)

- a second apply works: the values are already known at plan time and don't change.

### Important Factoids
- I tried several combinations of `set`, `set_sensitive`, `nonsensitive()`; it didn't change anything: the plan-time value was ~`(sensitive value 54b32d658494ded3)`
- removing the `manifest` experiment fixes everything (but it was a useful feature to understand what really changes in our plan)
- no need for a datadog account:
  - it can probably be reproduced with a random string resource as helm_release value
  - the datadog helm chart takes the apiKey value and puts it in a k8s Secret (as well as compute a hash of the Secret yaml to put in another k8s resource annotation, but it should not matter)

### References
- GH-711 although the initial comment doesn't talk about `manifest`; just a subsequent comment (

### Community Note
<!--- Please keep this note for the community --->
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the community and maintainers prioritize this request
* If you are interested in working on this issue or have submitted a pull request, please leave a comment
mmerickel commented 2 years ago

I'm experiencing this with the aws-load-balancer-controller. It dynamically generates a tls keypair for a webhook and stores it as a secret. Later applies then use the helm lookup function to query the existing value from the cluster and reuse it. Something about that interaction is failing here such that everywhere that the tls cert is used in the chart is showing as changed on every apply and then failing with the above error "when expanding the plan ...".

andremarianiello commented 2 years ago

This issue seems to be caused by the use of a mock lookup function https://github.com/hashicorp/terraform-provider-helm/blob/1f32cc8313baa58a076bd77b6864a1271ae01438/vendor/helm.sh/helm/v3/pkg/engine/funcs.go#L67-L69

nitrocode commented 1 year ago

@andremarianiello Hmm how can the lookup function be modified to avoid this error ? How do other providers do a lookup for sensitive values?

github-actions[bot] commented 6 months ago

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

nitrocode commented 6 months ago

Unstale

tadeha commented 3 months ago

Any update on this?