Resource block plan detail is hidden due to sensitive values since 1.4

sjparkinson commented 1 year ago

Terraform Version

% terraform version
Terraform v1.4.6
on darwin_arm64
+ provider registry.terraform.io/fastly/fastly v3.2.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/signalsciences/sigsci v1.2.14

Terraform Configuration Files

resource "fastly_service_vcl" "this" {
  name           = var.name
  stale_if_error = true
  http3          = true
  default_ttl    = var.default_ttl

  product_enablement {
    brotli_compression = false
    domain_inspector   = false
    image_optimizer    = false
    origin_inspector   = false
    websockets         = false
  }

  # This is a hack to prevent the default Fastly VCL from setting the request's backend
  # after the logic in our "Select Origin" snippet.
  condition {
    name      = "Origin Request Condition"
    statement = "false"
    type      = "REQUEST"
  }

  dynamic "backend" {
    for_each = [for origin in var.origins : {
      name              = format("origin_%s", substr(lower(origin.region), 0, 2)),
      address           = origin.host,
      port              = origin.port,
      ssl_cert_hostname = origin.tls_cert_hostname,
      ssl_sni_hostname  = origin.tls_sni_hostname,
      override_host     = origin.host_header != null ? origin.host_header : origin.host,
      healthcheck       = format("health_check_%s", substr(lower(origin.region), 0, 2)),
      shield            = origin.shield,
    }]

    content {
      name              = backend.value.name
      address           = backend.value.address
      override_host     = backend.value.override_host
      port              = backend.value.port
      ssl_check_cert    = true
      ssl_cert_hostname = backend.value.ssl_cert_hostname != null ? backend.value.ssl_cert_hostname : backend.value.override_host
      ssl_sni_hostname  = backend.value.ssl_sni_hostname != null ? backend.value.ssl_sni_hostname : backend.value.override_host
      use_ssl           = true
      healthcheck       = backend.value.healthcheck
      request_condition = "Origin Request Condition"
      shield            = backend.value.shield
    }
  }

  dynamic "healthcheck" {
    for_each = [for origin in var.origins : {
      name           = format("health_check_%s", substr(lower(origin.region), 0, 2)),
      host           = origin.host_header != null ? origin.host_header : origin.host,
      path           = origin.health_check_path,
      timeout        = origin.health_check_timeout,
      check_interval = origin.health_check_interval,
      method         = origin.health_check_method,
    }]

    content {
      host           = healthcheck.value.host
      name           = healthcheck.value.name
      path           = healthcheck.value.path
      timeout        = healthcheck.value.timeout
      check_interval = healthcheck.value.check_interval
      method         = healthcheck.value.method
    }
  }

  dynamic "domain" {
    for_each = var.domains

    content {
      name = domain.value
    }
  }

Debug Output

N/A

Expected Behavior

% terraform plan

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.example.fastly_service_vcl.this will be created
  + resource "fastly_service_vcl" "this" {
      + activate           = true
      + active_version     = (known after apply)
      + cloned_version     = (known after apply)
      + default_ttl        = 0
      + force_refresh      = (known after apply)
      + http3              = true
      + id                 = (known after apply)
      + imported           = (known after apply)
      + name               = "example.com"
      + stale_if_error     = true
      + stale_if_error_ttl = 43200

      + backend {
          + address               = "example.com"
          + auto_loadbalance      = false
          + between_bytes_timeout = 10000
          + connect_timeout       = 1000
          + error_threshold       = 0
          + first_byte_timeout    = 15000
          + healthcheck           = "health_check_eu"
          + max_conn              = 200
          + name                  = "origin_eu"
          + override_host         = "example.com"
          + port                  = 443
          + request_condition     = "Origin Request Condition"
          + ssl_cert_hostname     = "example.com"
          + ssl_check_cert        = true
          + ssl_sni_hostname      = "example.com"
          + use_ssl               = true
          + weight                = 100
        }

Actual Behavior

% terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.example.fastly_service_vcl.this will be created
  + resource "fastly_service_vcl" "this" {
      + activate           = true
      + active_version     = (known after apply)
      + cloned_version     = (known after apply)
      + default_ttl        = 0
      + force_refresh      = (known after apply)
      + http3              = true
      + id                 = (known after apply)
      + imported           = (known after apply)
      + name               = "example.com"
      + stale_if_error     = true
      + stale_if_error_ttl = 43200

      + backend {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }

Steps to Reproduce

terraform init
terraform plan

Additional Context

The expected behaviour is what we get from Terraform 1.3.9. Looks like this relates to the changes around the plan output in 1.4.

Only other difference besides version I can think of is that we set TF_IN_AUTOMATION=1 and TF_INPUT=0 in CI.

References

No response

apparentlymart commented 1 year ago

Hi @sjparkinson! Thanks for reporting this.

Can you confirm whether it's correct that there is at least one sensitive value in an argument in one of your blocks here? I'm trying to determine whether the problem you are reporting is that Terraform incorrectly reported something as sensitive when it isn't or whether the problem is that Terraform now correctly detects that something is sensitive but you'd prefer that it didn't.

Thanks!

jbardin commented 1 year ago

Taking a look at this provider's schema, the block does contain some attributes marked sensitive. I think the prior version of the renderer would only hide individual sensitive attributes, while now we are currently hiding the entire block if any attributes are sensitive.

Contrary to what the text states, the old renderer looks like it would only trigger this behavior if the entire block were marked sensitive, which is not actually possible via the schema. The new renderer adopted the same concept, but follows through this time.

sjparkinson commented 1 year ago

I think the prior version of the renderer would only hide individual sensitive attributes, while now we are currently hiding the entire block if any attributes are sensitive.

This is our experience of it.

Can you confirm whether it's correct that there is at least one sensitive value in an argument in one of your blocks here?

Of 23 attributes in the block 2 are marked as sensitive. In this instance we are not using either of those 2 attributes.

Unfortunately we would lose a lot of detail in the plan output in CI such as a change to the backend address if we were to upgrade Terraform.

apparentlymart commented 1 year ago

Thanks for the additional information, @sjparkinson!

I think this one is going to require a product management call to decide, since both the old and the new behavior cause some different problems each.

Technically the old behavior was buggy (in the sense of not behaving as designed) because it was ignoring the dynamic sensitive marking on nested blocks and relying entirely on the static sensitivity source information in the provider schema. However, it was a "known bug" in the sense that we'd become aware of it but hadn't explicitly attempted to solve it because (as we see here) it was questionable whether the intended behavior was better than the buggy behavior.

This new behavior has fixed the bug (in the sense of Terraform now behaving as designed): when a value representing a nested block is dynamically marked as sensitive, the UI renderer respects that by showing nothing inside it. However, because dynamic sensitivity tracking is inexact, in some cases like this example Terraform reaches an overly-conservative conclusion and classifies the whole block as sensitive just because a couple of attributes inside it are.

I suspect, but have not yet confirmed, that this problem is being exacerbated by backend being declared in the provider schema as being represented as a set of objects. Because the set data structure coalesces equal elements together into a single element, that coalescing behavior can potentially imply a sensitive value even without directly disclosing it, and so to be safe Terraform just considers a set to be entirely sensitive if anything inside it is sensitive. If that's true, we may not have a clear technical compromise here and so may need to decide between the old behavior that risks exposing dynamically-sensitive values in some cases vs. this new behavior which is overly-conservative but will never disclose nested dynamically-sensitive values.

jbardin commented 1 year ago

@apparentlymart, just to make sure I was clear in my summary above, the old behavior was hiding the entire block if it were dynamically marked sensitive (there are tests for that), but that's not something which is easily done (is it even possible in configuration, maybe via dynamic with a sensitive for_each?). The block values were not themselves marked as sensitive, and only contained some null sensitive attributes. The new renderer takes the literal behavior as described in the wording of the message, At least one attributes in this block is (or was) sensitive, so its contents will not be displayed and hid the entire block.

This original behavior as described in the output message never actually worked, even looking back to the very first commit. I think if there were any cases which could trigger the output, they were because of incorrect mark handling which was remedied later. Because there was effectively no way to get the output outside of tests once marks were working correctly, the text was never seen or questioned again.

With the infectious handling of marks, I think any case where the entirety of a block should be hidden because it's marked, it will be marked and hidden. I don't think there is any reason to hide an entire block from viewing only because it's schema contains a sensitive attribute. This shouldn't be different from a list or set attribute with nested structural elements that contain sensitive attributes (which is the exact same cty value underneath)-- we will show the entire structure except for the sensitive values.

The one case where I think this type of condensation could be useful is if every attribute were sensitive, but even that might be questionable since users may want to know which is and isn't set.

apparentlymart commented 1 year ago

The set situation is different because if anything in a set is marked as sensitive then the whole set just becomes sensitive, for the coalescing reason I mentioned. That is the situation where I'd expect the whole set representing all of the blocks of a given type to end up being marked as sensitive.

However, if that's not what's going on here then I retract that. It does still seem a little unclear to me what exactly has changed here, but it seems like you already dug into the details here and have an idea of what to do, do I'll duck out and let you do what needs to be done. 😀

jbardin commented 1 year ago

Oh, thank you! I missed that you confirmed that the backend is a set data structure, and was thinking of it as a list specifically because it was not being marked before. The discrepancy here is actually stemming from the fact values marked as sensitive in the schema were not being encoded as such, and only dealt with during evaluation. We only recently resolved this, which is why the original checks were never triggered (even if we keep this, the wording should probably be updated since it's only a set block type which could be hidden because of a sensitive nested attribute)

So in fact this may not be solely a rendering change, but rather a fix to the handling of sensitive values in general. I agree then with your assessment above, if sets are marked as a result of their elements then the current output is expected, and we'll have to determine if that is reasonable going forward.

jbardin commented 1 year ago

Just to record the rest of the testcase data I had open:

structural list attributes are not hidden due to sensitive nested attributes
nested list blocks are not hidden due to sensitive attributes
structural set attributes are rendered identically to blocks

So the current status is that everything appears to be working as intended, although that is a change from the previous unintended behavior.

Integralist commented 1 year ago

👋🏻

One of our support engineers just stumbled into this same issue.

They created a 'shell' of a Terraform file and then ran a terraform import of a resource and were then unable to use terraform show -no-color > service.tf to generate the rest of the Terraform because the backend (which is a TypeSet) no longer showed the rest of the attributes. He was using version 1.4.6 of Terraform and said this was never an issue before.

Considering nearly everything in our provider is a TypeSet this could be a bit of a pain for our users if they happen to be using a single sensitive value field and then can't properly import without manual steps (or in the case of @sjparkinson) they can't see a complete view of their diffs 🤔

EDIT: Interestingly, the only two sensitive fields in a backend resource are these and neither was used by this particular user so even if you don't use the sensitive fields you suffer from this 'correction' from what I can tell 😬

Integralist commented 1 year ago

Oh wow, yeah this is an issue 😄

I defined the following block and then applied it using Terraform 1.3.9:

  backend {
    address = "127.0.0.1"
    name    = "localhost"
    port    = 80
  }

I then upgraded to Terrform 1.4.0 and found when making the following change...

  backend {
    address = "127.0.0.1"
    name    = "localhost"
    port    = 80
+   use_ssl = true
  }

I can no longer see that change because the backend block happens to have two 'optional' attributes that are sensitive...

If I had actually set the sensitive attribute, then yes that should be omitted, but not the entire block surely (and on top of that I've not actually defined the sensitive value).

If I switch back to Terraform 1.3.9 then the diff is as I would expect...

What is the proposed solution here for us, is it to rewrite all affected blocks that contain Sensitive to be a different type? That would make our exposed interface quite inconsistent. Would be interested to know more about how we should progress.

Integralist commented 1 year ago

I see this was removed from the 1.5 milestone, is there any update on this?

apparentlymart commented 1 year ago

It isn't clear how to improve this. The old behavior was just ignoring dynamic sensitivity altogether and thus potentially disclosing sensitive values assigned to arguments that aren't marked as sensitive in the provider schema.

But dynamic sensitivity tracking is imprecise in the presence of sets because the coalescing behavior of sets (two equal elements get coalesced into one) means that a sensitive value being coalesced with a nonsensitive one would effectively disclose the sensitive value by implication.

I suspect that to resolve this we need to find a compromise with some custom logic that lives in the CLI UI layer. This would mean designing some heuristic to decide whether it's safe to show a diff or not, but it's unclear how to define that heuristic since set elements have no identity beyond their own value, and so we can't do logic like we do for lists and maps where we correlated the nested attributes using the element key or index to determine whether a change directly affects a sensitive value or not.

Unfortunately we cannot return to the original behavior in the meantime because the original behavior can expose sensitive values, and that security problem must take priority.

We might find that there isn't actually any better compromise here and that this is just an inherent limitation of using a set type as the representation of a particular block type, which means that the individual blocks have no identity separate from their content. We've already recommended that new providers move away from using block types and use structural argument types instead, in part because it allows using a map of objects instead of a set of objects and therefore allows correlating by map keys when diffing. It remains unclear how existing providers can transition to the new convention without breaking existing modules though, so if this does end up being the conclusion then the alternative outcome here should be to define a migration path to allow explicit deprecation of set-backed block types in favor of maps of objects.

PaulRudin commented 1 year ago

ATM terraform plan is almost unusable with some providers where things are marked as sensitive - you simply have no idea what changes would happen with the plan. I get that you can dig through the json, but that's a time consuming process (perhaps someone has some neat jq magic to render the json in a way that's easy to make sense of?).

The obvious workaround is just to downgrade to a really old version of terraform, but that doesn't seem very satisfactory either.

Integralist commented 1 year ago

I agree with @PaulRudin in that the only viable solution for most users is to downgrade to an earlier version of Terraform, as 1.4+ makes the experience very difficult otherwise.

We have had so many reports from users complaining they can't see their diffs and that they don't know or trust what will happen to their infrastructure. Hence they then have to downgrade Terraform to have more confidence, while we (the provider) scramble to migrate a very large provider to the new framework and thus away from using TypeSets.

PaulRudin commented 11 months ago

Quick and dirty script that at least shows something - suggestions welcome!

#!/usr/bin/env bash

PLANFILE=$(mktemp /tmp/plan.XXX.tfplan)
trap "rm -f ${PLANFILE}" 0 2 3 15
JSONFILE=$(mktemp /tmp/plan.XXX.json)
trap "rm -f ${JSONFILE}" 0 2 3 15
terraform plan -out=${PLANFILE}
terraform show -json ${PLANFILE} > ${JSONFILE}
readarray -t changes < <(jq --compact-output '.resource_changes[]' ${JSONFILE})
for i in "${changes[@]}"; do
    orig=$(jq --raw-output --sort-keys '.change.before|walk(if type=="array" then sort else . end)' <<< "$i")
    after=$(jq --raw-output --sort-keys '.change.after|walk(if type=="array" then sort else . end)' <<< "$i")
    sdiff <(echo "${orig}") <(echo "${after}") | colordiff | less -R
done

petr-motejlek commented 9 months ago

Hi.

I would like to express my utter disappointment with the fact that currently, the only way to actually see some kind of a diff of these values is thru JSON parsing and hoop jumping. Combined with the fact that more and more values/attributes are being marked sensitive by more and more providers and modules (usually just as a default, with no regard for the actual value I am passing in).

In general, I am OK with Terraform trying to hide what a provider developer marked sensitive (or I have marked sensitive -- we now have support to do that). I am NOT OK with the fact that unhiding it, is so damn hard (yes, I can mark a value nonsensitive, but in the cases, where a provider/module has a certain attribute marked sensitive, I am out of luck, because there is literally no way for me to override the sensitivity of their attribute).

Sure, make it a default of not showing it, but give me a straightforward builtin way of seeing the diff. Especially for cases where a provider developer decided that a good default is to mark a particular value sensitive (which is the case with digitalocean's app platform, which, while being a good default (especially considering they probably had it marked sensitive way before sensitive() and nonsensitive() were even a thing, makes it even worse, because out of 20 env vars, 1 (or none) might be the one with a a key or an actual secret, but Terraform will not show me even a single one in the diff, because they are all "sensitive", and I have no syntax or flag or anything, beyond JSON parsing, to get at the changed value).

Why can't we get a command line flag for terraform show to make it display sensitive values? (If we only hid the actual "sensitive" values, and not everything that is marked sesnsitive as some kind of default, and might not really contain any sensitive data at that point, I'd probably be fine with this. But the reality is that so many values in my Terraform code are considered sensitive by default and I have no ability to override that, so I am basically being hurt by somebody else's decision).

If a provider (such as digitalocean) marks certain attribute sensitive by default, when I supply a non-sensitive value there, it still gets marked sensitive, and I have no way of overriding them. Why, if I pass a nonsensitive value in there, is it marked sensitve, just because the provider said so, and I cannot override that?

To illustrate

resource digitalocean_app this {
  spec {
    worker {
      dynamic env {
        for_each = {...}  # this can literally contain only nonsensitive values and still, every single env block acts as if they are all sensitive
        content {
          key = env.key
          value = env.value # the value attribute is marked sensitive in their spec, and Terraform doesn't care that the actual value I supplied is nonsensitive, and converts it into sensitive, and does not give me any way of changing that, so the plan ends up not really being useful at all, and I gotta parse the JSON, just to see that I changed a stupid env var that tells my app how many threads to spawn...
          ...
        }
      }
    }
  }
}

I know you could say that DigitalOcean should remove the default sensitive mark from their code base, as people should be marking what values they consider sensitive and then pass them into the provider (and Terraform will make individual attributes sensitive based on the value passed into them). That way only the truly sensitive values will be treated as sensitive. Well. But such a change is going to brutally break backward compatibility in a very insecure way. I doubt any provider will make that change willingly.

Because of the backward compatibility and potential insecurity, I think we might need a different syntax for such cases. Maybe something like

       content {
         key = env.key
         nonsensitive(value) = env.value
       }

Or, perhaps, since 1.7.0, Terraform no longer complains about unnecessary use of nonsensitive, we might say that if I explicitly call nonsensitive() on the value where I pass it into a provider, which would mark it sensitive, I override that, and it actually becomes nonsensitive. Something like

        content {
          key = env.key
          value = nonsensitive(env.value) # since I am explicitly calling nonsensitive() here, Terraform could override the sensitiveness set by the provider and treat the attribute and its value as nonsensitive
        }

I prefer the latter approach, because that would even give me the option of deciding which env.value I wrap with nonsensitive() and which I perhaps wrap with sensitive(), like so:

         for_each = {
           SENSITIVE_VAR = {
             value = "password"
             sensitive = true
           }
           NONSENSITIVE_VAR = {
            value = "threads"
            sensitive = false
           }
         }
         content {
           key = env.key
           value = env.value.sensitive ? sensitive(env.value.value) : nonsensitive(env.value.value)
         }

Something like that, or similar. But please, make me feel as if I am in control of what is sensitive and what is not :)

Thoughts?

hashicorp / terraform