Open bflad opened 2 years ago
Thanks @bflad!
Do you have any examples of non-legacy providers with issues? The protocol shims were not sufficient to hide the legacy SDK behavior entirely, so IIRC there are some heuristics in the diff rendering to ignore these minor differences too. There might be some way to plumb the "legacy" flag through to the final rendering, but maybe some examples would show an easier workaround?
Hi @jbardin 👋 Could you redescribe what you are looking for? Do you mean start to finish reproduction with the protocol flag never involved? I'm not sure if I have one of those readily available. I was raising this to prevent confusion with migrations and slight differences that might arise.
The reproduction here and some of the similar references are for the random provider, which was migrated off the legacy type system flag in version 3.4.0.
From an out-of-band conversion, one thing I didn't say out loud here is the schema changes that occurred here:
So the missing plan difference was specifically a null configuration and Computed difference.
Thanks, that makes sense now why it was not reported as an error during the plan. The existing plan renderer already needs some extensive work, so we can probably plan to consider this type of change when we get there. Since the CLI will have to deal with legacy providers for quite a while, defaulting to hiding these minor differences may take precedence, but since this shouldn't be a common occurrence, perhaps an option like #27547 to show the plan in complete detail would suffice.
In case it's useful context for someone who works on this in future, I believe this is the current location of the legacy SDK workaround that the view layer implements (as opposed to the one implemented in Terraform Core itself):
The comment here suggests that we did at one point imagine there being a way to handle this in Terraform Core (where it would have access to the "legacy type system" flag) but did this simpler solution to reduce the already-high risk for the v0.12 release. I don't actually remember what that other way was though, so I think we'll have to design it again from scratch if we want to address it now.
Interestingly, the current implementation of NormalizeObjectFromLegacySDK
seems to focus only on normalizing the representation of nested blocks, which I don't think explains why override_special
(which is an attribute of type string) would be subject to this normalization. I guess that means there's some other normalization going on somewhere else.
I'm not sure if it matters much, but https://github.com/hashicorp/terraform-provider-tls/issues/284 is an example of the opposite issue where ""
to null
is missing from CLI terminal plan output for a provider without the legacy type system flag:
terraform {
required_providers {
tls = {
source = "hashicorp/tls"
version = "3.4.0"
# version = "4.0.3" # upgrade after initial apply to show missing plan output
}
}
}
resource "tls_private_key" "example" {
algorithm = "ECDSA"
}
resource "tls_self_signed_cert" "example" {
private_key_pem = tls_private_key.example.private_key_pem
is_ca_certificate = true
subject {
organization = "Example"
}
# 3 Years
validity_period_hours = 24 * 365 * 3
allowed_uses = [
"cert_signing",
"crl_signing",
]
}
$ terraform init
$ terraform apply
$ terraform init -upgrade
$ terraform apply
...
~ subject { # forces replacement
# (1 unchanged attribute hidden)
}
59,61c59,61
< "common_name": "",
< "country": "",
< "locality": "",
---
> "common_name": null,
> "country": null,
> "locality": null,
63,66c63,66
< "organizational_unit": "",
< "postal_code": "",
< "province": "",
< "serial_number": "",
---
> "organizational_unit": null,
> "postal_code": null,
> "province": null,
> "serial_number": null,
These attributes are Optional: true
only within a block. Please let me know if you'd prefer a separate issue for tracking.
It looks like this is causing confusion when these "hidden" value differences are underneath collection types as well: https://discuss.hashicorp.com/t/plugin-framework-state-upgraders/47699
For reference, the new renderer inherited the old behavior which is now implemented here.
@bflad, do you have any example of particularly noisy resource configurations which we could use to evaluate how much output this still suppresses? Now that providers are migrating to the framework where these can be meaningful diffs, I wonder if it's time to just accept the inconvenience that SDK resources might occasionally show larger than expected diff output.
I'm not sure about "noisy" resource configurations, but this is trivial to reproduce via the built-in terraform_data
managed resource. I'm guessing we would want this behavior conditionalized on the legacy type system flag, if possible, to limit the plan renderer causing much more noise with the legacy SDK.
terraform {
required_version = "~> 1.7"
}
resource "terraform_data" "test" {
# apply like this, then comment or set to null
input = ""
}
e.g.
$ terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
+ create
Terraform will perform the following actions:
# terraform_data.test will be created
+ resource "terraform_data" "test" {
+ id = (known after apply)
+ output = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
terraform_data.test: Creating...
terraform_data.test: Creation complete after 0s [id=f95b13bf-5a6b-f206-3604-c97a8f4c7c5c]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
### comment/nullify input
$ terraform plan
terraform_data.test: Refreshing state... [id=f95b13bf-5a6b-f206-3604-c97a8f4c7c5c]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
~ update in-place
Terraform will perform the following actions:
# terraform_data.test will be updated in-place
~ resource "terraform_data" "test" {
id = "f95b13bf-5a6b-f206-3604-c97a8f4c7c5c"
+ output = (known after apply)
}
Plan: 0 to add, 1 to change, 0 to destroy.
Hi @jbardin,
Currently with the provider we are developing with plugin framework explicit empty strings are not displayed correctly in plan, see https://github.com/hashicorp/terraform-plugin-framework/issues/921 for the configuration and plan output details. From plugin framework side the issue has been closed and not planned, but I was forwarded upstream to this issue.
From @bflad I understood that this is something this code base sees as non-trivial and is not something that will be pursued, what is the reasoning behind this? Reason I am asking is that to me it seems that explicit empty string in plan output is not working as it should.
Would it be possible to reconsider making changes to plan renderer that would fix this specific case?
Hi @akinross, Unfortunately there is no absolute way to differentiate between changes which were intentional, and changes which were because of a misbehaving provider. Completely removing this bit of shim code for legacy provider compatibility would be nice, but it's not something we can do until most of those resources are updated from the SDK to fix the behavior.
In the meantime I have a minor proposal which can detect some cases where the output can be displayed, and we can evaluate if that complexity is worth adding for the partial solution it offers.
Hi @jbardin, Thank you for your response. I understand that you do not want to introduce breaking change regarding legacy providers. Completely removing would impact our provider also, since we are still in a muxed state and working on migration of all the resources. Would it be a possibility to detect in the rendering the type of provider "legacy SDK" vs "plugin framework", to differentiate to allow both to render properly?
@akinross, the changes are not recorded with any indication of which SDK the provider was using (both internally, and in the serialized plan), so the renderer has no way to differentiate the changes.
Terraform Version
Terraform Configuration Files
Debug Output
Can provide if needed.
Expected Behavior
Either no plan difference output or a plan difference output including every attribute change triggering the difference for providers not using the legacy type system flag in the protocol.
Actual Behavior
Steps to Reproduce
terraform init
with version 3.3.2terraform apply
terraform init -upgrade
with version 3.4.2terraform plan
The issue references also contain a few other reproductions of similar behaviors.
Additional Context
Terraform Cloud UI seems like it was actually able to pick up the "addition" of the attribute being set to
""
from null, so I suspect this is just a CLI plan output issue.We understand why the provider bug occurred, it was an issue on our side of the protocol in the provide code. Inspecting the differences in state after applying the "unexpected" plan showed the change:
Aside: terraform-plugin-sdk also would do some potentially unexpected things with what it decided to store in state (empty vs null), especially with imported resources and with maps that have null values, so there's other similar reports we've been working through with migrating providers from that to terraform-plugin-framework. This isn't core's fault, however I'm saying this out loud in case there's other bug reports that wind up in this issue tracker about "strange" plans as folks migrate.
References