Open rileykarson opened 5 years ago
Hi @rileykarson,
Looking at the schema for that resource, access
is defined as a "set", which is an unordered data structure. If the existing and planned set sets contain identical values, then the plan output will show only the minimal changes. I'm guessing that there's some small difference in the internal structure of the first OWNER
value (probably a zero vs null difference) which causes it to show in the diff because the values cannot be determined to be equal.
The referenced issue mentions that the order matters. If that's the case, then the set data type isn't appropriate here and we should migrate access
to a list. If the order doesn't matter, can you verify if this has any impact on the resource update itself?
Thanks!
access
being TypeSet
is correct. I don't think this is a problem with zero vs null values. If I apply the same config twice, there's no diff, indicating that the underlying hash value hasn't changed. It appears set hash values are hidden even in debug logs, so I can't confirm that through any other means.
Ordering of elements has no effect. If I replace my second config with the following:
resource "google_bigquery_dataset" "temp" {
dataset_id = "temp"
default_table_expiration_ms = "2419200000"
access {
user_by_email = "tf-test-2174102886934250970@graphite-test-rileykarson.iam.gserviceaccount.com"
role = "OWNER"
}
access {
role = "OWNER"
user_by_email = "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"
}
}
I get this plan:
- access {
- role = "OWNER" -> null
- user_by_email = "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com" -> null
}
+ access {
+ role = "OWNER"
+ user_by_email = "tf-test-2174102886934250970@graphite-test-rileykarson.iam.gserviceaccount.com"
}
- access {
- role = "READER" -> null
- user_by_email = "tf-test-2174102886934250970@graphite-test-rileykarson.iam.gserviceaccount.com" -> null
}
+ access {
+ role = "OWNER"
+ user_by_email = "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"
}
Thanks for confirming @rileykarson.
The provider shims do some normalization so that the small differences are ironed out when there is no overall change to the resource, but set values can't always be normalized in such a manner, because modifying the set value changes its identity. This is likely the reason (and really the only explanation) this part of the plan output:
- role = "OWNER" -> null
- user_by_email = "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com" -> null
}
+ access {
+ role = "OWNER"
+ user_by_email = "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"
}
Chances are that this is from the nested view
list block, though there's not much that can be done to improve it with the current SDK. That in itself is harmless, as the values should ends up identical once they reaches the provider. This is usually only a UX problem, since it makes the diffs of sets much harder to read. We may be able to "relax" the diff output a bit more to not show visually identical values in the plan output.
I was curious about
Chances are that this is from the nested view list block, though there's not much that can be done to improve it with the current SDK.
Since the Google provider generally stores lists in state as the empty list when nil
instead of not storing them at all, I tried manually editing state and running terraform plan -refresh=false
, it appears that these produce identical plans:
Original:
"access": [
{
"domain": "",
"group_by_email": "",
"role": "OWNER",
"special_group": "",
"user_by_email": "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com",
"view": []
},
{
"domain": "",
"group_by_email": "",
"role": "READER",
"special_group": "",
"user_by_email": "tf-test-2174102886934250970@graphite-test-rileykarson.iam.gserviceaccount.com",
"view": []
}
],
Modified:
"access": [
{
"domain": "",
"group_by_email": "",
"role": "OWNER",
"special_group": "",
"user_by_email": "tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"
},
{
"domain": "",
"group_by_email": "",
"role": "READER",
"special_group": "",
"user_by_email": "tf-test-2174102886934250970@graphite-test-rileykarson.iam.gserviceaccount.com"
}
],
Yup, it's not the view
block (though I'm not sure editing the state would show the diff, since blocks are not allowed to be null and it may be reified by when creating the proposed change). It's the fact that the legacy sdk doesn't reliably differentiate between an unset string and an empty string. You can see in the state the resource returned empty strings for for some of the fields (probably not the fault of the resource, just a limitation of the old SDK).
Internally the state looks like
cty.ObjectVal(map[string]cty.Value{
"domain":cty.StringVal(""),
"group_by_email":cty.StringVal(""),
"role":cty.StringVal("OWNER"),
"special_group":cty.StringVal(""),
"user_by_email":cty.StringVal("tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"),
"view":cty.ListValEmpty(cty.Object(map[string]cty.Type{"dataset_id":cty.String, "project_id":cty.String, "table_id":cty.String}))
})
while the proposed value from the config looks like:
cty.ObjectVal(map[string]cty.Value{
"domain":cty.NullVal(cty.String),
"group_by_email":cty.NullVal(cty.String),
"role":cty.StringVal("OWNER"),
"special_group":cty.NullVal(cty.String),
"user_by_email":cty.StringVal("tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"),
"view":cty.ListValEmpty(cty.Object(map[string]cty.Type{"dataset_id":cty.String, "project_id":cty.String, "table_id":cty.String}))
})
@jbardin @rileykarson any proposed solution here? I'm not sure what fixing "the legacy SDK" might involve.
This is a primary blocker for us migrating off of TF 0.11. We had a workaround for this with 0.11 by tweaking landscape to treat all sets as unordered. Landscape doesn't support 0.12 and they have no plans to. Our diffs are too large to read due to the length of some of the lists in our resources that are modified frequently.
Could a workaround be to modify the provider or terraform CLI to write empty strings as null values?
Is this a problem with legacy SDK? If so, we are using new terraform ( v0.12.24) and later google-provider(v3.19.0) and still facing the exact problem. Is there any workaround currently? Our access list can be quite long and it is impossible to read the plan output for our Bigquery dataset.
I resorted to hacky scripting and my solution makes me so so sad, but allowed us to be functional with 0.12: https://gist.github.com/dharamsk/19cefd0438d331a26accd1d4795349d7
I really hope a path forward for a fix here can be identified. I still don't understand where (if possible) I should make a PR to have the SDK line up with the state file when it comes to blank strings vs nulls.
I'm from the BQ TF team and found this issue as it was recently mentioned again for google_bigquery_dataset
. At a glance this seems to be best with a SDK-level fix for schema.TypeSet
? What's the current plan?
Hi @wj-chen, the legacy SDK is essentially frozen and cannot be changed without breaking legacy providers. The Terraform Plugin Framework allows full access to the modern Terraform type system and can handle these types of changes between null
and zero values.
Thank you @jbardin. @rileykarson - what would be the next step for our providers that are impacted by this?
We can cover this in a Google provider-specific issue / internal followup! Feel free to open a new issue if the one you got here from has been closed.
Adopting the plugin framework (https://github.com/hashicorp/terraform-plugin-framework) over the SDK (https://github.com/hashicorp/terraform-plugin-sdk) is the answer, but it'll be a long/difficult migration for the provider as a whole. We're driving that on the core Google provider team.
Yup, it's not the
view
block (though I'm not sure editing the state would show the diff, since blocks are not allowed to be null and it may be reified by when creating the proposed change). It's the fact that the legacy sdk doesn't reliably differentiate between an unset string and an empty string. You can see in the state the resource returned empty strings for for some of the fields (probably not the fault of the resource, just a limitation of the old SDK).Internally the state looks like
cty.ObjectVal(map[string]cty.Value{ "domain":cty.StringVal(""), "group_by_email":cty.StringVal(""), "role":cty.StringVal("OWNER"), "special_group":cty.StringVal(""), "user_by_email":cty.StringVal("tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"), "view":cty.ListValEmpty(cty.Object(map[string]cty.Type{"dataset_id":cty.String, "project_id":cty.String, "table_id":cty.String})) })
while the proposed value from the config looks like:
cty.ObjectVal(map[string]cty.Value{ "domain":cty.NullVal(cty.String), "group_by_email":cty.NullVal(cty.String), "role":cty.StringVal("OWNER"), "special_group":cty.NullVal(cty.String), "user_by_email":cty.StringVal("tf-153@graphite-test-rileykarson.iam.gserviceaccount.com"), "view":cty.ListValEmpty(cty.Object(map[string]cty.Type{"dataset_id":cty.String, "project_id":cty.String, "table_id":cty.String})) })
For anyone that discovers this down the line, the latest provider introduces iam_member
field which also needs to be fed an empty string to mute the access block diff:
access {
role = "OWNER"
special_group = "projectOwners"
domain = ""
group_by_email = ""
user_by_email = ""
iam_member = ""
}
Terraform Version
Terraform Configuration Files
Debug Output
https://gist.github.com/rileykarson/830f562dc9c735f4e491777eea4c4a02
Expected Behavior
Actual Behavior
Steps to Reproduce
terraform apply
Additional References
See https://github.com/terraform-providers/terraform-provider-google/issues/3929