Hi @kesompochy!
I tried to replicate this issue with the following configuration, similar to the one you are sharing, but the result was successful without errors. The `etag` didn't change and there are no new `policyTags`:
resource "google_bigquery_dataset" "bq_ds_19485" {
dataset_id = "bigquery_dataset_19485"
friendly_name = "bigquery-dataset-19485"
description = "something"
location = "EU"
default_table_expiration_ms = 3600000
labels = {
env = "default"
}
}
resource "google_bigquery_table" "bq_table_19485" {
dataset_id = google_bigquery_dataset.bq_ds_19485.dataset_id
deletion_protection = false
table_id = "bq-table-19485"
project = "my-project"
schema = <<EOL
[
{
"name": "time",
"type": "TIMESTAMP",
"mode": "NULLABLE"
}
]
EOL
}
I then removed the `google_bigquery_table` from the `tfstate` file to import it with this new code:

```hcl
resource "google_bigquery_dataset" "bq_ds_19485" {
  dataset_id                  = "bigquery_dataset_19485"
  friendly_name               = "bigquery-dataset-19485"
  description                 = "something"
  location                    = "EU"
  default_table_expiration_ms = 3600000

  labels = {
    env = "default"
  }
}

resource "google_bigquery_table" "bq_table_19485" {
  dataset_id          = google_bigquery_dataset.bq_ds_19485.dataset_id
  deletion_protection = false
  table_id            = "bq-table-19485"
  project             = "my-project"
  schema              = <<EOL
[
  {
    "name": "time",
    "type": "TIMESTAMP",
    "mode": "NULLABLE"
  }
]
EOL
}

import {
  to = google_bigquery_table.bq_table_19485
  id = "projects/my-project/datasets/bigquery_dataset_19485/tables/bq-table-19485"
}
```
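For anyone reproducing this: the state removal step can be done with `terraform state rm`, after which the `import` block above is processed on the next plan/apply. A minimal sketch, assuming the resource addresses from this config:

```shell
# Drop the table from state without destroying the real resource
terraform state rm google_bigquery_table.bq_table_19485

# Terraform 1.5+ processes the import block during plan/apply
terraform plan
terraform apply
```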
Finally, I changed `google_bigquery_table.deletion_protection` from `false` to `true`.
Hi @ggtisc!
I believe this issue is caused by the inclusion of `"policyTags": {}` when `terraform apply` is executed.
The eTag change may not occur when applying a `deletion_protection` change to a resource imported from one created by `terraform apply`. To replicate the issue, could you please use a BigQuery table resource created by the `bq` command and import it into Terraform?
I'd like to note that this issue primarily affects resources created outside of Terraform. As such, it may not require immediate action. Thank you for your attention and response to this matter.
@kesompochy could you show us how you are attaching the existing `google_bigquery_dataset` in the `bq` command? If possible, share the full command you are using, without sensitive information.
Here are my configurations and commands. There is no error, but the eTag changes after following these steps. I believe this behavior is caused by the addition of empty `policyTags` in the API request.
provider "google" {
project = var.project_id
region = "us-central1"
}
terraform {
required_providers {
google = {
source = "hashicorp/google"
version = "4.80.0"
}
}
}
resource "google_bigquery_dataset" "my_dataset" {
dataset_id = "my_dataset"
project = var.project_id
}
```shell
$ terraform apply
$ bq mk --table my-project:my_dataset.test_table time:TIMESTAMP
```
resource "google_bigquery_table" "test_table" {
dataset_id = google_bigquery_dataset.my_dataset.dataset_id
deletion_protection = true
table_id = "test_table"
expiration_time = "0"
project = var.project_id
schema = <<EOL
[
{
"name": "time",
"type": "TIMESTAMP"
}
]
EOL
}
import {
to = google_bigquery_table.test_table
id = "projects/my-project/datasets/my_dataset/tables/test_table"
}
```shell
$ terraform apply -var-file="terraform.tfvars"
$ terraform state show google_bigquery_table.test_table
```
resource "google_bigquery_table" "test_table" {
dataset_id = google_bigquery_dataset.my_dataset.dataset_id
deletion_protection = true # modified
table_id = "test_table"
expiration_time = "0"
project = var.project_id
schema = <<EOL
[
{
"name": "time",
"type": "TIMESTAMP"
}
]
EOL
}
```shell
$ terraform apply -var-file="terraform.tfvars"
$ terraform state show google_bigquery_table.test_table
```
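To observe the etag change directly between the two applies, the table metadata can also be inspected outside Terraform. A small sketch, assuming the `bq` CLI and `jq` are installed (not part of the original steps):

```shell
# Print the table's current etag; run before and after the apply to compare
bq show --format=prettyjson my-project:my_dataset.test_table | jq -r '.etag'
```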
Hi @kesompochy!
After replicating this step by step, it is true that the `etag` changes in the `tfstate` file, but this is just a version-control mechanism and it doesn't affect the resources. On the other hand, no policy tags were added in any of these steps.
Terraform uses the `etag` to determine whether a table has been modified outside of its control. If other resources depend on the `google_bigquery_table`, changes to the `etag` might trigger updates to those dependent resources as well. It is important to consider that while the `etag` is generally a reliable indicator of changes, it's not always foolproof. In some cases, minor updates or transient conditions might not result in an `etag` change.
In conclusion, the fact that the `etag` changes is considered normal behavior, since the `deletion_protection` argument was changed from `true` to `false`.
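As an illustration of the dependency point above, here is a hypothetical snippet (not from the thread) that interpolates the table's exported `etag` attribute; any value built this way would show a diff whenever the etag changes remotely:

```hcl
# Hypothetical: surfaces the exported etag so dependent values
# change whenever the table's etag changes.
output "test_table_etag" {
  value = google_bigquery_table.test_table.etag
}
```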
Hi @ggtisc!
Thank you for replicating my steps.
I agree with your conclusion. I understand that the `etag` change is uncontrollable from Terraform's perspective.
However, I'm still concerned about the behavior where unexpected empty `policyTags` are sent in the API request, even when the table schema hasn't changed after importing. I believe the change in `policyTags` doesn't appear in the Terraform state, but it does occur in the API request. The table's `etag` changes, likely because the addition of empty `policyTags` causes the remote resource to be treated as if it has been modified.
Could we improve the behavior so that empty `policyTags` are not added when `policyTags` are not explicitly set in the Terraform configuration?
I'm aware of the related PR https://github.com/hashicorp/terraform-provider-google/pull/15547 regarding the behavior of policy tags. I understand that any changes to policy tag behavior would need to be consistent with the context of these issues. The current behavior might be intentional, as discussed in https://github.com/hashicorp/terraform-provider-google/issues/14047.
Additionally, although this is not directly related to the `policyTags` issue, I was wondering if it's possible to avoid sending API requests when only the `deletion_protection` attribute is changed. I consider `deletion_protection` to be a feature managed solely by Terraform. If we could avoid sending API requests to Google Cloud for attributes that are managed exclusively by Terraform, I believe it could contribute to performance improvements.
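For comparison, Terraform core already offers a purely client-side guard that never issues a cloud API call; a sketch of the analogous `lifecycle` setting (an alternative mechanism, not what the provider's `deletion_protection` does today):

```hcl
resource "google_bigquery_table" "test_table" {
  # ... arguments as in the earlier config ...

  lifecycle {
    # Enforced entirely by Terraform core during plan;
    # toggling this sends no API request to Google Cloud.
    prevent_destroy = true
  }
}
```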
After some tries I can't reproduce this issue. It looks like it is intermittent. I'm forwarding it for a deeper review.
> The current behavior might be intentional, as discussed in https://github.com/hashicorp/terraform-provider-google/issues/14047.
Yes, treating the absence of `policyTags` in the config file as `"policyTags": {}` in the API request is intentional, to support unsetting policy tags for a table. The API treats the absence of policy tags in an API request as a no-op, so for a table that previously had policy tags attached, if the user now wants to clear them, they need to explicitly send `"policyTags": {}` in the API request. In Terraform, that may be done by omitting the `policyTags` attribute in the config file. This means that for a table that never has policy tags specified in the schema, Terraform still sends `"policyTags": {}` to the API.
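To illustrate what that looks like on the wire, here is a hypothetical `tables.patch`/`tables.update` request body for the single-column schema from this thread (an illustration, not a captured debug log):

```json
{
  "schema": {
    "fields": [
      {
        "name": "time",
        "type": "TIMESTAMP",
        "mode": "NULLABLE",
        "policyTags": {}
      }
    ]
  }
}
```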
There may be some options to mitigate the side effect that causes the behavior you are reporting, but I'd like to understand more about the impact of the changing table etag for your use case. Can you elaborate on how `bq query --replace` plays a role here?
> If we could avoid sending API requests to Google Cloud for attributes that are managed exclusively by Terraform, I believe it could contribute to performance improvements.
I'm not aware of a mechanism that allows for this today, but the idea has merit and I'll bring it up for discussion.
Thank you for explaining the purpose and details regarding the empty policy tag behavior.
I'd like to share an unexpected scenario I encountered:

1. A `bq query --replace` job runs to generate data for several projects.
2. I imported a table created by the `bq query --replace` workflow into Terraform.
3. I changed the `deletion_protection` setting of the table in my Terraform configuration, which resulted in a change to the table's etag.
4. The following day, the `bq query --replace` workflow altered the schema column names from lowercase to uppercase, as the `bq query --replace` command specified them in uppercase. I suspect that `bq query --replace` hadn't modified the schema before the table was imported into Terraform, likely because the etag matched the one from when the table was initially created. Please note that this is my hypothesis, as we cannot definitively determine the `bq` command's internal mechanism due to its closed-source nature.

I understand that this is a highly specific scenario that likely affects very few users.
> I'm not aware of a mechanism that allows for this today, but the idea has merit and I'll bring it up for discussion.
I appreciate your time and consideration. Please let me know if you think this issue is worth addressing, or if there's a better way I can contribute to improving this aspect of the provider. I understand that this may not be a high priority, but I believe fixing it could prevent unnecessary API calls and potential unexpected behaviors in similar scenarios.
> The following day, the `bq query --replace` workflow altered the schema column names from lowercase to uppercase
Thanks for sharing the additional details. I'm still trying to understand how the diff detection finds a case mismatch in the column name. Is it that the table created using `bq query --replace` has an uppercase name in the schema (e.g. `TIME`, as mentioned in the original post), but the Terraform config has a lowercase name in the schema (e.g. `time`, as mentioned in the original post)? If so, is there a reason why the two schema specs are inconsistent? What if you changed the Terraform config schema to also be uppercase?
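For concreteness, that suggested test would change the config from earlier in the thread to use the uppercase name (a sketch of that variant):

```hcl
resource "google_bigquery_table" "test_table" {
  dataset_id          = google_bigquery_dataset.my_dataset.dataset_id
  deletion_protection = true
  table_id            = "test_table"
  expiration_time     = "0"
  project             = var.project_id
  schema              = <<EOL
[
  {
    "name": "TIME",
    "type": "TIMESTAMP"
  }
]
EOL
}
```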
> as we cannot definitively determine the `bq` command's internal mechanism due to its closed-source nature.
I happened to work on the `bq` commands too. As part of a `query --replace` command, there is no reading of the eTag on the client side. The CLI simply gathers the input parameters and makes a `jobs.insert` request with the corresponding parameters. I don't have knowledge of what happens on the backend when handling that request and whether the eTag is involved there, though.
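For reference, `--replace` corresponds to a truncate-style write on the destination table; a hypothetical sketch of the `jobs.insert` request configuration the CLI would build for the command used later in this thread (assumed shape, not an actual capture):

```json
{
  "configuration": {
    "query": {
      "query": "SELECT time AS TIME FROM my_dataset.source_table",
      "useLegacySql": false,
      "destinationTable": {
        "projectId": "my-project",
        "datasetId": "my_dataset",
        "tableId": "test_table"
      },
      "writeDisposition": "WRITE_TRUNCATE"
    }
  }
}
```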
> I believe fixing it could prevent unnecessary API calls and potential unexpected behaviors in similar scenarios.
I just heard back and there does appear to be a way to make this optimization today! It was added to the documentation as part of https://github.com/GoogleCloudPlatform/magic-modules/pull/11330, and I'll look into applying that to `google_bigquery_table`, which might solve your issue. It's generally a good-to-have anyway. Please continue to follow this issue for updates.
Thank you for your investigation, @wj-chen.
It appears the `bq query --replace` command does not fix the schema case mismatch as described in https://github.com/hashicorp/terraform-provider-google/issues/19485#issue-2528412374.
When a table is created by `bq query --replace` with a lowercase field, subsequent `bq query --replace` commands ignore the schema mismatch.
```shell
$ bq query --replace --use_legacy_sql=false --destination_table=my_dataset.test_table 'SELECT time AS time FROM my_dataset.source_table'
Waiting on bqjob_r4c9ceead521c2858_00000191f780dca2_1 ... (2s) Current status: DONE
+---------------------+
|        time         |
+---------------------+
| 2024-09-16 06:57:34 |
+---------------------+

$ bq query --replace --use_legacy_sql=false --destination_table=my_dataset.test_table 'SELECT time AS TIME FROM my_dataset.source_table'
Waiting on bqjob_r3f95ae4e67f52d74_00000191f78135af_1 ... (1s) Current status: DONE
+---------------------+
|        time         |
+---------------------+
| 2024-09-16 06:57:34 |
+---------------------+
```
However, after Terraform modifies the table configuration, `bq query --replace` then changes the field case.
```shell
$ terraform apply
$ bq query --replace --use_legacy_sql=false --destination_table=my_dataset.test_table 'SELECT time AS TIME FROM my_dataset.source_table'
Waiting on bqjob_r4eca9f9d94559614_00000191f77f8929_1 ... (1s) Current status: DONE
+---------------------+
|        TIME         |
+---------------------+
| 2024-09-16 06:57:34 |
+---------------------+
```
This suggests that the issue is not related to the Terraform config schema case itself.
> I'll look into applying that to `google_bigquery_table`, which might solve your issue.
That's excellent news! Thank you very much for your efforts on this matter.
Updating `deletion_protection` should no longer trigger an Update API request once https://github.com/GoogleCloudPlatform/magic-modules/pull/12174 is released in provider v6.11.0 (in ~2 weeks). If you continue to have problems, please reopen this or create a new issue.
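Once that release lands, picking up the fix is just a provider version constraint; a sketch assuming the version stated above:

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 6.11.0" # includes the deletion_protection no-op update change
    }
  }
}
```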
Thanks @wj-chen for working on this! Really looking forward to using v6.11.0 with the improved functionality!
Terraform Version & Provider Version(s)
Terraform v1.9.5 on x86_64 GNU/Linux
Affected Resource(s)
`google_bigquery_table`
Terraform Configuration
Debug Output
Please see "Actual Behavior".
Expected Behavior
When importing a BigQuery table that was created using the `bq` command and then modifying an unrelated attribute (e.g., `deletion_protection`), the provider should not add empty `policyTags` to fields that didn't originally have them. The resource's eTag should remain unchanged if no actual modifications were made to the table's schema or structure.
Actual Behavior
After importing a BigQuery table and modifying an unrelated attribute, the provider sends a change request that includes `"policyTags": {}` for fields, even when no policy tags were present initially. This causes the eTag of the resource to change unexpectedly.
Steps to reproduce

1. Create a BigQuery table using the `bq` command.
2. Import the table into Terraform.
3. Modify an unrelated attribute (e.g., `deletion_protection`) and run `terraform apply`.

This behaviour causes unintended schema changes, including changes in the case sensitivity of field names, when using `bq query --replace`. This is likely because the `bq` command refers to the eTag to decide whether the schema should change or not.

Important Factoids
This behavior has been observed in several provider versions from v4.80.0 and above, including the latest; it was not observed in v4.79.0 or below.
References

https://github.com/hashicorp/terraform-provider-google/pull/15547
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/update
b/371990367