hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.72k stars 9.55k forks source link

JSON diff in plan/apply shows wrong changes #21473

Closed marcin-kolda closed 5 years ago

marcin-kolda commented 5 years ago

Terraform JSON diff in plan/apply shows wrong changes in GCP IAM policies and sometimes doesn't order IAM policies.

Terraform Version

Terraform v0.12.0
+ provider.google v2.7.0

Terraform Configuration Files

resource "google_project_iam_policy" "gcp_project" {
  project = "gcp-project-id"
  policy_data = "${data.google_iam_policy.gcp_project_policy.policy_data}"
}

data "google_iam_policy" "gcp_project_policy" {
  binding {
    role = "roles/owner"
    members = [
      "user:name.surname@example.com"
    ]
  }
}

Debug Output

  # google_project_iam_policy.gcp_project will be updated in-place
  ~ resource "google_project_iam_policy" "gcp_project" {
        etag        = "BwWJ37ny6K4="
        id          = "gcp-project-id"
      ~ policy_data = jsonencode(
          ~ {
              ~ bindings = [
                  ~ {
                        members = [
                            "user:name.surname@example.com",
                        ]
                      ~ role    = "roles/bigquery.metadataViewer" -> "roles/owner"
                    },
                    {
                        members = [
                            "user:name.surname@example.com",
                        ]
                        role    = "roles/owner"
                    },
                  - {
                      - members = [
                          - "user:name.surname@example.com",
                        ]
                      - role    = "roles/viewer"
                    },
                ]
            }
        )
        project     = "gcp-project-id"
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Crash Output

Expected Behavior

Terraform diff should show that roles/bigquery.metadataViewer and roles/viewer IAM roles will be removed.

Actual Behavior

Terraform diff shows that roles/bigquery.metadataViewer will be changed to roles/owner despite the latter role already existing in the diff. In GCP you can't have duplicated bindings for the same users and resources.

Steps to Reproduce

  1. Requires setting up GCP project
  2. terraform apply on GCP project
  3. Grant yourself roles/bigquery.metadataViewer and roles/viewer IAM roles
  4. Run terraform plan

Additional Context

Although this is using Google provider, the bug affects main TF diff behaviour. Email and project id were redacted from logs/config.

References

apparentlymart commented 5 years ago

Hi @marcin-kolda! Thanks for reporting this.

Unfortunately this behavior for rendering diffs inside JSON is a generic fallback behavior in Terraform's diff renderer, and so it can't make assumptions about the specific shape of the data or how it will ultimately be interpreted by the remote system.

In this case, Terraform is seeing that this is a JSON array and so it's performing an index-based correlation of the values. The first shows as an edit because Terraform sees it as element zero of the array being edited, not as a removal.

In order to do something else here, Terraform would require more information. Usually providers use JSON strings because they want to allow the full flexibility of a remote format without fully specifying it as a Terraform schema, but the tradeoff there is that without a fully-specifed Terraform schema there isn't any information to guide how to understand the relationship between old and new values, and so Terraform must guess.

In principle this could be addressed by making policy_data (or, more likely, some new attribute alongside it) be a fully-specified Terraform schema where the schema can tell Terraform that bindings is an unordered set of objects rather than an ordered sequence, but that would come at the expense of making it a little harder to use existing JSON strings already formatted as the remote API expects.

One thing we could try here is to see whether in the JSON diff mode it makes sense to try to apply a longest-common-subsequence algorithm to the elements to try to detect additions and removals in a similar way to common textual diffs. Terraform doesn't normally do this for lists because it is reflecting how Terraform and providers themselves interpret sequences vs. sets, but because JSON doesn't have a sequence vs. set distinction we may find that a traditional LCS-based diff makes a good compromise between those two positions, though it would still be ordering-sensitive because Terraform can't know automatically which JSON arrays are ordered and which are not.

marcin-kolda commented 5 years ago

Thanks @apparentlymart for detailed answer.

I guess the best option would be to treat those bindings as unordered set as GCP API reorders them.

Could we move this issue to Google Provider then? Or should I report new issue there and link it to this?

apparentlymart commented 5 years ago

The provider itself has no control over this diffing as long as it remains just a JSON string, since the provider in that case is literally just giving an arbitrary before and after and leaving Terraform Core to guess the meaning of the diff.

The provider could potentially define the structure as a real Terraform schema and thus give Terraform Core the additional information about how to interpret these objects, so you could perhaps open an issue about the use-case in the provider repository and see what the provider team thinks... defining a full schema for representing policy documents may be considered undesirable for other reasons, but it couldn't hurt to start a discussion about it.

We can still use this issue to track the idea of using an LCS diffing strategy for JSON arrays, which may be a better compromise for your average JSON-based data format, but it will require gathering a suite of common formats to test against to see if it is better in average across many different use-cases, rather than tailoring specifically to this one.

tmccombs commented 5 years ago

In this case, Terraform is seeing that this is a JSON array and so it's performing an index-based correlation of the values.

That makes sense. But then I would expect all later elements to be changed, and the last one(s) to be removed. For a simple example consider the following configuration:

resource "local_file" test {
  content = jsonencode([{
    name = "a"
  }, {
    name = "b"
  }, {
    name = "c"
  }])

  filename = "./test.json"
}

If the second element (name="b") is removed, I end up with the following diff:

  # local_file.test must be replaced
-/+ resource "local_file" "test" {
      ~ content  = jsonencode(
          ~ [ # forces replacement
                {
                    name = "a"
                },
              ~ {
                  ~ name = "b" -> "c"
                } # forces replacement,
                {
                    name = "c"
                },
            ]
        )
        filename = "./test.json"
      ~ id       = "cbcb4695d63081a13673ac23c3cbd5b09acec14f" -> (known after apply)
    }

The diff for the second element is fine: the second element of the array is now named "c". But then shouldn't the last element (the old name = "c") be removed?

One thing we could try here is to see whether in the JSON diff mode it makes sense to try to apply a longest-common-subsequence algorithm to the elements to try to detect additions and removals in a similar way to common textual diffs.

IMO this would be ideal. But honestly, even just doing a text diff between the pretty printed json would be better than the current situation, where the diff is very misleading when items are removed from an array.

tmccombs commented 5 years ago

I should also note, that the actual result of apply the plan in that example is what I would expect, it is just the diff hat is wrong.

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.