hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.71k stars 9.55k forks source link

JSON plan output gives less sensitivity detail than CLI for blocks backed by sets #29618

Open alisdair opened 3 years ago

alisdair commented 3 years ago

Terraform Version

Terraform v1.1.0-dev
on darwin_amd64
+ provider registry.terraform.io/hashicorp/azurerm v2.77.0

Terraform Configuration Files

resource "azurerm_resource_group" "main" {
  name     = "test-rg"
  location = "West US 2"
}

resource "azurerm_virtual_network" "main" {
  name                = "test-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
}

resource "azurerm_subnet" "internal" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_network_interface" "test" {
  name                = "test-nic"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name

  ip_configuration {
    name                          = "testconfiguration1"
    subnet_id                     = azurerm_subnet.internal.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_virtual_machine" "test" {
  name                  = "tf-cloud-example"
  location              = azurerm_resource_group.main.location
  resource_group_name   = azurerm_resource_group.main.name
  network_interface_ids = [azurerm_network_interface.test.id]
  vm_size               = "Standard_D2a_v4"

  os_profile {
    computer_name  = "hostname"
    admin_username = "testadmin"
    admin_password = "T3st@dmin"
  }

  storage_os_disk {
    name              = "tf-disk"
    create_option     = "FromImage"
    caching           = "ReadWrite"
    managed_disk_type = "StandardSSD_LRS"
  }

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }
}

Expected Behavior

CLI output shows details about the os_profile block attribute values changing:

      - os_profile { # forces replacement
          - admin_username = "testadmin" -> null
          - computer_name  = "hostname" -> null
        }
      + os_profile { # forces replacement
          + admin_password = (sensitive value)
          + admin_username = "testadmins"
          + computer_name  = "hostnames"
          + custom_data    = (known after apply)
        }

To allow consumers of the JSON plan to do the same, the after_sensitive data should mark the admin_password in the os_profile block as sensitive, like so:

  "after_sensitive": {
    "os_profile": [
      {
        "admin_password": true
      }
    ]
  }

Actual Behavior

The entire os_profile block is marked sensitive:

  "after_sensitive": {
    "os_profile": true
  }

Steps to Reproduce

  1. Apply the above config to create an azurerm_virtual_machine
  2. Edit all of the attribute values on the virtual machine os_profile block
  3. Run terraform plan -out=saved.tfplan
  4. Run terraform show -json saved.tfplan | jq '.resource_changes[] | select( .address == "azurerm_virtual_machine.test") | .change

Found by @brandonc.

apparentlymart commented 3 years ago

Honestly, the surprising part of this report for me was that the human-oriented report was able to distinguish sensitivity so granularly:

      - os_profile { # forces replacement
          - admin_username = "testadmin" -> null
          - computer_name  = "hostname" -> null
        }
      + os_profile { # forces replacement
          + admin_password = (sensitive value)
          + admin_username = "testadmins"
          + computer_name  = "hostnames"
          + custom_data    = (known after apply)
        }

Our mechanism for tracking dynamic sensitivity doesn't allow tracking nested sensitivity within sets, so I have to assume this is the effect of some other mechanism. I'd guess what's probably happening here is that the human-oriented diff renderer is ignoring dynamic value sensitivity altogether when rendering nested blocks -- in which case, it's disclosing values there that we'd normally conservatively consider to be sensitive -- but then the static schema for admin_password comes to the rescue, because that capability predates our dynamic sensitivity tracking.

I think we'll have an interesting tradeoff to make here, since while in this particular case it's clear that the intent is only for admin_password to be sensitive, if I'm understanding correctly then we're only accidentally matching that intent, as a result of ignoring the sensitivity mark on the os_profile set. An un-nuanced read of the dynamic sensitivity design would say that the human-oriented diff output is the one misbehaving here, because it's showing admin_username and computer_name even though we accessed them through a sensitive set value.

The JSON renderer, on the other hand, is describing things as they actually are in the dynamic value. It'd be interesting to see if there's a symmetrical opposite "miss" here, where the JSON output disregards the static sensitivity flag in the schema; if so, just naively ignoring the sensitivity flag would cause us to not mark admin_password as sensitive at all, which would also be inconsistent.


Another tricky part, though thankfully not as tricky, is that to properly represent what we saw in the human-oriented diff output here we'd need to be sure to include one element of after_sensitive.os_profile for each element in after.os_profile and ensure that the indices match up so that the consumer can actually correlate them even though sets are not naturally ordered. We'd choose an arbitrary but consistent order to use for all of the various array-based representations we include here, I suppose.

  "after_sensitive": {
    "os_profile": [
      {
        "admin_password": true
      },
      {
        "admin_password": true
      }
    ]
  }