PaloAltoNetworks / terraform-provider-cloudngfwaws

The Terraform provider for the Palo Alto Networks AWS cloud NGFW
Mozilla Public License 2.0

Perpetual plan diff may occur after applying multiple 'subnet_mapping' blocks for `cloudngfwaws_ngfw` resource #30

Closed ahuseby closed 2 months ago

ahuseby commented 2 months ago

Describe the bug

After scaling up from one subnet mapping (eu-north-1a) to three (eu-north-1a, eu-north-1b, eu-north-1c), using customer managed endpoints, we get a perpetual plan diff: Terraform wants the subnet_mapping entries for eu-north-1a and eu-north-1b to be swapped.

Current subnet_mapping inputs for the cloudngfwaws_ngfw resource:

resource "cloudngfwaws_ngfw" "ngfw" {
    name                             = "cngfw-eu-north-1"

...

    subnet_mapping {
        availability_zone    = "eu-north-1b"
    }
    subnet_mapping {
        availability_zone    = "eu-north-1a"
    }
    subnet_mapping {
        availability_zone    = "eu-north-1c"
    }
}

Expected behavior

After successfully applying the above subnet_mapping configuration, a new plan command should show no changes to the cloudngfwaws_ngfw resource.

Current behavior

After a successful apply, a subsequent plan command shows the following changes:

  # cloudngfwaws_ngfw.ngfw will be updated in-place
  ~ resource "cloudngfwaws_ngfw" "ngfw" {
        name                             = "cngfw-eu-north-1"
        # (15 unchanged attributes hidden)

      ~ subnet_mapping {
          ~ availability_zone    = "eu-north-1b" -> "eu-north-1a"
            # (2 unchanged attributes hidden)
        }
      ~ subnet_mapping {
          ~ availability_zone    = "eu-north-1a" -> "eu-north-1b"
            # (2 unchanged attributes hidden)
        }

        # (1 unchanged block hidden)
    }

When attempting to downscale the deployment to only one AZ/fw-endpoint (in eu-north-1a), we get this error:

  # cloudngfwaws_ngfw.ngfw will be updated in-place
  ~ resource "cloudngfwaws_ngfw" "ngfw" {
        name                             = "cngfw-eu-north-1"
        tags                             = {}
        # (15 unchanged attributes hidden)

      ~ subnet_mapping {
          ~ availability_zone    = "eu-north-1b" -> "eu-north-1a"
            # (2 unchanged attributes hidden)
        }
      - subnet_mapping {
          - availability_zone    = "eu-north-1a" -> null
          - availability_zone_id = "eun1-az1" -> null
            # (1 unchanged attribute hidden)
        }
      - subnet_mapping {
          - availability_zone    = "eu-north-1c" -> null
          - availability_zone_id = "eun1-az3" -> null
            # (1 unchanged attribute hidden)
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

cloudngfwaws_ngfw.ngfw: Modifying...
╷
│ Error: Error(1): Endpoint vpce-abc123 exists in disassociated zone eu-north-1a, please remove endpoint and try again.
│ 
...

Possible solution

It seems the inputs get reordered, either when they are submitted to the NGFW API or in the state file, and Terraform then detects that the order differs from the order in the configuration.

I believe changing the schema type of subnet_mapping from schema.TypeList to schema.TypeSet would fix this, since a set is compared without regard to element order: https://github.com/PaloAltoNetworks/terraform-provider-cloudngfwaws/blob/5c628aa8e739627152bca93a07cbf1ebdce98add/internal/provider/ngfw.go#L397-L423
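
Until the schema changes, a configuration-side stopgap might be to have Terraform ignore drift on the block. This is only a sketch and has not been tested against this provider. `ignore_changes` is a standard Terraform `lifecycle` meta-argument, but it suppresses every diff on `subnet_mapping`, including intentional scale-up or scale-down, so it would have to be removed before changing the list of AZs.

```hcl
resource "cloudngfwaws_ngfw" "ngfw" {
  # ... existing arguments and subnet_mapping blocks left unchanged ...

  lifecycle {
    # Assumption: hiding the reorder diff is acceptable for now.
    # This also hides real changes to subnet_mapping, so treat it as temporary.
    ignore_changes = [subnet_mapping]
  }
}
```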

Steps to reproduce

  1. Write config for cloudngfwaws_ngfw resource with only one subnet_mapping to eu-north-1a, apply it.
  2. Update cloudngfwaws_ngfw config with additional subnet_mappings for eu-north-1b and eu-north-1c below eu-north-1a (and order them according to name, a -> b -> c), apply it.
  3. Run plan, observe that terraform wants to reorder subnet_mapping AZ names.
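
For reference, the configuration applied in step 2 would look roughly like the fragment below. Only the arguments that appear in this report are shown. The rest of the resource is omitted, so the fragment is not applicable on its own.

```hcl
resource "cloudngfwaws_ngfw" "ngfw" {
  name = "cngfw-eu-north-1"

  # ... other arguments omitted ...

  # Step 1 starts with only this block.
  subnet_mapping {
    availability_zone = "eu-north-1a"
  }

  # Step 2 adds the two blocks below, ordered by name (a -> b -> c).
  subnet_mapping {
    availability_zone = "eu-north-1b"
  }
  subnet_mapping {
    availability_zone = "eu-north-1c"
  }
}
```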

A different region or set of AZs may behave differently; I don't know. I also don't know whether this behavior is consistent.

Context

We use customer managed endpoints.

We initially had only one AZ/subnet mapping (eu-north-1a).

Today we scaled it up to 3 AZs (eu-north-1a, eu-north-1b, eu-north-1c). The perpetual diff occurred after we scaled up. The firewall and VPC/endpoints deployment seem to work fine, but Terraform is not happy.

Your Environment

ahuseby commented 2 months ago

This seems to have been caused by a tainted ngfw instance, whereby the API returned erroneous information about its subnet mapping state. After provisioning a new NGFW, we did not observe the same issue.