AviatrixSystems / terraform-provider-aviatrix

Terraform Aviatrix provider
https://www.terraform.io/docs/providers/aviatrix/
Mozilla Public License 2.0
44 stars 50 forks source link

AWS Peering state mismatch after upgrade to 3.0.1 (UserConnect-7.0.1373) #1532

Open dxciberaws opened 1 year ago

dxciberaws commented 1 year ago

After controller and gateways update to UserConnect-7.0.1373 and TF plugin upgraded to 3.0.1, the existing TF AWS peering resources that were previously defined are found to not match the exisiting configured peerings about their route table lists.

A terraform apply command results in all AWS peering resources requiring replacement to update their route table lists.

If the terraform apply is responded with yes the resources are destroyed and created again, but a next terraform apply command will again require to recreate all of them for the same mismatch of route table lists.

Steps to reproduce

Expected behavior

Running a terraform apply results in no changes required

Additional data / screenshots

Sample output for one of the defined AWS peering resources (all existing are affected by the issue):

  # aviatrix_aws_peer.sec-sbx must be replaced
-/+ resource "aviatrix_aws_peer" "sec-sbx" {
      ~ id            = "vpc-66b02c0d~vpc-01918fab5e6792eba" -> (known after apply)
      ~ rtb_list1     = [ # forces replacement
          - "rtb-db434eb0",
          - "rtb-3a424f51",
            "rtb-5c404d37",
            "rtb-86404ded",
          + "rtb-db434eb0",
          + "rtb-3a424f51",
        ]
      ~ rtb_list2     = [ # forces replacement
          - "rtb-086467e3ce4b50187",
          - "rtb-028657d2127b7458b",
          - "rtb-0b45932b5b6188479",
          - "rtb-0ffe6b88a7e9ae1d8",
          - "rtb-02a64f885b9b1a594",
          - "rtb-0296c8da84d8ec5c1",
          - "rtb-063ce2461a82cc3b0",
          - "rtb-00fb6b53c6aa41906",
          - "rtb-032b125694f120d75",
          + "all",
        ]
        # (6 unchanged attributes hidden)
    }

Plan: 7 to add, 0 to change, 7 to destroy.

NOTICE how two route table IDs are selected to be removed and also added, and that "all" is being compared against the expanded list of all route table IDs

Environment:

Additional context

No updates to defined resources are possible without incurring in the destruction and re-creation of all the existing AWS peerings

A change in the list_aws_peering behaviour may be the root cause. As observed in the code, the route tables lists are obtained from the list_aws_peering API call. The terraform provider code does not sort or process the resulting list of values in any way, and does not treat the attribute as a list in the way that the comparison operation is performed (should be comparing all elements). A change in the way the API returns the values, such as ordering them differently would result in the provider incorrectly detecting that the values are different while they are in fact the same.

On the other hand, the provider is not updating the state with the new list_aws_peering API result, as in a next execution of terraform apply the issue happens again.

The resource creation API call seems to produce the list of elements in a different order than the list_aws_peering API, hence making the created state not match the refreshed state.

On creation of the resource a deferred read of the resource from the controller was introduced:

    flag := false
    defer resourceAviatrixAWSPeerReadIfRequired(d, meta, &flag)

    _, err := client.CreateAWSPeer(awsPeer)
    if err != nil {
        return fmt.Errorf("failed to create Aviatrix AWSPeer: %s", err)
    }

    return resourceAviatrixAWSPeerReadIfRequired(d, meta, &flag)
}

func resourceAviatrixAWSPeerReadIfRequired(d *schema.ResourceData, meta interface{}, flag *bool) error {
    if !(*flag) {
        *flag = true
        return resourceAviatrixAWSPeerRead(d, meta)
    }
    return nil
}

The logic there sets flag=false which causes that the state is then read. If the resource exists, its details are are read, if not nil is returned (https://github.com/AviatrixSystems/terraform-provider-aviatrix/blob/d6417b711c34210831db9e6a44f7b79620300c07/aviatrix/resource_aviatrix_aws_peer.go#L141).

Then at the end of the block the resource read is repeated, but this time flag will always be true both if the resource existed or not, causing that this last invocation returns always nil and the resource's data is not read again, so the route table list attribute is not expanded and remains with the "all" value.

In our example, we verified that rtblist1 is affected by a change in the order of the elements. Changing the definition of the resource in the TF file to put the first two values last, fixed the issue for those, as seen in our test:

  # aviatrix_aws_peer.sec-sbx must be replaced
-/+ resource "aviatrix_aws_peer" "sec-sbx" {
      ~ id            = "vpc-66b02c0d~vpc-01918fab5e6792eba" -> (known after apply)
      ~ rtb_list2     = [ # forces replacement
          - "rtb-086467e3ce4b50187",
          - "rtb-028657d2127b7458b",
          - "rtb-0b45932b5b6188479",
          - "rtb-0ffe6b88a7e9ae1d8",
          - "rtb-02a64f885b9b1a594",
          - "rtb-0296c8da84d8ec5c1",
          - "rtb-063ce2461a82cc3b0",
          - "rtb-00fb6b53c6aa41906",
          - "rtb-032b125694f120d75",
          + "all",
        ]
        # (7 unchanged attributes hidden)
    }

The original state (from the previous version) contains the route table values in the order they are defined in the TF file, while now the controller returns them in a different order. Terraform does not apply any ordering to the list and assumes that the order may be significant to the resource's definition, causing the difference to be marked as a mismatch. It is the responsibility of the provider to handle or not a different order of values and provide Terraform with a consistent result to be stored in the state.

dxciberaws commented 1 year ago

Bug is confirmed in support's lab. As of now a workaround is to include

lifecycle { ignore_changes = [rtb_list1, rtb_list2]}

In the resource's definition.