awslabs / landing-zone-accelerator-on-aws

Deploy a multi-account cloud foundation to support highly-regulated workloads and complex compliance requirements.
https://aws.amazon.com/solutions/implementations/landing-zone-accelerator-on-aws/
Apache License 2.0

Changing TGW route table for TGW attachment fails #175

Closed zakbatinica-reply closed 1 year ago

zakbatinica-reply commented 1 year ago

Describe the bug

If you add a new transit gateway route table to network-config.yaml and update the configuration of a transit gateway attachment that LZA has previously deployed so that it uses the new route table, the deployment fails. The error occurs in the Network_Associations stage and indicates that the transit gateway attachment is already associated with a route table.

You can currently work around this by manually deleting the existing association and re-running the pipeline. LZA then seems happy to create the new association. I say "seems" because afterwards the CloudFormation stack showed that it had created the AWS::EC2::TransitGatewayRouteTableAssociation resource for the attachment and the correct route table, but the association didn't actually exist, so I had to create it manually.
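The manual steps in this workaround can be sketched in Python. This is a minimal sketch, not part of LZA: it assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`) with permissions in the account that owns the transit gateway, and the helper names `current_association`, `remove_association`, and `ensure_association` are hypothetical. Changing an association outside the pipeline can leave the CloudFormation stack out of sync with the real resources, so treat it as a stopgap.

```python
import time


def current_association(ec2, attachment_id):
    """Return the route table ID the attachment is associated with, or None."""
    resp = ec2.describe_transit_gateway_attachments(
        TransitGatewayAttachmentIds=[attachment_id]
    )
    attachments = resp.get("TransitGatewayAttachments", [])
    if not attachments:
        raise ValueError(f"attachment {attachment_id} not found")
    assoc = attachments[0].get("Association") or {}
    if assoc.get("State") == "disassociated":
        return None
    return assoc.get("TransitGatewayRouteTableId")


def remove_association(ec2, attachment_id, poll_seconds=5, max_polls=60):
    """Workaround step 1: delete the existing association and wait for it to go away."""
    route_table_id = current_association(ec2, attachment_id)
    if route_table_id is None:
        return  # nothing to remove
    ec2.disassociate_transit_gateway_route_table(
        TransitGatewayRouteTableId=route_table_id,
        TransitGatewayAttachmentId=attachment_id,
    )
    # Disassociation is asynchronous, so poll until the attachment reports none.
    for _ in range(max_polls):
        if current_association(ec2, attachment_id) is None:
            return
        time.sleep(poll_seconds)
    raise TimeoutError(f"{attachment_id} is still associated with {route_table_id}")


def ensure_association(ec2, attachment_id, route_table_id):
    """After the pipeline run: create the association only if it is really missing.

    Returns True if an association was created, False if it already existed.
    """
    existing = current_association(ec2, attachment_id)
    if existing == route_table_id:
        return False
    if existing is not None:
        raise RuntimeError(f"{attachment_id} is still associated with {existing}")
    ec2.associate_transit_gateway_route_table(
        TransitGatewayRouteTableId=route_table_id,
        TransitGatewayAttachmentId=attachment_id,
    )
    return True
```

One way to use it: call `remove_association(ec2, attachment_id)` before re-running the pipeline, then call `ensure_association(ec2, attachment_id, new_route_table_id)` once the pipeline finishes to check whether the association the stack reports actually exists.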

To Reproduce

  1. Deploy a transit gateway, transit gateway route table and a transit gateway attachment via LZA.
  2. Update the configuration to add a new transit gateway route table and update the transit gateway attachment so it is associated with the new transit gateway route table.
  3. Run the pipeline.

Expected behavior

LZA removes the existing association, then creates the new association.

erwaxler commented 1 year ago

Hi @zakbatinica-reply, thank you for your interest in the Landing Zone Accelerator on AWS. When making updates to resources with dependencies, we recommend making multiple, small changes over multiple pipeline runs rather than making several modifications at once. As you observed, this is most commonly encountered with networking resources. I highly recommend reading this page that highlights best practices when making updates to resources with dependencies. Keep in mind you do not have to wait for a CodePipeline execution to complete before starting the CodePipeline again.

zakbatinica-reply commented 1 year ago

Thanks for sharing the doc, I will keep that in mind when making changes 🙂

I think this is still a bug, in that LZA should be able to support changing the transit gateway route table without a disruptive step such as needing to remove the transit gateway attachment first. If that doesn't fit with the intent of LZA then let me know, and I will close this.

erwaxler commented 1 year ago

No problem! The root of the problem is that CloudFormation will always perform CREATE operations before DELETE operations. Because a TGW Attachment can only have one association at a time, you must remove the previous association before adding a new one. We've detailed this behavior in our TransitGatewayAttachmentConfig documentation:

CAUTION: Changing this value after initial deployment causes a new association to be created. Attachments can only have a single association at a time. To avoid core pipeline failures, use multiple core pipeline runs to 1) delete the existing association and then 2) add the new association.
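The ordering problem and the two-run fix can be illustrated with a toy model. This is a sketch under stated assumptions, not how CloudFormation or LZA is implemented: the `Attachment` class and `apply_update` function are hypothetical, and the only behaviors modeled are the two described in this thread (creates run before deletes, and an attachment accepts a single association).

```python
class Attachment:
    """Toy TGW attachment that allows at most one route table association."""

    def __init__(self):
        self.associations = set()

    def associate(self, route_table):
        if self.associations:
            existing = next(iter(self.associations))
            raise RuntimeError(f"attachment is already associated with {existing}")
        self.associations.add(route_table)

    def disassociate(self, route_table):
        self.associations.discard(route_table)


def apply_update(attachment, desired):
    """Apply one 'pipeline run': all CREATE operations first, then all DELETEs."""
    current = set(attachment.associations)
    for route_table in sorted(desired - current):
        attachment.associate(route_table)      # CREATE
    for route_table in sorted(current - desired):
        attachment.disassociate(route_table)   # DELETE


attachment = Attachment()
apply_update(attachment, {"old-rt"})           # initial deployment

# Single run that swaps the route table: the CREATE hits the existing association.
try:
    apply_update(attachment, {"new-rt"})
except RuntimeError as err:
    print("single run failed:", err)

# Two runs: 1) delete the existing association, 2) add the new one.
apply_update(attachment, set())
apply_update(attachment, {"new-rt"})
print("after two runs:", attachment.associations)
```

In the single-run case the create is attempted while the old association still exists, which mirrors the "already associated" failure reported in the Network_Associations stage.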

Since LZA operations are built on top of CloudFormation, this isn't something the team will be able to implement at this time. I'm going to go ahead and close this issue. Thank you again for your interest in the LZA, and please continue to create issues where you see room for improvement.