hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: StackSet instance through OU did not loop through Account IDs and therefore it runs in a timeout #27288

Open fortunejuggle opened 1 year ago

fortunejuggle commented 1 year ago

Terraform Core Version

1.0.4

AWS Provider Version

4.35.0

Affected Resource(s)

We are trying to deploy a stack set to several accounts via Terraform, following this documentation: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudformation_stack_set https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudformation_stack_set_instance. We are using the organizational_unit_ids argument and the timeouts configuration block on aws_cloudformation_stack_set_instance.

The issue is that Terraform picks only the first account ID and performs the create, update, or delete against that account ID alone, so it runs into the timeout configured on the stack set instance. Stack instances in the other accounts are nevertheless also created based on this one account ID.

Expected Behavior

I understand the documentation to say that the timeout applies per instance, so Terraform should loop over the accounts; otherwise these settings are useless.

The output should be:

aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 10s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000001,our-region-1, 20s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000002,our-region-1, 30s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000003,our-region-1, 40s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000004,our-region-1, 50s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000005,our-region-1, 1m0s elapsed]

Actual Behavior

Terraform picks the first account ID and then uses that account ID to create, update, or destroy all stack instances. I am not sure whether it is relevant, but it is the lowest account ID.

Output:

aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 10s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 20s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 30s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 40s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 50s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 1m0s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 1m10s elapsed]
aws_cloudformation_stack_set_instance.name: Still destroying... [id=name,010000000000,our-region-1, 2h0m0s elapsed]

Error: error waiting for CloudFormation StackSet Instance (name,010000000000,our-region-1) deletion: timeout while waiting for state to become 'SUCCEEDED' (last state: 'RUNNING', timeout: 2h0m0s)

The account ID does not change, but approx. 70 stack instances in 70 other accounts are destroyed.

As a result, the operation continues in the background and is not canceled, even though Terraform terminates.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

resource "aws_cloudformation_stack_set" "this" {
  name             = "name"
  capabilities     = ["CAPABILITY_AUTO_EXPAND", "CAPABILITY_IAM"]
  permission_model = "SERVICE_MANAGED"
  auto_deployment {
    enabled                          = true
    retain_stacks_on_account_removal = false
  }
  operation_preferences {
    failure_tolerance_count   = 2
    max_concurrent_percentage = 35
    region_order              = ["<our-region-1>", "our-region-2"]
    region_concurrency_type   = "PARALLEL"
  }
  template_body = file("cloudformation/cf.yaml")
  tags          = local.default_tags
}

resource "aws_cloudformation_stack_set_instance" "this" {
  deployment_targets {
    organizational_unit_ids = [data.aws_organizations_organization.master.roots[0].id]
  }
  stack_set_name = aws_cloudformation_stack_set.this.name
  timeouts {
    create = "2h"
    update = "2h"
    delete = "2h"
  }

}

Steps to Reproduce

Try to create the stack set and stack set instances with the configuration above. Note that region_order needs to be changed and a valid CloudFormation template path must be provided.
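
The configuration above references data.aws_organizations_organization.master and local.default_tags without declaring them. A minimal sketch of the missing pieces follows, assuming the data source is declared without arguments and using a placeholder tag value; both are assumptions and not taken from the original setup:

```hcl
# Assumed declaration of the data source referenced in deployment_targets.
# It reads the organization of the calling account and exposes `roots`.
data "aws_organizations_organization" "master" {}

# Hypothetical placeholder; the real tag values were not shared.
locals {
  default_tags = {
    ManagedBy = "terraform"
  }
}
```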

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No response

fortunejuggle commented 1 year ago

I have some more output that may be relevant. Terraform changes the ID from the OU to the account ID, and therefore our stack instances are constantly destroyed and recreated.

# aws_cloudformation_stack_set_instance.name has been changed
  ~ resource "aws_cloudformation_stack_set_instance" "this" {
      + account_id             = "010000000000"
      ~ id                     = "name,<our ou>,our-region-1" -> "name,010000000000,our-region-1"
      + organizational_unit_id = "<our ou>"
      + parameter_overrides    = {}
      + region                 = "our-region-1"
        # (3 unchanged attributes hidden)
        # (2 unchanged blocks hidden)
    }

fortunejuggle commented 1 year ago

We tried the following as a workaround, but it does not work with the SERVICE_MANAGED permission model:

resource "aws_cloudformation_stack_set_instance" "name" {
  for_each = toset(sort(local.account_ids))

  # "$name" is a redacted placeholder for the stack set's resource name.
  stack_set_name = aws_cloudformation_stack_set.$name.name
  account_id     = each.key
  timeouts {
    create = "2h"
    update = "2h"
    delete = "2h"
  }
}

Error output:

Error: error waiting for CloudFormation StackSet Instance () creation: error creating CloudFormation StackSet ($name) Instance: ValidationError: StackSets with SERVICE_MANAGED permission model can only have OrganizationalUnit as target
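
Since the API accepts only organizational units as targets for SERVICE_MANAGED stack sets, a variant of this workaround might iterate over OU IDs instead of account IDs. This is an untested sketch: local.ou_ids is a hypothetical list of OU IDs and the resource name "per_ou" is made up for illustration. It is not confirmed that this avoids the timeout, because each instance may still wait on a single account within its OU.

```hcl
# Hypothetical list of OU IDs to target; replace with real values.
locals {
  ou_ids = ["<our ou 1>", "<our ou 2>"]
}

resource "aws_cloudformation_stack_set_instance" "per_ou" {
  # One stack set instance resource per organizational unit.
  for_each = toset(local.ou_ids)

  stack_set_name = aws_cloudformation_stack_set.this.name
  deployment_targets {
    organizational_unit_ids = [each.key]
  }
  timeouts {
    create = "2h"
    update = "2h"
    delete = "2h"
  }
}
```

Splitting by OU keeps each Terraform resource tied to a smaller set of accounts, which may at least narrow down which OU a stuck operation belongs to.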