aws-ia / terraform-aws-control_tower_account_factory

AWS Control Tower Account Factory
Apache License 2.0

aft-invoke-customizations step function fails at `identify_targets` state when the returned result size exceeds 256 KB #414

Closed shashanklmurthy closed 5 months ago

shashanklmurthy commented 5 months ago

AFT Version: 1.10.4

Terraform Version & Provider Versions

terraform version

>= 1.4.2

terraform providers

aws = {
     source  = "hashicorp/aws"
     version = ">= 3.15"
}

Bug Description

When invoking customizations through the aft-invoke-customizations step function with the input below,

Input

{
  "include": [
    {
      "type": "all"
    }
  ]
}

the identify_targets state fails with the error below.

Error

{
  "Error": "States.DataLimitExceeded",
  "Cause": "The state/task 'arn:aws:lambda:us-east-1:**************:function:aft-customizations-identify-targets' returned a result with a size exceeding the maximum number of bytes service limit."
}

The error occurs because the output of the aft-customizations-identify-targets Lambda exceeds the 256 KB Step Functions service quota on the input or output size of a task, state, or execution.
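As a rough illustration of the quota described above, the following sketch measures the UTF-8 size of a JSON-serialized payload and compares it with 256 KiB. The function names and the sample payload are hypothetical; the actual output shape of the aft-customizations-identify-targets Lambda is not shown in this issue.

```python
import json
from typing import Any

# Step Functions limits task, state, and execution input/output to 256 KiB.
STEP_FUNCTIONS_PAYLOAD_LIMIT_BYTES = 256 * 1024


def payload_size_bytes(payload: Any) -> int:
    """Return the size of the payload once serialized to JSON and UTF-8 encoded."""
    return len(json.dumps(payload).encode("utf-8"))


def exceeds_limit(payload: Any, limit: int = STEP_FUNCTIONS_PAYLOAD_LIMIT_BYTES) -> bool:
    """Return True if the serialized payload would be larger than the limit."""
    return payload_size_bytes(payload) > limit


if __name__ == "__main__":
    # Hypothetical payload: a list of placeholder account IDs, not real AFT output.
    sample = {"targets": [f"{i:012d}" for i in range(120)]}
    print(payload_size_bytes(sample), exceeds_limit(sample))
```

A check like this could be run against a captured Lambda result to see how close a given deployment is to the limit.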

We currently provision over a hundred AWS accounts using AFT, and the error prevents us from rolling out infrastructure updates to all, or a sizeable share, of the accounts without batching the updates.

To Reproduce

Steps to reproduce the behavior:

  1. Provision a large number of accounts using AFT (we have over 120 accounts managed by AFT)
  2. Go to Step Functions > State Machines > aft-invoke-customizations in the console.
  3. Start a new execution using the input from above
  4. The Lambda returns an output that exceeds the maximum size limit, and the identify_targets state fails.
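Steps 2 and 3 can also be scripted rather than run from the console. The sketch below assumes a boto3 Step Functions client is passed in and that the caller supplies the state machine ARN; the helper name `start_all_customizations` is hypothetical and is not part of AFT.

```python
import json
from typing import Any, Dict

# The same input as in the bug description: target every account.
INCLUDE_ALL_INPUT: Dict[str, Any] = {"include": [{"type": "all"}]}


def start_all_customizations(sfn_client: Any, state_machine_arn: str) -> Dict[str, Any]:
    """Start an aft-invoke-customizations execution that targets all accounts.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    """
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(INCLUDE_ALL_INPUT),
    )


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("stepfunctions")
#   start_all_customizations(client, "<ARN of the aft-invoke-customizations state machine>")
```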

Expected behavior

The identify_targets state of aft-invoke-customizations should be resilient to the States.DataLimitExceeded error that can occur when AFT manages a large number of accounts.
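One way such resilience might look, purely as a sketch and not AFT's actual implementation or planned fix, is to split the target list into batches whose serialized size stays under the quota. The helper name `batch_by_size` and the account IDs in the usage example are hypothetical.

```python
import json
from typing import Any, Iterator, List


def batch_by_size(items: List[Any], max_bytes: int = 256 * 1024) -> Iterator[List[Any]]:
    """Yield consecutive batches whose JSON-serialized UTF-8 size is at most max_bytes.

    A single item that is larger than max_bytes on its own is still yielded
    as a one-item batch, so callers should handle that case separately.
    Each candidate batch is re-serialized, which is simple but quadratic in
    the batch length; acceptable for a sketch, not tuned for large inputs.
    """
    batch: List[Any] = []
    for item in items:
        candidate = batch + [item]
        size = len(json.dumps(candidate).encode("utf-8"))
        if batch and size > max_bytes:
            # Adding this item would cross the limit: emit the current batch.
            yield batch
            batch = [item]
        else:
            batch = candidate
    if batch:
        yield batch


if __name__ == "__main__":
    # Placeholder account IDs with a tiny limit to make the batching visible.
    account_ids = ["111111111111", "222222222222", "333333333333"]
    for group in batch_by_size(account_ids, max_bytes=32):
        print(group)
```

In practice the limit passed in would need headroom for whatever else the state output contains besides the list itself.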

Sanjan611 commented 5 months ago

Hi @shashanklmurthy, thanks for bringing this up. We have a backlog item tracking the same issue (#298). Please follow that for updates!

Sanjan611 commented 5 months ago

Closing this issue as a duplicate of #298.