aws-ia / terraform-aws-control_tower_account_factory

AWS Control Tower Account Factory
Apache License 2.0

ThrottlingException when calling the ListPipelineExecutions API in Invoke Customization Stepfunction #351

Closed smoneyan closed 10 months ago

smoneyan commented 1 year ago

AFT Version: 1.10.1

Terraform Version & Provider Versions

terraform version

1.4.4

Bug Description

The parameter /aft/config/customizations/maximum_concurrent_customizations is set to 5.

The call to get_running_pipeline_count() fails with a ThrottlingException when it invokes the ListPipelineExecutions API.

https://github.com/aws-ia/terraform-aws-control_tower_account_factory/blob/main/src/aft_lambda/aft_customizations/aft_customizations_get_pipeline_executions.py#L26-L27

        pipelines = list_pipelines(session)
        running_pipelines = get_running_pipeline_count(session, pipelines)

https://github.com/aws-ia/terraform-aws-control_tower_account_factory/blob/main/sources/aft-lambda-layer/aft_common/codepipeline.py#L113-L116

        paginator = client.get_paginator("list_pipeline_executions")
        pages = paginator.paginate(pipelineName=name)
        for page in pages:
            pipeline_execution_summaries.extend(page["pipelineExecutionSummaries"])

To Reproduce

  1. Have a large number of pipelines. In our case list_pipelines(session) returns around 160.
  2. Run the following code locally. It fails with the exception shown below.
        pipelines = list_pipelines(session)
        running_pipelines = get_running_pipeline_count(session, pipelines)

Error: An error occurred (ThrottlingException) when calling the ListPipelineExecutions operation (reached max retries: 4): Rate exceeded

Expected behavior

It looks like the client uses the default boto3 retry settings (the error above reports 4 retries). Is it possible to raise this to a higher number? Doing so, as in the snippet below, solves the problem locally in my case.

Alternatively, is there any other way to reduce the number of API calls (instead of iterating over every CodePipeline pipeline) so that the throttling can be avoided?

from botocore.config import Config

# Raise the retry ceiling above the boto3 default.
config = Config(
    retries={
        "max_attempts": 10,
    }
)
client = session.client("codepipeline", config=config)
for name in pipeline_names:
    print("Pipeline Name: " + name)
    logger.info("Getting pipeline executions for " + name)
    pipeline_execution_summaries = []

    # Walk every page of the execution history for this pipeline.
    paginator = client.get_paginator("list_pipeline_executions")
    pages = paginator.paginate(pipelineName=name)
    for page in pages:
        pipeline_execution_summaries.extend(page["pipelineExecutionSummaries"])
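
On the question of reducing the number of API calls, one possible approach is to request only the newest execution for each pipeline instead of paginating through the whole history. The sketch below is not AFT's implementation. The function name count_running_pipelines is hypothetical, and it rests on two assumptions: that ListPipelineExecutions returns the most recent execution first, and that the status of that single execution is enough to decide whether the pipeline counts as running.

```python
from typing import Any, Iterable


def count_running_pipelines(client: Any, pipeline_names: Iterable[str]) -> int:
    """Count pipelines whose most recent execution is still in progress.

    Assumption: only the newest execution per pipeline matters, so a single
    ListPipelineExecutions request with maxResults=1 replaces a full
    pagination of the execution history.
    """
    running = 0
    for name in pipeline_names:
        # One request per pipeline, regardless of how long its history is.
        response = client.list_pipeline_executions(pipelineName=name, maxResults=1)
        summaries = response.get("pipelineExecutionSummaries", [])
        if summaries and summaries[0]["status"] == "InProgress":
            running += 1
    return running
```

With around 160 pipelines this still issues around 160 requests, but never more than one per pipeline, so pipelines with long histories no longer multiply the call count.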

Related Logs

{
  "error": "ClientError",
  "cause": {
    "errorMessage": "An error occurred (ThrottlingException) when calling the ListPipelineExecutions operation (reached max retries: 4): Rate exceeded",
    "errorType": "ClientError",
    "requestId": "86b55cb5-5f45-45ac-8c7c-8902c9c57456",
    "stackTrace": [
      "  File \"/var/task/aft_customizations_get_pipeline_executions.py\", line 27, in lambda_handler\n    running_pipelines = get_running_pipeline_count(session, pipelines)\n",
      "  File \"/opt/python/lib/python3.9/site-packages/aft_common/codepipeline.py\", line 115, in get_running_pipeline_count\n    for page in pages:\n",
      "  File \"/opt/python/lib/python3.9/site-packages/botocore/paginate.py\", line 269, in __iter__\n    response = self._make_request(current_kwargs)\n",
      "  File \"/opt/python/lib/python3.9/site-packages/botocore/paginate.py\", line 357, in _make_request\n    return self._method(**current_kwargs)\n",
      "  File \"/opt/python/lib/python3.9/site-packages/botocore/client.py\", line 530, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n",
      "  File \"/opt/python/lib/python3.9/site-packages/botocore/client.py\", line 960, in _make_api_call\n    raise error_class(parsed_response, operation_name)\n"
    ]
  }
}
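
The log above shows the ThrottlingException surfacing only after botocore has exhausted its own retries. A caller could also absorb such errors with its own exponential backoff. The following is a minimal sketch, not what AFT does: call_with_backoff and is_throttling_error are hypothetical helpers, and the error check assumes the exception carries a botocore-style response dictionary with an Error.Code entry.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_throttling_error(exc: Exception) -> bool:
    """Return True if exc looks like a botocore ClientError caused by throttling."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 8,
    base_delay: float = 0.5,
    max_delay: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying throttling errors with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttling_error(exc) or attempt == max_attempts:
                raise
            # Double the delay ceiling on each attempt, capped at max_delay,
            # and sleep a random amount below it to spread out retries.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")
```

A single request could then be wrapped as call_with_backoff(lambda: client.list_pipeline_executions(pipelineName=name)). Wrapping a whole paginator loop this way would restart from the first page on every retry, so it is better suited to individual calls.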

snebhu3 commented 1 year ago

@smoneyan thank you for reporting the issue. I have created a backlog item to address this.

stumins commented 11 months ago

Hi @smoneyan,

We just released AFT 1.10.4 with improved retry behavior to address this type of throttling issue.

Please upgrade to 1.10.4 and let us know if you continue to experience any throttling exceptions.

stumins commented 10 months ago

We haven't received any reports of continued throttling, so I'm going to close this issue as completed. Please feel free to open a new ticket if you continue to experience throttling issues on versions >= 1.10.4.