hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Assume Role Renewal happens exactly around expiration, causes some waitFor* operations to error due to expiration #37647

Open saedx1 opened 3 months ago

saedx1 commented 3 months ago

Terraform Core Version

1.5.6 - I don't think it matters

AWS Provider Version

v5.50.0

Affected Resource(s)

aws_ssm_association (but I am sure this affects other things)

Expected Behavior

I expect the provider to renew the assumed-role credentials somewhat before they expire. However, it seems to wait until the exact expiration time before renewing them.

I get the following error while waiting for an SSM association:

│ Error: waiting for SSM Association (3a74c276-0ee8-4eec-95b3-a694e23c7e9c) create: operation error SSM: DescribeAssociation, https response error StatusCode: 400, RequestID: 0981bb66-74c3-49f2-882d-859fe217f64c, api error ExpiredTokenException: The security token included in the request is expired

I believe the provider is repeatedly polling DescribeAssociation while it waits and, because the renewal happens right around the expiration time, one of these calls goes out with an already-expired token and fails (the call is not blocked while the renewal is in progress).

Actual Behavior

I would expect the provider to renew the credentials well before they expire OR block all API calls until the renewal completes.
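
To make the two options above concrete, here is a minimal sketch in Go of what I have in mind. It is not the provider's actual code: `Credentials`, `Cache`, `Refresh`, `Now`, and `simulate` are hypothetical names made up for illustration, and the 1-hour session and 5-minute window are assumed values. The cache refreshes once the clock enters a window before expiry, and the mutex makes concurrent callers wait for an in-flight refresh instead of sending a stale token.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Credentials is a hypothetical stand-in for temporary assumed-role credentials.
type Credentials struct {
	Token   string
	Expires time.Time
}

// Cache hands out credentials and refreshes them ExpiryWindow before they
// expire. Holding the mutex during the refresh means other callers block
// until the new credentials are available.
type Cache struct {
	mu           sync.Mutex
	creds        Credentials
	ExpiryWindow time.Duration
	Refresh      func() Credentials // hypothetical: a real one would call sts:AssumeRole
	Now          func() time.Time   // injectable clock so the example is deterministic
}

// Retrieve returns valid credentials, refreshing them early if needed.
func (c *Cache) Retrieve() Credentials {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Refresh once we are at or past (Expires - ExpiryWindow).
	if !c.Now().Before(c.creds.Expires.Add(-c.ExpiryWindow)) {
		c.creds = c.Refresh()
	}
	return c.creds
}

// simulate walks a fake clock through one session and records the token
// returned at each step.
func simulate() []string {
	now := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	refreshes := 0
	cache := &Cache{
		ExpiryWindow: 5 * time.Minute, // assumed value
		Now:          func() time.Time { return now },
		Refresh: func() Credentials {
			refreshes++
			return Credentials{
				Token:   fmt.Sprintf("token-%d", refreshes),
				Expires: now.Add(time.Hour), // assumed 1-hour session
			}
		},
	}

	var tokens []string
	tokens = append(tokens, cache.Retrieve().Token) // first call fetches credentials
	now = now.Add(50 * time.Minute)
	tokens = append(tokens, cache.Retrieve().Token) // 10 minutes left: reused
	now = now.Add(6 * time.Minute)
	tokens = append(tokens, cache.Retrieve().Token) // 4 minutes left: refreshed early
	return tokens
}

func main() {
	fmt.Println(simulate()) // prints [token-1 token-1 token-2]
}
```

For reference, the AWS SDK for Go v2 has a comparable setting (`ExpiryWindow` on `aws.CredentialsCacheOptions`). I have not verified whether or how the provider sets it.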

Relevant Error/Panic Output Snippet

│ Error: waiting for SSM Association (3a74c276-0ee8-4eec-95b3-a694e23c7e9c) create: operation error SSM: DescribeAssociation, https response error StatusCode: 400, RequestID: 0981bb66-74c3-49f2-882d-859fe217f64c, api error ExpiredTokenException: The security token included in the request is expired

Terraform Configuration Files

```tf
provider "aws" {
  region = var.aws_region

  dynamic "assume_role" {
    for_each = var.assume_provider_role ? [1] : []
    content {
      role_arn     = var.assume_provider_role ? var.target_deployment_role : ""
      session_name = var.assume_provider_role ? "terraform" : ""
    }
  }
  default_tags {
    tags = merge(local.tags, var.tags)
  }
}

...

resource "aws_ssm_association" "invoke_ssm_document" {
  association_name                 = "SQLAutomationDocumentAssociation"
  name                             = "XXX"
  wait_for_success_timeout_seconds = 5400
  parameters = {
    ...
  }
}
```

Steps to Reproduce

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

Yes
