hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: AWS Account Factory for Terraform RAM Resource Share issues #36791

Open jordan-k-gilead opened 7 months ago

jordan-k-gilead commented 7 months ago

Terraform Core Version

~> 1.3

AWS Provider Version

5.30.0

Affected Resource(s)

aws_ram_resource_share_accepter
aws_ram_resource_share_accepter.tgw_vpc_attach

Expected Behavior

Account Request with connected VPC type is successfully provisioned via AFT

Actual Behavior

Our customer is using AWS Account Factory for Terraform and a few account requests are failing with the following error:

Error: reading RAM Resource Share (arn:aws:ram:us-west-2:xxxxxxxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx): couldn't find resource

with aws_ram_resource_share_accepter.tgw_vpc_attach[0],
on vpc_connected_baseline.tf line 34, in resource "aws_ram_resource_share_accepter" "tgw_vpc_attach":
34: resource "aws_ram_resource_share_accepter" "tgw_vpc_attach" {

Relevant Error/Panic Output Snippet

From the log on the account *xxxxxxxx.

  1. GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
  2. Another GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
  3. GetResourceShares at 2024-02-02T15:15:08.000Z encountered "UnknownResourceException".
  4. AcceptResourceShareInvitation at 2024-02-02T15:15:08.000Z
  5. GetResourceShares has succeeded since 2024-02-02T16:47:28.000Z with exactly the same request parameters as #3:

"requestParameters": {
  "resourceShareArns": [
    "arn:aws:ram:us-west-2:xxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx"
  ],
  "resourceOwner": "OTHER-ACCOUNTS"
},

Terraform Configuration Files

terraform {
  required_version = "~> 1.3"

  # AWS provider version pin to 5.30.0 for CORE-340
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.30.0"
    }
  }
}
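
The failing resource from vpc_connected_baseline.tf is not included in this report. For readers unfamiliar with it, here is a minimal, hypothetical sketch of what an accepter resource with that address might look like. The variable name and the count condition are assumptions for illustration only; the [0] index in the error output suggests count is in use, but the real condition is not shown.

```hcl
# Hypothetical sketch, not the actual AFT configuration.
variable "tgw_share_arn" {
  description = "ARN of the RAM resource share to accept (assumed input name)."
  type        = string
  default     = ""
}

resource "aws_ram_resource_share_accepter" "tgw_vpc_attach" {
  # Assumed condition: create the accepter only when a share ARN is supplied.
  count = var.tgw_share_arn != "" ? 1 : 0

  # share_arn is the argument the provider uses to find and accept the invitation.
  share_arn = var.tgw_share_arn
}
```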

Steps to Reproduce

  1. An account request is submitted by merging the Terraform file into the account-request repo.
  2. The CI/CD pipeline kicks off to create the account.
  3. The pipeline runs a series of checks, as well as terraform init, terraform plan, and terraform apply.

Debug Output

From the log on the account *xxxxxxxx.

  1. GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
  2. Another GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
  3. GetResourceShares at 2024-02-02T15:15:08.000Z encountered "UnknownResourceException".
  4. AcceptResourceShareInvitation at 2024-02-02T15:15:08.000Z
  5. GetResourceShares has succeeded since 2024-02-02T16:47:28.000Z with exactly the same request parameters as #3:

"requestParameters": {
  "resourceShareArns": [
    "arn:aws:ram:us-west-2:xxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx"
  ],
  "resourceOwner": "OTHER-ACCOUNTS"
},

Panic Output

No response

Important Factoids

This is related to a previously identified timing issue in the processing of RAM shares. Stabilizing it requires a code fix. Manual Terraform state manipulation works as a mitigation, but the failure can recur if certain changes are made.

The current workaround is to run terraform init and plan in the pipeline, let it fail, then untaint the resources and run the pipeline again. This was not happening before, but it is becoming more frequent as we scale the number of RAM resource shares. We have already tried adding gaps (time_sleep resources) between the resource creations.
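
For reference, the kind of delay described above can be sketched as follows. This is only an illustration under stated assumptions: it needs the hashicorp/time provider, the 60s duration is arbitrary, and the downstream attachment resource and its variables are hypothetical. Note that a delay placed after the accepter only affects dependent resources. Judging by the log excerpt, the failing GetResourceShares call happens around the accept itself, so a sleep like this may not prevent the error.

```hcl
# Hypothetical sketch. Requires the hashicorp/time provider to be declared
# in the module's required_providers block.

variable "transit_gateway_id" {
  type = string # assumed input: ID of the shared Transit Gateway
}

variable "vpc_id" {
  type = string # assumed input: VPC to attach
}

variable "subnet_ids" {
  type = list(string) # assumed input: subnets used for the attachment
}

# Wait after the share has been accepted before creating dependents.
# aws_ram_resource_share_accepter.tgw_vpc_attach is the resource named in the
# error output and is assumed to be defined elsewhere in the module.
resource "time_sleep" "wait_for_ram_share" {
  depends_on      = [aws_ram_resource_share_accepter.tgw_vpc_attach]
  create_duration = "60s" # arbitrary value chosen for illustration
}

# Hypothetical downstream resource that relies on the accepted share.
resource "aws_ec2_transit_gateway_vpc_attachment" "example" {
  depends_on = [time_sleep.wait_for_ram_share]

  transit_gateway_id = var.transit_gateway_id
  vpc_id             = var.vpc_id
  subnet_ids         = var.subnet_ids
}
```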

References

No response

Would you like to implement a fix?

None
