Expected Behavior
Account Request with connected VPC type is successfully provisioned via AFT
Actual Behavior
Our customer uses AWS Account Factory for Terraform (AFT), and a few account requests are failing with the following error:
Error: reading RAM Resource Share (arn:aws:ram:us-west-2:xxxxxxxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx): couldn't find resource
with aws_ram_resource_share_accepter.tgw_vpc_attach[0],
on vpc_connected_baseline.tf line 34, in resource "aws_ram_resource_share_accepter" "tgw_vpc_attach":
34: resource "aws_ram_resource_share_accepter" "tgw_vpc_attach" {
Relevant Error/Panic Output Snippet
From the log for account *xxxxxxxx:
1. GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
2. Another GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
3. GetResourceShares at 2024-02-02T15:15:08.000Z failed with "UnknownResourceException".
4. AcceptResourceShareInvitation at 2024-02-02T15:15:08.000Z
5. GetResourceShares has succeeded since 2024-02-02T16:47:28.000Z with exactly the same request parameters as call 3:
"requestParameters": {
"resourceShareArns": [
"arn:aws:ram:us-west-2:xxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx"
],
"resourceOwner": "OTHER-ACCOUNTS"
},
Terraform Configuration Files
terraform {
  required_version = "~> 1.3"
  # AWS provider version pinned to 5.30.0 for CORE-340
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.30.0"
    }
  }
}
Steps to Reproduce
An account request is submitted by merging the TF file into the account-request repo.
The CI/CD pipeline kicks off to create the account.
The pipeline runs a series of checks, followed by terraform init, terraform plan, and terraform apply.
Debug Output
From the log for account *xxxxxxxx:
1. GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
2. Another GetResourceShareInvitations at 2024-02-02T15:15:06.000Z
3. GetResourceShares at 2024-02-02T15:15:08.000Z failed with "UnknownResourceException".
4. AcceptResourceShareInvitation at 2024-02-02T15:15:08.000Z
5. GetResourceShares has succeeded since 2024-02-02T16:47:28.000Z with exactly the same request parameters as call 3:
"requestParameters": {
"resourceShareArns": [
"arn:aws:ram:us-west-2:xxxxxxx:resource-share/xxxxxxxxxxxxxxxxxxxxx"
],
"resourceOwner": "OTHER-ACCOUNTS"
},
Panic Output
No response
Important Factoids
This is related to a previously identified timing issue in processing RAM shares, which requires either a code fix to stabilize or manual Terraform state manipulation as mitigation (the failure can recur in the future if certain changes are made).
The current workaround is to run the TF init and plan for the pipeline, let it fail, then untaint the resources and run the pipeline again. This was not happening before, but it is becoming more frequent as we scale the number of RAM resource shares. We have already tried adding delays ("time_sleep" resources) in between the resource creations.
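For readers unfamiliar with the delay approach mentioned above, a minimal sketch of a time_sleep placed ahead of the accepter might look like the following. Only the aws_ram_resource_share_accepter.tgw_vpc_attach address comes from this report. The variable names, the principal association resource, the 60s duration, and the time provider version constraint are assumptions for illustration and are not taken from the customer's configuration. As noted, delays of this kind have already been tried.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.30.0"
    }
    # Assumption: the hashicorp/time provider supplies time_sleep.
    time = {
      source  = "hashicorp/time"
      version = ">= 0.9"
    }
  }
}

# Hypothetical inputs; names are illustrative only.
variable "tgw_share_arn" {
  type = string
}

variable "member_account_id" {
  type = string
}

# Hypothetical: associates the new account with the share. In a real AFT
# setup this would typically run under the share owner's provider alias.
resource "aws_ram_principal_association" "member" {
  resource_share_arn = var.tgw_share_arn
  principal          = var.member_account_id
}

# Fixed delay after the association is created (duration is an assumption).
resource "time_sleep" "wait_for_ram_share" {
  create_duration = "60s"
  depends_on      = [aws_ram_principal_association.member]
}

# The accepter waits for the delay before it looks up the invitation.
resource "aws_ram_resource_share_accepter" "tgw_vpc_attach" {
  count      = 1
  share_arn  = var.tgw_share_arn
  depends_on = [time_sleep.wait_for_ram_share]
}
```

A fixed delay only moves the accepter later in the apply. It does not guarantee that the share is visible to GetResourceShares by the time the resource is read.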
Terraform Core Version
~> 1.3
AWS Provider Version
5.30.0
Affected Resource(s)
aws_ram_resource_share_accepter
aws_ram_resource_share_accepter.tgw_vpc_attach
References
No response
Would you like to implement a fix?
None