hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: After a chain of InternalFailure/ThrottlingError a DynamoDB table was created but not registered in Terraform's state #30460

Open Veetaha opened 1 year ago

Veetaha commented 1 year ago

Terraform Core Version

1.3.7

AWS Provider Version

4.17.1

Affected Resource(s)

aws_dynamodb_table

Expected Behavior

The DynamoDB table should be created successfully even after a chain of InternalFailure/ThrottlingException error responses. If Terraform sees that the table already exists after such a chain of errors, it could describe the table to verify that it was created with the expected configuration, and then include it in the state.
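
A minimal sketch of that recovery logic, written against a hypothetical `TableAPI` interface rather than the real AWS SDK or the provider's code. `TableSpec`, `APIError`, `createOrAdopt`, and `replayAPI` are illustrative names, and the fake assumes that the call which returned InternalFailure is the one that actually created the table, which the CloudTrail events do not confirm.

```go
package main

import (
	"errors"
	"fmt"
)

// TableSpec is a hypothetical, simplified view of the table configuration.
type TableSpec struct {
	Name, BillingMode, HashKey, RangeKey string
}

// APIError is a hypothetical stand-in for an AWS error that carries an error code.
type APIError struct{ Code, Message string }

func (e *APIError) Error() string { return e.Code + ": " + e.Message }

// TableAPI is a hypothetical interface, not the AWS SDK client.
type TableAPI interface {
	CreateTable(spec TableSpec) error
	DescribeTable(name string) (TableSpec, error)
}

// codeOf extracts the error code, or "" if err is not an *APIError.
func codeOf(err error) string {
	var apiErr *APIError
	if errors.As(err, &apiErr) {
		return apiErr.Code
	}
	return ""
}

// createOrAdopt retries CreateTable on transient errors. If a
// ResourceInUseException follows at least one transient error, an earlier
// attempt may have succeeded server-side, so the table is described and
// adopted only when its configuration matches what was requested.
func createOrAdopt(api TableAPI, want TableSpec, maxAttempts int) (TableSpec, error) {
	sawTransient := false
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := api.CreateTable(want)
		if err == nil {
			return want, nil
		}
		switch codeOf(err) {
		case "ThrottlingException", "InternalFailure":
			sawTransient = true // outcome of this attempt is unknown; retry
		case "ResourceInUseException":
			if !sawTransient {
				return TableSpec{}, err // the table was not ours to begin with
			}
			got, descErr := api.DescribeTable(want.Name)
			if descErr != nil {
				return TableSpec{}, descErr
			}
			if got != want {
				return TableSpec{}, fmt.Errorf("table %q exists with a different configuration", want.Name)
			}
			return got, nil
		default:
			return TableSpec{}, err
		}
	}
	return TableSpec{}, fmt.Errorf("giving up after %d attempts", maxAttempts)
}

// replayAPI is a test double that replays a fixed sequence of error codes.
type replayAPI struct {
	codes   []string
	calls   int
	created *TableSpec
}

func (f *replayAPI) CreateTable(spec TableSpec) error {
	if f.calls >= len(f.codes) {
		f.created = &spec
		return nil
	}
	code := f.codes[f.calls]
	f.calls++
	if code == "InternalFailure" {
		f.created = &spec // assumption: the failed call still created the table
	}
	return &APIError{Code: code, Message: "replayed"}
}

func (f *replayAPI) DescribeTable(name string) (TableSpec, error) {
	if f.created == nil || f.created.Name != name {
		return TableSpec{}, &APIError{Code: "ResourceNotFoundException", Message: name}
	}
	return *f.created, nil
}

func main() {
	want := TableSpec{Name: "example-retention-policies", BillingMode: "PAY_PER_REQUEST", HashKey: "pk", RangeKey: "sk"}
	api := &replayAPI{codes: []string{"ThrottlingException", "InternalFailure", "ThrottlingException", "ResourceInUseException"}}
	got, err := createOrAdopt(api, want, 5)
	fmt.Println(got.Name, err)
}
```

Adopting the table only after a transient error was seen, and only when the described configuration matches, keeps the sketch from silently taking over a table that something else created under the same name.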

Actual Behavior

Terraform failed during DynamoDB table creation with a ResourceInUseException error, even though the DynamoDB table was actually created. As a result, the table was not registered in Terraform's state, and we had to delete it manually to clean up the leaked resource.

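In CloudTrail I found that Terraform made 4 CreateTable API calls, and all of them got error responses. Their errorCode (errorMessage) fields, in order, were:

1. ThrottlingException (An unknown error occurred)
2. InternalFailure (An unknown error occurred)
3. ThrottlingException (An unknown error occurred)
4. ResourceInUseException (Attempt to change a resource which is still in use: Table is being created: elastio-qlopck-exp-1680648814-retention-policies)

The 4 full CloudTrail events, redacted to remove irrelevant identifiers and tagging info, are under the spoiler below.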

Details

```json
[
  {
    "eventVersion": "1.08",
    "userIdentity": {
      "type": "IAMUser",
      "principalId": "REDACTED",
      "arn": "arn:aws:iam::REDACTED:user/REDACTED",
      "accountId": "REDACTED",
      "accessKeyId": "REDACTED",
      "userName": "REDACTED"
    },
    "eventTime": "2023-04-04T16:53:46Z",
    "eventSource": "dynamodb.amazonaws.com",
    "eventName": "CreateTable",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "R.E.D.A.C.T.E.D",
    "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.3.7 (+https://www.terraform.io) terraform-provider-aws/dev (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.25 (go1.17.6; linux; amd64)",
    "errorCode": "ThrottlingException",
    "errorMessage": "An unknown error occurred",
    "requestParameters": {
      "attributeDefinitions": [
        { "attributeName": "pk", "attributeType": "S" },
        { "attributeName": "sk", "attributeType": "S" }
      ],
      "tableName": "elastio-qlopck-exp-1680648814-retention-policies",
      "keySchema": [
        { "attributeName": "pk", "keyType": "HASH" },
        { "attributeName": "sk", "keyType": "RANGE" }
      ],
      "billingMode": "PAY_PER_REQUEST",
      "tags": [
        { "key": "REDACTED", "value": "REDACTED_THERE_WERE_12_WELL_DEFINED_TAGS_HERE" }
      ]
    },
    "responseElements": null,
    "requestID": "REDACTED",
    "eventID": "REDACTED",
    "readOnly": false,
    "resources": [
      {
        "accountId": "REDACTED",
        "type": "AWS::DynamoDB::Table",
        "ARN": "arn:aws:dynamodb:eu-central-1:REDACTED:table/elastio-qlopck-exp-1680648814-retention-policies"
      }
    ],
    "eventType": "AwsApiCall",
    "apiVersion": "2012-08-10",
    "managementEvent": true,
    "recipientAccountId": "REDACTED",
    "eventCategory": "Management",
    "tlsDetails": {
      "tlsVersion": "TLSv1.2",
      "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
      "clientProvidedHostHeader": "dynamodb.eu-central-1.amazonaws.com"
    }
  },
  {
    "eventVersion": "1.08",
    "userIdentity": {
      "type": "IAMUser",
      "principalId": "REDACTED",
      "arn": "arn:aws:iam::REDACTED:user/REDACTED",
      "accountId": "REDACTED",
      "accessKeyId": "REDACTED",
      "userName": "REDACTED"
    },
    "eventTime": "2023-04-04T16:53:46Z",
    "eventSource": "dynamodb.amazonaws.com",
    "eventName": "CreateTable",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "R.E.D.A.C.T.E.D",
    "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.3.7 (+https://www.terraform.io) terraform-provider-aws/dev (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.25 (go1.17.6; linux; amd64)",
    "errorCode": "InternalFailure",
    "errorMessage": "An unknown error occurred",
    "requestParameters": {
      "attributeDefinitions": [
        { "attributeName": "pk", "attributeType": "S" },
        { "attributeName": "sk", "attributeType": "S" }
      ],
      "tableName": "elastio-qlopck-exp-1680648814-retention-policies",
      "keySchema": [
        { "attributeName": "pk", "keyType": "HASH" },
        { "attributeName": "sk", "keyType": "RANGE" }
      ],
      "billingMode": "PAY_PER_REQUEST",
      "tags": [
        { "key": "REDACTED", "value": "REDACTED_THERE_WERE_12_WELL_DEFINED_TAGS_HERE" }
      ]
    },
    "responseElements": null,
    "requestID": "REDACTED",
    "eventID": "REDACTED",
    "readOnly": false,
    "resources": [
      {
        "accountId": "REDACTED",
        "type": "AWS::DynamoDB::Table",
        "ARN": "arn:aws:dynamodb:eu-central-1:REDACTED:table/elastio-qlopck-exp-1680648814-retention-policies"
      }
    ],
    "eventType": "AwsApiCall",
    "apiVersion": "2012-08-10",
    "managementEvent": true,
    "recipientAccountId": "REDACTED",
    "eventCategory": "Management",
    "tlsDetails": {
      "tlsVersion": "TLSv1.2",
      "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
      "clientProvidedHostHeader": "dynamodb.eu-central-1.amazonaws.com"
    }
  },
  {
    "eventVersion": "1.08",
    "userIdentity": {
      "type": "IAMUser",
      "principalId": "REDACTED",
      "arn": "arn:aws:iam::REDACTED:user/REDACTED",
      "accountId": "REDACTED",
      "accessKeyId": "REDACTED",
      "userName": "REDACTED"
    },
    "eventTime": "2023-04-04T16:53:48Z",
    "eventSource": "dynamodb.amazonaws.com",
    "eventName": "CreateTable",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "R.E.D.A.C.T.E.D",
    "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.3.7 (+https://www.terraform.io) terraform-provider-aws/dev (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.25 (go1.17.6; linux; amd64)",
    "errorCode": "ThrottlingException",
    "errorMessage": "An unknown error occurred",
    "requestParameters": {
      "attributeDefinitions": [
        { "attributeName": "pk", "attributeType": "S" },
        { "attributeName": "sk", "attributeType": "S" }
      ],
      "tableName": "elastio-qlopck-exp-1680648814-retention-policies",
      "keySchema": [
        { "attributeName": "pk", "keyType": "HASH" },
        { "attributeName": "sk", "keyType": "RANGE" }
      ],
      "billingMode": "PAY_PER_REQUEST",
      "tags": [
        { "key": "REDACTED", "value": "REDACTED_THERE_WERE_12_WELL_DEFINED_TAGS_HERE" }
      ]
    },
    "responseElements": null,
    "requestID": "REDACTED",
    "eventID": "REDACTED",
    "readOnly": false,
    "resources": [
      {
        "accountId": "REDACTED",
        "type": "AWS::DynamoDB::Table",
        "ARN": "arn:aws:dynamodb:eu-central-1:REDACTED:table/elastio-qlopck-exp-1680648814-retention-policies"
      }
    ],
    "eventType": "AwsApiCall",
    "apiVersion": "2012-08-10",
    "managementEvent": true,
    "recipientAccountId": "REDACTED",
    "eventCategory": "Management",
    "tlsDetails": {
      "tlsVersion": "TLSv1.2",
      "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
      "clientProvidedHostHeader": "dynamodb.eu-central-1.amazonaws.com"
    }
  },
  {
    "eventVersion": "1.08",
    "userIdentity": {
      "type": "IAMUser",
      "principalId": "REDACTED",
      "arn": "arn:aws:iam::REDACTED:user/REDACTED",
      "accountId": "REDACTED",
      "accessKeyId": "REDACTED",
      "userName": "REDACTED"
    },
    "eventTime": "2023-04-04T16:53:51Z",
    "eventSource": "dynamodb.amazonaws.com",
    "eventName": "CreateTable",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "R.E.D.A.C.T.E.D",
    "userAgent": "APN/1.0 HashiCorp/1.0 Terraform/1.3.7 (+https://www.terraform.io) terraform-provider-aws/dev (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.25 (go1.17.6; linux; amd64)",
    "errorCode": "ResourceInUseException",
    "errorMessage": "Attempt to change a resource which is still in use: Table is being created: elastio-qlopck-exp-1680648814-retention-policies",
    "requestParameters": {
      "attributeDefinitions": [
        { "attributeName": "pk", "attributeType": "S" },
        { "attributeName": "sk", "attributeType": "S" }
      ],
      "tableName": "elastio-qlopck-exp-1680648814-retention-policies",
      "keySchema": [
        { "attributeName": "pk", "keyType": "HASH" },
        { "attributeName": "sk", "keyType": "RANGE" }
      ],
      "billingMode": "PAY_PER_REQUEST",
      "tags": [
        { "key": "REDACTED", "value": "REDACTED_THERE_WERE_12_WELL_DEFINED_TAGS_HERE" }
      ]
    },
    "responseElements": null,
    "requestID": "REDACTED",
    "eventID": "REDACTED",
    "readOnly": false,
    "resources": [
      {
        "accountId": "REDACTED",
        "type": "AWS::DynamoDB::Table",
        "ARN": "arn:aws:dynamodb:eu-central-1:REDACTED:table/elastio-qlopck-exp-1680648814-retention-policies"
      }
    ],
    "eventType": "AwsApiCall",
    "apiVersion": "2012-08-10",
    "managementEvent": true,
    "recipientAccountId": "REDACTED",
    "eventCategory": "Management",
    "tlsDetails": {
      "tlsVersion": "TLSv1.2",
      "cipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
      "clientProvidedHostHeader": "dynamodb.eu-central-1.amazonaws.com"
    }
  }
]
```

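For anyone who wants to pull the same errorCode (errorMessage) summary out of an exported list of CloudTrail events, here is a small sketch. It assumes the events are available as a JSON array like the one above; the `event` struct, the `summarize` function, and the trimmed `sample` data are illustrative and keep only the fields needed for the summary.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event keeps only the CloudTrail fields needed for the summary.
type event struct {
	EventTime    string `json:"eventTime"`
	EventName    string `json:"eventName"`
	ErrorCode    string `json:"errorCode"`
	ErrorMessage string `json:"errorMessage"`
}

// summarize returns one "time name: code (message)" line per event,
// in the order the events appear in the input array.
func summarize(raw []byte) ([]string, error) {
	var events []event
	if err := json.Unmarshal(raw, &events); err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(events))
	for _, e := range events {
		lines = append(lines, fmt.Sprintf("%s %s: %s (%s)", e.EventTime, e.EventName, e.ErrorCode, e.ErrorMessage))
	}
	return lines, nil
}

// sample is a trimmed, illustrative subset of two events.
const sample = `[
  {"eventTime": "2023-04-04T16:53:46Z", "eventName": "CreateTable",
   "errorCode": "ThrottlingException", "errorMessage": "An unknown error occurred"},
  {"eventTime": "2023-04-04T16:53:46Z", "eventName": "CreateTable",
   "errorCode": "InternalFailure", "errorMessage": "An unknown error occurred"}
]`

func main() {
	lines, err := summarize([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```
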
Relevant Error/Panic Output Snippet

╷
  │ Error: error creating DynamoDB Table: ResourceInUseException: Attempt to change a resource which is still in use: Table is being created: elastio-qlopck-exp-1680648814-retention-policies
  │ 
  │   with module.retention.aws_dynamodb_table.retention_policies,
  │   on ../../../modules/retention/main.tf line 14, in resource "aws_dynamodb_table" "retention_policies":
  │   14: resource "aws_dynamodb_table" "retention_policies" {
  │ 
  ╵

Terraform Configuration Files

resource "aws_dynamodb_table" "retention_policies" {
  name         = "elastio-qlopck-exp-1680648814-retention-policies"
  billing_mode = "PAY_PER_REQUEST"

  tags = {
    // There were 12 well-defined tags here
    foo = "bar"
  }

  hash_key  = "pk"
  range_key = "sk"

  attribute {
    name = "pk"
    type = "S"
  }

  attribute {
    name = "sk"
    type = "S"
  }
}

Steps to Reproduce

There are no defined steps to reproduce this behavior, because it depends on AWS itself misbehaving, which is rare. In our case, AWS returned an InternalFailure error in between the ThrottlingException errors.

We hit this while running a lot of integration tests in parallel. Each test deploys a Terraform stack with its own DynamoDB table, so throttling errors are expected given the number of independent Terraform stacks, each with a unique DynamoDB table, that we deploy on CI.

So basically what we did is that we called

But again, you won't be able to reproduce the AWS failures unless you mock them in your tests, using the 4 CloudTrail events I posted above as mock data.
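
One way such a mock could look is an HTTP handler that replays the four error responses in order. This is only a sketch: `replayedError`, `cloudTrailSequence`, `newReplayHandler`, and `callOnce` are made-up names, the table name is a placeholder, and the HTTP status codes and the simplified `__type`/`message` body are my approximation of DynamoDB's JSON error format rather than captured responses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// replayedError is one canned response. Status and body shape are assumptions.
type replayedError struct {
	Status  int
	Code    string
	Message string
}

// cloudTrailSequence mirrors the errorCode/errorMessage order of the four events.
var cloudTrailSequence = []replayedError{
	{Status: 400, Code: "ThrottlingException", Message: "An unknown error occurred"},
	{Status: 500, Code: "InternalFailure", Message: "An unknown error occurred"},
	{Status: 400, Code: "ThrottlingException", Message: "An unknown error occurred"},
	{Status: 400, Code: "ResourceInUseException", Message: "Attempt to change a resource which is still in use: Table is being created: example-table"},
}

// newReplayHandler answers each request with the next canned error and
// returns 200 with an empty body once the sequence is exhausted.
func newReplayHandler(seq []replayedError) http.Handler {
	var mu sync.Mutex
	next := 0
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		i := next
		next++
		mu.Unlock()
		if i >= len(seq) {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.Header().Set("Content-Type", "application/x-amz-json-1.0")
		w.WriteHeader(seq[i].Status)
		_ = json.NewEncoder(w).Encode(map[string]string{"__type": seq[i].Code, "message": seq[i].Message})
	})
}

// callOnce sends one request straight to the handler and returns the HTTP
// status and the error code found in the response body ("" if none).
func callOnce(h http.Handler) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader("{}"))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	var body struct {
		Type string `json:"__type"`
	}
	_ = json.Unmarshal(rec.Body.Bytes(), &body)
	return rec.Code, body.Type
}

func main() {
	h := newReplayHandler(cloudTrailSequence)
	for i := 0; i < len(cloudTrailSequence); i++ {
		status, code := callOnce(h)
		fmt.Println(status, code)
	}
}
```

To exercise a real client against it, the handler could be wrapped with `httptest.NewServer(h)` and the client's endpoint pointed at the server's URL; the handler would also need to simulate the table actually being created, which this sketch leaves out.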

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None
