hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Error deleting aws_ses_event_destination #27942

Open mkielar opened 1 year ago

mkielar commented 1 year ago

Terraform Core Version

1.3.1

AWS Provider Version

4.40.0

Affected Resource(s)

Expected Behavior

Both resources were already gone before the error described in pt. 1 of "Actual Behavior" appeared. So it seems Terraform / AWS deleted the resources successfully, but Terraform then failed to handle the result. I would expect Terraform to continue once it verifies that both resources are gone.
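To make this expectation concrete, here is a minimal Go sketch of an idempotent delete that treats "already gone" responses as success. It is not the provider's actual code: `codedError`, `apiError`, `isAlreadyGone`, and `deleteEventDestination` are hypothetical names, and the error type is a local stand-in so the sketch compiles without the AWS SDK. `ConfigurationSetDoesNotExist` is the code from the output quoted in this report; `EventDestinationDoesNotExist` is assumed to be the matching SES code for a missing event destination.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError mirrors the shape of an SDK error that exposes an error code.
// It is a local stand-in (assumption), not the AWS SDK's own type.
type codedError interface {
	error
	Code() string
}

// apiError is a hypothetical error type used only for this illustration.
type apiError struct {
	code string
	msg  string
}

func (e *apiError) Error() string { return e.code + ": " + e.msg }
func (e *apiError) Code() string  { return e.code }

// isAlreadyGone reports whether err means the event destination, or its
// parent configuration set, no longer exists.
func isAlreadyGone(err error) bool {
	var ce codedError
	if !errors.As(err, &ce) {
		return false
	}
	switch ce.Code() {
	case "ConfigurationSetDoesNotExist", // code seen in this report
		"EventDestinationDoesNotExist": // assumed sibling code
		return true
	}
	return false
}

// deleteEventDestination runs the supplied delete call and treats
// "already gone" responses as success, so a repeated destroy is a no-op.
func deleteEventDestination(del func() error) error {
	err := del()
	if err == nil || isAlreadyGone(err) {
		return nil
	}
	return fmt.Errorf("deleting SES event destination: %w", err)
}

func main() {
	// Simulate the response from the report: the parent set is already gone.
	err := deleteEventDestination(func() error {
		return &apiError{
			code: "ConfigurationSetDoesNotExist",
			msg:  "Configuration set <staging> does not exist.",
		}
	})
	fmt.Println(err == nil) // prints "true"
}
```

With this shape, any other error (throttling, access denied, and so on) would still surface to the user, while a missing parent or child would simply end the delete.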

Actual Behavior

The failure was twofold:

  1. First, Terraform failed with the following output:
    │ Error: ConfigurationSetDoesNotExist: Configuration set <staging> does not exist.
    │   status code: 400, request id: 3fd8613d-fcf2-49d7-9c28-f8aee9a6a536
    ╷
    │ Error: ConfigurationSetDoesNotExist: Configuration set <prod> does not exist.
    │   status code: 400, request id: 6637eeb7-5e8c-443f-9956-073c4a772310

    What's interesting at this point is that the template also contained two other pairs of Configuration Set and Event Destination, and all of them were deleted at the same time. Yet only the ones for prod and staging caused the issue.

Note that at this point all Configuration Sets and Event Destinations were gone. I believe that one of the following happened:

Note also that at this point terraform plan showed no changes to perform, but terraform apply failed one more time (see pt. 2 below).

  2. Once the above happened, I retried the terraform apply job in my CI, hoping it would just resolve itself. This time, the error was different:
    
    │ Error: Resource node has no configuration attached
    │ 
    │ The graph node for
    │ module.shared_account_baseline.module.ses_domain_identity["staging"].aws_ses_event_destination.cloudwatch
    │ has no configuration attached to it. This suggests a bug in Terraform's
    │ apply graph builder; please report it!

    │ Error: Resource node has no configuration attached
    │ 
    │ The graph node for
    │ module.shared_account_baseline.module.ses_domain_identity["prod"].aws_ses_event_destination.cloudwatch
    │ has no configuration attached to it. This suggests a bug in Terraform's
    │ apply graph builder; please report it!


After this error I restarted the `terraform apply` CI Job one more time, and this time it passed.

Relevant Error/Panic Output Snippet

N/A

Unfortunately that was an Azure DevOps CI Pipeline, so I just restarted the job several times, until, finally, terraform apply passed.

Terraform Configuration Files

The following resources were provisioned within a reusable ses module and were then removed:

resource "aws_ses_configuration_set" "this" {
  name = var.name

  delivery_options {
    tls_policy = "Require"
  }
}

resource "aws_ses_event_destination" "cloudwatch" {
  name                   = var.name
  configuration_set_name = aws_ses_configuration_set.this.name
  enabled                = true
  matching_types         = ["send", "reject", "bounce", "delivery"]

  cloudwatch_destination {
    default_value  = "default"
    dimension_name = "ses:from-domain"
    value_source   = "emailHeader"
  }
}

The interesting part is that the aws_ses_event_destination clearly depends on the aws_ses_configuration_set via the configuration_set_name attribute, and yet it seems that Terraform deleted the ConfigurationSet first and only then attempted to delete the EventDestination. Alternatively, there was some eventual consistency issue in AWS that Terraform was not prepared for.

Steps to Reproduce

  1. Deploy a ConfigurationSet and EventDestination as explained in Terraform Configuration Files
  2. Remove the resources from HCL
  3. Run terraform apply
  4. Hope for the best. Unfortunately, I was not able to reproduce this issue, and it seems pretty random.

Debug Output

N/A

Panic Output

N/A.

Important Factoids

The issue originally appeared for only two ConfigurationSet <=> EventDestination pairs out of the four we had on the given AWS account, and on only one AWS account out of the five that we cleaned up in that CI pipeline.

This looks like either a race condition or Terraform not being prepared for some internal inconsistency in the AWS API after the resources are deleted.

References

No response

Would you like to implement a fix?

No response
