hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Cannot delete FSX File System when creation has failed and `skip_final_backup` is set to false. #32116

Open bgshacklett opened 1 year ago

bgshacklett commented 1 year ago

Terraform Core Version

1.4.6

AWS Provider Version

4.67.0

Affected Resource(s)

aws_fsx_windows_file_system

Expected Behavior

I should be able to delete the file system.

Actual Behavior

When running a second apply, which will recreate the file system, I receive an error stating:

│ Error: deleting FSx Windows File System (fs-<id>): BadRequest: Cannot take backup while fs-<id> is in FAILED and the file system storage is unhealthy

If I then set the skip_final_backup option to true and apply again, I get an error indicating that the file system can't be modified:

module.fsx_filesystem_windows.aws_fsx_windows_file_system.main: Modifying... [id=fs-<id>]
╷
│ Error: updating FSx Windows File System (fs-<id>): BadRequest: Cannot update file system fs-<id> while in FAILED state.

Relevant Error/Panic Output Snippet

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.fsx_filesystem_windows.aws_fsx_windows_file_system.main will be updated in-place
  ~ resource "aws_fsx_windows_file_system" "main" {
        id                                = "fs-<id>"
      ~ skip_final_backup                 = false -> true
        tags                              = {
            "Name" = "fsx"
        }
        # (17 unchanged attributes hidden)

        # (2 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.fsx_filesystem_windows.aws_fsx_windows_file_system.main: Modifying... [id=fs-<id>]
╷
│ Error: updating FSx Windows File System (fs-<id>): BadRequest: Cannot update file system fs-<id> while in FAILED state.
│
│   with module.fsx_filesystem_windows.aws_fsx_windows_file_system.main,
│   on ../../fsx.tf line 1, in resource "aws_fsx_windows_file_system" "main":
│    1: resource "aws_fsx_windows_file_system" "main" {
│
╵

Terraform Configuration Files

private

Steps to Reproduce

  1. Create an FSx for Windows file system with the following configuration:
     1.1. An incorrect password
     1.2. The skip_final_backup value set to false
  2. The FSx file system will be created, but it will end up in the FAILED (unhealthy) state. You should receive output similar to the following:
    │ Error: waiting for FSx Windows File System (fs-<id>) create: unexpected state 'FAILED', wanted target 'AVAILABLE'. last error: File system creation failed. Amazon FSx is unable to communicate with your Microsoft Active Directory domain controller(s). This is because Amazon FSx can't reach the DNS servers provided or domain controllers for your domain. To fix this problem, delete your file system and create a new one with valid DNS servers and networking configuration that allows traffic from the file system to the domain controller as recommended in the Amazon FSx user guide: https://docs.aws.amazon.com/fsx/latest/WindowsGuide/self-manage-prereqs.html.
  3. Fix the password and attempt to run another apply. You should receive an error stating that a final backup cannot be taken:
    │ Error: deleting FSx Windows File System (fs-<id>): BadRequest: Cannot take backup while fs-<id> is in FAILED and the file system storage is unhealthy

    The file system resource will be tainted at this point.

  4. Untaint the file system:
    terraform untaint <path.to.file.system>
  5. Set the skip_final_backup option to true.
  6. Run another apply:
    terraform apply
  7. The apply fails with an error indicating that the file system cannot be updated:
    │ Error: updating FSx Windows File System (fs-<id>): BadRequest: Cannot update file system fs-<id> while in FAILED state.
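
The configuration used in this report is private, so the following is only a hypothetical minimal sketch of what step 1 might look like. It assumes the "incorrect password" is the one for a self-managed Active Directory service account, which the report does not state explicitly. The variable, subnet, domain name, DNS IPs, username, password, and capacity values are all placeholders, not values from the original report.

```hcl
# Hypothetical reproduction sketch; every value below is a placeholder.
variable "subnet_id" {
  type        = string
  description = "ID of an existing subnet for the file system (assumed)."
}

resource "aws_fsx_windows_file_system" "main" {
  storage_capacity    = 32
  throughput_capacity = 32
  subnet_ids          = [var.subnet_id]

  # Step 1.2: request a final backup when the file system is deleted.
  skip_final_backup = false

  self_managed_active_directory {
    dns_ips     = ["10.0.0.10", "10.0.0.11"]
    domain_name = "corp.example.com"
    username    = "fsx-service"

    # Step 1.1 (assumed): a deliberately wrong password, so that creation
    # ends in the FAILED state.
    password = "not-the-real-password"
  }

  tags = {
    Name = "fsx"
  }
}
```

With a configuration along these lines, correcting the password in step 3 would be expected to force a replacement of the tainted resource, which is where the failed final backup in the delete step shows up.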

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

bgshacklett commented 1 year ago

This is another case where a "meta-argument" (skip_final_backup), a flag that only controls an imperative action, is stored in the state file. Because it is stored in the state file, an apply is needed to update the state before the change can take effect. That is, successfully deleting the file system would take two applies: one to set the flag and one more to actually delete the file system.

Unfortunately, while this should be a state-only update, the provider appears to send an update request for the resource, and AWS immediately rejects it because the file system is in the FAILED state.