yanhuiyi commented 1 year ago

Terraform Core Version

= 1.2.0

AWS Provider Version

~> 4.16

Affected Resource(s)

aws_db_instance

Expected Behavior

The RDS replica instance shouldn't be recreated on every apply.

Actual Behavior

Summary of the terraform apply output:

# aws_db_instance.main_replica must be replaced
-/+ resource "aws_db_instance" "main_replica" {
      ~ address                               = "xxx-replica-dev.clfntrkokco7.ap-northeast-2.rds.amazonaws.com" -> (known after apply)
      ~ allocated_storage                     = 80 -> (known after apply)
      ~ arn                                   = "arn:aws:rds:ap-northeast-2:092492597114:db:xxx-replica-dev" -> (known after apply)
      ~ availability_zone                     = "ap-northeast-2a" -> (known after apply)
      ~ backup_retention_period               = 0 -> (known after apply)
      ~ backup_window                         = "17:27-17:57" -> (known after apply)
      ~ ca_cert_identifier                    = "rds-ca-2019" -> (known after apply)
      + character_set_name                    = (known after apply)
      - customer_owned_ip_enabled             = false -> null
      ~ db_name                               = "xxx" -> (known after apply)
      ~ db_subnet_group_name                  = "terraform-20230510021049001800000001" -> (known after apply)
      - deletion_protection                   = false -> null
      - enabled_cloudwatch_logs_exports       = [] -> null
      ~ endpoint                              = "xxx-replica-dev.clfntrkokco7.ap-northeast-2.rds.amazonaws.com:5432" -> (known after apply)
      ~ engine                                = "postgres" -> (known after apply)
      ~ engine_version                        = "13.10" -> (known after apply)
      ~ engine_version_actual                 = "13.10" -> (known after apply)
      ~ hosted_zone_id                        = "ZLA2NUCOLGUUR" -> (known after apply)
      - iam_database_authentication_enabled   = false -> null
      ~ id                                    = "xxx-replica-dev" -> (known after apply)
      + identifier_prefix                     = (known after apply)
      ~ iops                                  = 3000 -> (known after apply)
      ~ kms_key_id                            = "arn:aws:kms:ap-northeast-2:092492597114:key/62dfb2b1-38b2-4ab8-93d3-c3caaf12daaf" -> (known after apply)
      + latest_restorable_time                = (known after apply)
      ~ license_model                         = "postgresql-license" -> (known after apply)
      ~ listener_endpoint                     = [] -> (known after apply)
      ~ maintenance_window                    = "thu:14:01-thu:14:31" -> (known after apply)
      ~ master_user_secret                    = [] -> (known after apply)
      + master_user_secret_kms_key_id         = (known after apply)
      - max_allocated_storage                 = 0 -> null
      + monitoring_role_arn                   = (known after apply)
      ~ multi_az                              = false -> (known after apply)
      ~ name                                  = "xxx" -> (known after apply)
      + nchar_character_set_name              = (known after apply)
      ~ network_type                          = "IPV4" -> (known after apply)
      ~ option_group_name                     = "default:postgres-13" -> (known after apply)
      + performance_insights_kms_key_id       = (known after apply)
      ~ performance_insights_retention_period = 0 -> (known after apply)
      ~ port                                  = 5432 -> (known after apply)
      + replica_mode                          = (known after apply)
      ~ replicas                              = [] -> (known after apply)
      ~ resource_id                           = "db-4RXU2QM4CEORASUHIPMX32IEP4" -> (known after apply)
      - security_group_names                  = [] -> null
      + snapshot_identifier                   = (known after apply)
      ~ status                                = "available" -> (known after apply)
      - storage_encrypted                     = true -> null # forces replacement
      ~ storage_throughput                    = 125 -> (known after apply)
      ~ storage_type                          = "gp3" -> (known after apply)
      - tags                                  = {} -> null
      + timezone                              = (known after apply)
      ~ username                              = "xxx" -> (known after apply)
        # (14 unchanged attributes hidden)
    }

Relevant Error/Panic Output Snippet

Part of the output during the apply:

aws_db_instance.main_replica: Still destroying... [id=xxx-replica-dev, 4m0s elapsed]
aws_db_instance.main_replica: Still destroying... [id=xxx-replica-dev, 4m10s elapsed]
aws_db_instance.main_replica: Still destroying... [id=xxx-replica-dev, 4m20s elapsed]
aws_db_instance.main_replica: Still destroying... [id=xxx-replica-dev, 4m30s elapsed]
aws_db_instance.main_replica: Destruction complete after 4m30s
aws_db_instance.main_replica: Creating...
aws_db_instance.main_replica: Still creating... [10s elapsed]
aws_db_instance.main_replica: Still creating... [20s elapsed]
aws_db_instance.main_replica: Still creating... [30s elapsed]
aws_db_instance.main_replica: Still creating... [40s elapsed]
aws_db_instance.main_replica: Still creating... [50s elapsed]
...
aws_db_instance.main_replica: Still creating... [12m11s elapsed]
aws_db_instance.main_replica: Creation complete after 12m14s [id=xxx-replica-dev]

Terraform Configuration Files

resource "aws_db_instance" "main" {
  db_name                 = var.db_name
  identifier              = join("-", [var.db_name, lower(var.environment)])
  allocated_storage       = var.db_storage_device.size     # gigabytes
  backup_retention_period = var.db_backup_retention_period # in days
  apply_immediately       = true
  db_subnet_group_name    = aws_db_subnet_group.main.name
  availability_zone       = aws_subnet.private1.availability_zone
  engine                  = "postgres"
  engine_version          = var.db_engine_version
  instance_class          = var.db_instance_class
  multi_az                = false
  parameter_group_name    = aws_db_parameter_group.main.name
  password                = local.db_creds.password
  port                    = 5432
  publicly_accessible     = true
  storage_encrypted       = true # you should always do this
  storage_type            = var.db_storage_device.type
  username                = local.db_creds.username
  vpc_security_group_ids  = [aws_security_group.allow-postgresql.id]
  skip_final_snapshot     = true
}

resource "aws_db_instance" "main_replica" {
  identifier             = join("-", [var.db_name, "replica", lower(var.environment)])
  replicate_source_db    = aws_db_instance.main.identifier
  instance_class         = var.db_instance_class
  apply_immediately      = true
  skip_final_snapshot    = true
  vpc_security_group_ids = [aws_security_group.allow-postgresql.id]
  parameter_group_name   = aws_db_parameter_group.main.name
}

Steps to Reproduce

1. Change the configuration of any resource other than the RDS resources.
2. Run terraform apply.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

timothyclarke commented 1 year ago

Does the replica's source have storage encryption enabled? When I've set up DB replication, many of the replica's properties were inherited from the source database. The replica was created fine, but on the next run Terraform tried to 'correct' the config drift. In my cases, the right way to 'correct the config drift' was to update the Terraform config to match, rather than to apply the .tf file as written.

In your case, storage_encrypted = true looks to be what is causing the replacement. You can add a lifecycle rule to ignore that attribute once the DB replica is created.

yanhuiyi commented 1 year ago

Thank you @timothyclarke! Adding the property to ignore_changes has been working fine so far: lifecycle { ignore_changes = [storage_encrypted] }
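
For readers landing here, the workaround confirmed above can be sketched against the replica resource from the original report. This is a minimal sketch, not an official fix: it assumes the replica inherits encryption from its source (as the plan output suggests, since the state holds storage_encrypted = true while the config leaves it unset), and it reuses the variable and resource names from the reporter's configuration.

```hcl
resource "aws_db_instance" "main_replica" {
  identifier             = join("-", [var.db_name, "replica", lower(var.environment)])
  replicate_source_db    = aws_db_instance.main.identifier
  instance_class         = var.db_instance_class
  apply_immediately      = true
  skip_final_snapshot    = true
  vpc_security_group_ids = [aws_security_group.allow-postgresql.id]
  parameter_group_name   = aws_db_parameter_group.main.name

  lifecycle {
    # The state records storage_encrypted = true (taken from the source
    # instance) while this block does not set it, so every plan shows
    # "true -> null # forces replacement". Ignoring the attribute stops
    # Terraform from acting on that difference.
    ignore_changes = [storage_encrypted]
  }
}
```

One trade-off of this approach is that Terraform will also ignore any deliberate future change to storage_encrypted on this resource, so the rule is best kept to the single attribute that triggers the replacement.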

garbelini commented 1 year ago

I just experienced something similar, but in my case it seems to have originated from an AWS API error caused by overlapping backup and maintenance window settings. For me, customer_owned_ip_enabled, tags, and enabled_cloudwatch_logs_exports were triggering a resource replacement.

Why AWS doesn't validate this before proceeding with a very time-consuming and expensive operation is beyond me.

Edit: Validating for overlap on those windows in the provider would be nice!

hashicorp / terraform-provider-aws

[Bug]: RDS replica instance destroying/create every time apply even there's no changes to RDS #31325
