data source - latest-cluster-snapshot failed to detect the latest snapshot

SongGithub commented 2 years ago

Terraform CLI and Terraform AWS Provider Version

v1.0.9

Affected Resource(s)

data.aws_db_cluster_snapshot.latest_snapshot

Terraform Configuration Files

data "aws_db_cluster_snapshot" "latest_snapshot" {
  db_cluster_identifier = "foo-db"
  most_recent           = true
}

resource "aws_rds_cluster" "aurora_postgres" {
  apply_immediately               = true
  availability_zones              = var.availability_zones
  backup_retention_period         = 35
  cluster_identifier              = "${var.cluster_identifier}-${random_string.cluster_id_salt.result}"
  database_name                   = "${var.dbname}"
  db_cluster_parameter_group_name = "${aws_rds_cluster_parameter_group.custom-cluster-param-grp.name}"
  db_subnet_group_name            = "${aws_db_subnet_group.db_subnet_group.name}"
  deletion_protection             = false
  engine                          = "${var.engine}"
  engine_version                  = "${var.engine_version}"
  final_snapshot_identifier       = "${var.cluster_identifier-snapshot}-${random_string.cluster_id_salt.result}"
  kms_key_id                      = "${var.server-side-kms}"
  master_password                 = "changeme99"
  master_username                 = "${var.username}"
  port                            = 5432
  preferred_backup_window         = "${var.preferred_backup_window}"
  preferred_maintenance_window    = "${var.preferred_maintenance_window}"
  skip_final_snapshot             = false
  storage_encrypted               = true
  vpc_security_group_ids          = ["${aws_security_group.db_security_group.id}"]
  snapshot_identifier             = data.aws_db_cluster_snapshot.latest_snapshot.id
}

Debug Output

Panic Output

Expected Behavior

It should detect the latest manual snapshot and update the DB cluster with it.

Actual Behavior

It does not. The data source still resolves to an old snapshot:

module.aurora-restore.data.aws_db_cluster_snapshot.latest_snapshot: Refreshing state... [id=manual-2021-08-18-09-00-foo-db-aurora-postgresql]

0 to update

Steps to Reproduce

terraform apply

justinretzolk commented 2 years ago

Hi @SongGithub 👋 Thank you for taking the time to file this issue. So that we have all of the necessary information in order to look into this, can you provide debug logs as well?

SongGithub commented 2 years ago

Hi @justinretzolk, module.aurora-restore.data.aws_db_cluster_snapshot.latest_snapshot: Refreshing state... [id=manual-2021-08-18-09-00-foo-db-aurora-postgresql] is all I can share due to IP restrictions. It indicates that the data source is not aware of the newer snapshots.

SongGithub commented 2 years ago

I was expecting it to pick up a more recent snapshot, such as the ones created in November, but it picked up one from August.

This confuses me, given that I have set the following:

data "aws_db_cluster_snapshot" "latest_snapshot" {
  db_cluster_identifier = "foo-db"
  most_recent           = true
}

justinretzolk commented 2 years ago

Hey @SongGithub 👋 Thank you for the additional updates. I attempted to reproduce this, but so far haven't been able to using Terraform 1.0.10 and AWS provider 3.64.1. I'm not sure what version of the AWS provider that you're using, but didn't see anything that seemed related in the CHANGELOG; can you confirm what version you're using?

Something that came to mind as a possible way to troubleshoot this would be to use the API or CLI to see what you get back and whether it differs from what you're seeing in the Terraform configuration.

So that you have it, here's my reproduction configuration. Note that after the main RDS cluster was created, and before adding the backup one, I've manually created a snapshot in the console, to try to mimic your scenario as closely as possible.

provider "aws" {
  region = "us-east-1"
}

variable "password" {}

resource "aws_rds_cluster" "main" {
  cluster_identifier  = "test-main"
  availability_zones  = ["us-east-1a", "us-east-1b", "us-east-1c"]
  engine              = "aurora-postgresql"
  master_password     = var.password
  master_username     = "testuser"
  skip_final_snapshot = true
}

data "aws_db_cluster_snapshot" "latest" {
  db_cluster_identifier = aws_rds_cluster.main.cluster_identifier
  most_recent           = true
}

resource "aws_rds_cluster" "backup" {
  cluster_identifier  = "test-backup"
  availability_zones  = ["us-east-1a", "us-east-1b", "us-east-1c"]
  engine              = "aurora-postgresql"
  skip_final_snapshot = true
  snapshot_identifier = data.aws_db_cluster_snapshot.latest.id
}

// Just to see what it actually outputs, in case there's a difference
output "snapshot_id" {
  value = data.aws_db_cluster_snapshot.latest.id
}

After a successful terraform apply, I took another snapshot in the console, ran terraform plan, and received the following:

$ terraform plan
aws_rds_cluster.main: Refreshing state... [id=test-main]
aws_rds_cluster.backup: Refreshing state... [id=test-backup]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # aws_rds_cluster.backup will be updated in-place
  ~ resource "aws_rds_cluster" "backup" {
        id                                  = "test-backup"
      ~ snapshot_identifier                 = "test-4" -> "test-5"
        tags                                = {}
        # (31 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Changes to Outputs:
  ~ snapshot_id = "test-4" -> "test-5"

SongGithub commented 2 years ago

@justinretzolk I aligned my AWS provider and Terraform versions with yours, but it is still not grabbing the latest snapshot: snapshot_identifier = "manual-2021-08-19-09-00-foo-db-aurora-postgresql". Given that we take a manual snapshot every day, the snapshot it selects is actually almost 3 months old!

Something that came to mind as a possible way to troubleshoot this would be to use the API or CLI to see what you get back and whether it differs from what you're seeing in the Terraform configuration.

Yes, the AWS CLI does give me what I need. If I run aws rds describe-db-cluster-snapshots --db-cluster-identifier foo-db --snapshot-type manual --query="reverse(sort_by(DBClusterSnapshots, &SnapshotCreateTime))[0]" --region us-east-1, I get the following result:

    "DBClusterSnapshotIdentifier": "songjin-manual-1",
    "DBClusterIdentifier": "foo-db",
    "SnapshotCreateTime": "2021-11-08T02:34:58.929000+00:00",

By the way, can you point me to the source code that Terraform uses to query the latest snapshot?
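
One way to compare the two results from the Terraform side, as a sketch only: the CLI query above filters on --snapshot-type manual, while the data source in the original configuration sets no snapshot_type. The variant below adds that filter and outputs snapshot_create_time so the selected snapshot can be checked against the CLI result. It is meant to replace the existing data block. The output names are hypothetical, and nothing in this thread confirms that the filter changes which snapshot is selected.

```hcl
data "aws_db_cluster_snapshot" "latest_snapshot" {
  db_cluster_identifier = "foo-db"
  most_recent           = true
  snapshot_type         = "manual" # mirrors --snapshot-type manual in the CLI call
}

# Hypothetical output names, added only for comparison with the CLI result.
output "latest_snapshot_id" {
  value = data.aws_db_cluster_snapshot.latest_snapshot.id
}

output "latest_snapshot_create_time" {
  value = data.aws_db_cluster_snapshot.latest_snapshot.snapshot_create_time
}
```

If snapshot_create_time here is older than the SnapshotCreateTime returned by the CLI, that difference would be concrete evidence to attach to this issue.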

anunna-trumid commented 1 year ago

@justinretzolk Hi, I am facing a similar issue when trying to fetch the latest manual snapshot on RDS Postgres. It picks up a manual snapshot that is maybe 15 days old, but not the latest one. The issue does not occur for "automated" backups. I am using version = "4.51.0". Terraform picks the following snapshot: source_db_snapshot_identifier = "instance-snap-2022-12-30-08-30", whereas the AWS CLI recognizes the latest manual snapshot as: "SourceDBSnapshotIdentifier": "arn:aws:rds:us-east-1:xxxxxxxxxx:snapshot:instance-snap-2023-01-25-00-30",

mmihaylov commented 1 month ago

I have a similar issue while using the Terraform rds-aurora module. Underneath, it creates an aws_rds_cluster with one aws_rds_cluster_instance. I am setting the snapshot_identifier and final_snapshot_identifier properties. Let me show you two scenarios.

Scenario 1 (separate delete and create):

I have a snapshot created 5 days ago (let's call it snapshot-5daysold). For the time being it is the latest one!
Add a new record in one of my tables (let's call it Charlie).
Remove rds-aurora module from my configuration and terraform apply.
Resources are deleted and new snapshot is created (let's call it snapshot-latest).
Add rds-aurora module back in my configuration and terraform apply.
Resources are created. My cluster is restored from the snapshot-latest. My newly added record (Charlie) exists in the table.

Scenario 2 (delete and create in one go):

I have a snapshot created 5 days ago (let's call it snapshot-5daysold). For the time being it is the latest one!
Add a new record in one of my tables (let's call it Daniel).
Start a new terraform apply that requires RDS cluster recreation (delete and create).
After the recreation I have a new snapshot (let's call it snapshot-today), BUT my RDS cluster is restored from snapshot-5daysold. My newly added record (Daniel) does NOT exist in my table.

My assumption is that the newest snapshot is created, but the state file already records the previous one. The state file is not updated with the latest snapshot, so the previous one is used for the restore.
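
To make Scenario 2 concrete, here is a minimal sketch of the kind of configuration that could produce it. It assumes the snapshot is looked up through a data source, and every name in it is hypothetical. It is not the rds-aurora module's actual code. In this shape, snapshot_identifier is resolved when the data source is read during planning, while the final snapshot only comes into existence later, when the old cluster is destroyed during apply.

```hcl
# Hypothetical names throughout; a sketch, not the rds-aurora module's code.
data "aws_db_cluster_snapshot" "latest" {
  db_cluster_identifier = "example-cluster"
  most_recent           = true
}

resource "aws_rds_cluster" "example" {
  cluster_identifier = "example-cluster"
  engine             = "aurora-postgresql"

  # Resolved when the data source is read during planning.
  snapshot_identifier = data.aws_db_cluster_snapshot.latest.id

  # Created only when the old cluster is destroyed during apply,
  # which is after the value above has already been fixed in the plan.
  skip_final_snapshot       = false
  final_snapshot_identifier = "example-cluster-final"
}
```

Under that assumption, a single apply that replaces the cluster would restore from the snapshot known at plan time, which matches Scenario 2. Splitting destroy and create into two applies, as in Scenario 1, gives the data source a chance to see the new snapshot.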

hashicorp / terraform-provider-aws