hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Terraform didn't wait for RDS to finish modifying #31445

Open gi02sap opened 1 year ago

gi02sap commented 1 year ago

Terraform Core Version

0.15.5

AWS Provider Version

3.76.1

Affected Resource(s)

No response

Expected Behavior

terraform apply should wait for RDS to finish modifying.

Actual Behavior

terraform apply, run to increase the memory of an RDS instance, returns about 31 seconds after the modification starts and reports that it completed successfully.

A subsequent apply then tries to make changes, but AWS rejects them because the instance is not in an available state.

Relevant Error/Panic Output Snippet

aws_db_instance.rds-postgresql_instance: Still modifying... [id=XXX, 10s elapsed]
aws_db_instance.rds-postgresql_instance: Still modifying... [id=XXX, 20s elapsed]
aws_db_instance.rds_postgresql_instance: Still modifying... [id=XXX, 30s elapsed]
Apply complete! Resources: 0 added, 2 changed, 0 destroyed.
aws_db_instance.rds_postgresql_instance: Modifications complete after 31s [id=XXX]
Terraform operation successfully completed!!!

Subsequent apply attempts lead to:
Error modifying DB Instance XXX: InvalidDBInstanceState: Database instance is not in available state.
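
One possible stopgap, sketched here under several assumptions, is to block the apply until AWS itself reports the instance as available. The sketch assumes the resource is named aws_db_instance.rds_postgresql_instance (taken from the log above), that the change is to instance_class, that the hashicorp/null provider is available, and that the AWS CLI is installed with credentials wherever Terraform runs. The resource name wait_for_rds_available is hypothetical. This works around the symptom and is not a fix for the provider's waiter.

```hcl
# Hypothetical helper: re-runs whenever the instance class changes.
resource "null_resource" "wait_for_rds_available" {
  triggers = {
    # Assumption: the "memory" change is a change of instance_class.
    instance_class = aws_db_instance.rds_postgresql_instance.instance_class
  }

  # Runs after aws_db_instance has been updated, because the command
  # references one of its attributes. "aws rds wait" polls until the
  # instance status is "available" or the waiter gives up.
  provisioner "local-exec" {
    command = "aws rds wait db-instance-available --db-instance-identifier ${aws_db_instance.rds_postgresql_instance.identifier}"
  }
}
```

With this in place, the apply should not return while the instance is still modifying, so a follow-up apply is less likely to hit InvalidDBInstanceState.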

Terraform Configuration Files

N/A

Steps to Reproduce

Apply a memory update to an RDS instance via Terraform.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

justinretzolk commented 1 year ago

Hey @gi02sap 👋 Thank you for taking the time to raise this! So that we have the information needed to look into this, can you supply a sample Terraform configuration and debug logs (redacted as needed)?

gi02sap commented 1 year ago

Hi @justinretzolk, unfortunately the entire deployment is based on customer-set pipelines, so I don't have access to the configuration, and the customer can't provide it or the debug logs. I searched a bit for this issue and found a similar one from the past: https://github.com/hashicorp/terraform-provider-aws/issues/1761

jkburges commented 1 year ago

I think I may have a reasonably minimal example that seems likely to share a cause with this bug report. Please let me know if you'd rather I raise a separate issue. It's distilled from some real code I'm working on, so it's probably not as minimal as it could be for reproducing the bug.

What I am seeing at the end of terraform apply is:

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.

Outputs:

database_connection_parameters = {
  "DB_HOSTNAME" = tostring(null)
  "DB_NAME" = "thedatabase"
  "DB_USERNAME" = "agm9kbklca"
}

Note the invalid value for DB_HOSTNAME in the output. I expect something like webapp-primary-snip.us-east-1.rds.amazonaws.com. In my real code, I feed this value to another resource (an Elastic Beanstalk environment), and the invalid value causes an error there.

Terraform config:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5"
    }
  }
}

variable "webapp_primary_max_allocated_storage" {
  default = 1000
}

resource "random_id" "identifier" {
  byte_length = 3
}

data "aws_vpc" "default" {
  default = true
}

resource "aws_db_parameter_group" "webapp-postgres14" {
  name   = "webapp-pg-14-${random_id.identifier.hex}"
  family = "postgres14"

  parameter {
    name  = "log_min_duration_statement"
    value = 100
  }

  parameter {
    name         = "rds.logical_replication"
    value        = 1
    apply_method = "pending-reboot"
  }

  parameter {
    name         = "shared_preload_libraries"
    value        = "pg_stat_statements,pglogical"
    apply_method = "pending-reboot"
  }
}

resource "random_string" "db_username" {
  length  = 10
  special = false
  numeric = true
}

resource "random_password" "db_password" {
  length  = 20
  special = false
}

resource "aws_security_group" "webapp_api" {
  name   = "webapp_api-${random_id.identifier.hex}"
  vpc_id = data.aws_vpc.default.id

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_security_group" "allow_postgres" {
  name = "postgres-${random_id.identifier.hex}"

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.webapp_api.id]
  }
}

resource "aws_db_instance" "webapp_primary" {
  identifier_prefix                   = "webapp-primary-"
  apply_immediately                   = true
  backup_retention_period             = 14
  deletion_protection                 = false
  engine                              = "postgres"
  engine_version                      = "14.3"
  instance_class                      = "db.t4g.micro"
  multi_az                            = false
  db_name                             = "thedatabase"
  storage_type                        = "gp2"
  storage_encrypted                   = true
  allocated_storage                   = 10
  max_allocated_storage               = var.webapp_primary_max_allocated_storage
  username                            = random_string.db_username.id
  password                            = random_password.db_password.result
  iam_database_authentication_enabled = true
  parameter_group_name                = aws_db_parameter_group.webapp-postgres14.name
  publicly_accessible                 = false
  performance_insights_enabled        = true
  vpc_security_group_ids = [
    aws_security_group.allow_postgres.id
  ]
  final_snapshot_identifier = "webapp-primary"
  skip_final_snapshot       = true
}

output "database_connection_parameters" {
  value = {
    DB_NAME : aws_db_instance.webapp_primary.db_name,
    DB_USERNAME : aws_db_instance.webapp_primary.username,
    DB_HOSTNAME : aws_db_instance.webapp_primary.address
  }
}
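
As a way to surface the symptom earlier, the output block above could carry a precondition, so that the apply fails instead of silently emitting a null hostname. This is only a sketch: it assumes Terraform 1.2 or later (the version reported below is 1.4.2) and would replace the output block above rather than sit beside it. The error message wording is made up for illustration. It does not make the provider wait any longer; it only turns the bad value into a visible error.

```hcl
output "database_connection_parameters" {
  value = {
    DB_NAME : aws_db_instance.webapp_primary.db_name,
    DB_USERNAME : aws_db_instance.webapp_primary.username,
    DB_HOSTNAME : aws_db_instance.webapp_primary.address
  }

  # Checked during apply once the address is known; fails the run if the
  # provider returned before the endpoint was assigned.
  precondition {
    condition     = aws_db_instance.webapp_primary.address != null
    error_message = "aws_db_instance.webapp_primary has no endpoint address yet; the instance may not be available."
  }
}
```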

The debug log (don't panic if you see a username or password in here; these values are generated and are no longer valid for anything): https://gist.github.com/jkburges/cf2083fb0b5f600bd2e73fcddae353af

This is what I see in the RDS console just after terraform apply finished (note the status and lack of any value for endpoint):

[image: RDS console screenshot]

Terraform and provider versions:

$ terraform -v
Terraform v1.4.2
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v5.2.0
+ provider registry.terraform.io/hashicorp/random v3.5.1

Your version of Terraform is out of date! The latest version
is 1.5.0. You can update by downloading from https://www.terraform.io/downloads.html