hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: aws_elasticache_replication_group engine version upgrade from 5.0.6 to 7.1 fails when tags also need modification #36017

Open sairamkotapati-github opened 6 months ago

sairamkotapati-github commented 6 months ago

Terraform Core Version

1.3.7

AWS Provider Version

4.67.0, 5.26.0

Affected Resource(s)

aws_elasticache_replication_group

Expected Behavior

Changing the Redis engine_version from 5.0.6 to 7.1 while also updating the resource's tags should succeed in a single terraform apply.

Actual Behavior

terraform apply fails with the error below: updating ElastiCache Replication Group (<replication-group-name>): InvalidReplicationGroupState: Replication group must be in available state to modify. Running terraform plan followed by terraform apply a second time succeeds, because the tags were already updated by the first attempt.

Additional details

After the failed apply, the AWS console shows that the tags are already updated on the cluster, which suggests a race condition between updating the tags and upgrading the engine version.

CloudTrail shows two back-to-back events, AddTagsToResource and ModifyReplicationGroup, issued at the same time.

Relevant Error/Panic Output Snippet

updating ElastiCache Replication Group (<replication-group-name>): InvalidReplicationGroupState: Replication group must be in available state to modify.

Terraform Configuration Files

// local.tags is a map of tag values defined in a locals block (not shown)

resource "aws_elasticache_replication_group" "redis" {
  replication_group_id       = "my-redis-rep-group"
  description                = "my redis cluster"
  num_cache_clusters         = 2
  node_type                  = "cache.t3.micro"
  automatic_failover_enabled = true
  engine                     = "redis"
  at_rest_encryption_enabled = true
  kms_key_id                 = local.cmk_arn
  transit_encryption_enabled = true
  auth_token                 = local.auth_token_generated
  engine_version             = "7.1"
  port                       = 6379
  parameter_group_name       = "default.redis7"
  subnet_group_name          = var.redis_subnet_group.id
  security_group_ids         = var.security_group_id
  apply_immediately          = true
  maintenance_window         = var.maintenance_window
  snapshot_window            = var.snapshot_window
  snapshot_retention_limit   = 35
  multi_az_enabled           = true

  // log delivery configuration depends on redis engine version - https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/Log_Delivery.html
  dynamic "log_delivery_configuration" {
    for_each = local.create_slow_log ? [1] : []
    content {
      destination      = aws_cloudwatch_log_group.elasticache_logs.name
      destination_type = "cloudwatch-logs"
      log_format       = "json"
      log_type         = "slow-log"
    }
  }
  dynamic "log_delivery_configuration" {
    for_each = local.create_engine_log ? [1] : []
    content {
      destination      = aws_cloudwatch_log_group.elasticache_logs.name
      destination_type = "cloudwatch-logs"
      log_format       = "json"
      log_type         = "engine-log"
    }
  }

  tags                    = local.tags
  replicas_per_node_group = null
  num_node_groups         = null
}

Steps to Reproduce

  1. Create a Redis cluster with cluster mode disabled using aws_elasticache_replication_group and engine_version 5.0.6.
  2. In a single change, update engine_version to 7.1 and modify the tags on the same aws_elasticache_replication_group resource.
  3. Run terraform apply. It fails with the InvalidReplicationGroupState error shown above.
  4. Rerun terraform plan and terraform apply. The second apply succeeds.
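
The steps above can be sketched as a single toggle that flips both inputs at once. This is only an illustration of the reproduction, not the reporter's actual configuration: the variable name `upgraded`, the tag keys and values, and the `default.redis5.0` parameter group for the 5.0.6 state are assumptions.

```hcl
# Hypothetical toggle: false = step 1 (initial state), true = step 2 (upgrade + tag change).
variable "upgraded" {
  type    = bool
  default = false
}

locals {
  # Engine version and matching parameter group change together.
  engine_version       = var.upgraded ? "7.1" : "5.0.6"
  parameter_group_name = var.upgraded ? "default.redis7" : "default.redis5.0" # 5.0 group name is assumed

  # Tag values are placeholders; tomap() keeps both branches the same type.
  tags_before = tomap({ Environment = "dev" })
  tags_after  = tomap({ Environment = "dev", Owner = "platform-team" })
  tags        = var.upgraded ? local.tags_after : local.tags_before
}

# In the resource from the report, the relevant arguments would then read:
#   engine_version       = local.engine_version
#   parameter_group_name = local.parameter_group_name
#   tags                 = local.tags
```

Under these assumptions, `terraform apply` creates the 5.0.6 cluster, and `terraform apply -var="upgraded=true"` then submits the engine version change and the tag change in one run, which is the combination the report describes as failing.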

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

The two issues below are related, but they report the opposite symptom: there, the tags could not be updated. In this bug the tags are updated first, and it is the Redis engine version upgrade that fails.
https://github.com/hashicorp/terraform-provider-aws/issues/23219
https://github.com/hashicorp/terraform-provider-aws/issues/35952

Would you like to implement a fix?

No

justinretzolk commented 5 months ago

Potentially also related to #33859

Note: I'm triaging new issues at the moment and so haven't looked into this enough to know if it's related for certain, but ☝️ that issue and the ones you'd already linked seem suspiciously similar to my eye.

sairamkotapati-github commented 4 months ago

@justinretzolk thank you. We moved to AWS provider 5.41.0, and that got us past the original issue reported above. However, if we enable log_delivery_configuration through the dynamic block in the same apply as the engine version upgrade, we run into the error below:

updating ElastiCache Replication Group (<groupName>): InvalidParameterValue: slow-log log delivery is not available for this redis engine version.

Note: if we upgrade the engine version on its own, without enabling log delivery configuration, it works fine, and we can then enable log delivery configuration in a later apply. We are trying to find out whether both changes can be made in the same apply.
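
The two-step workaround described in the note could be sketched as follows. This is a minimal illustration, assuming the `local.create_slow_log` and `local.create_engine_log` flags from the original configuration are driven by a new input variable; the variable name `enable_log_delivery` is hypothetical and not part of the reported configuration.

```hcl
# Hypothetical switch: leave false while the engine upgrade is applied,
# then set to true in a follow-up apply once the cluster runs 7.1.
variable "enable_log_delivery" {
  type    = bool
  default = false
}

locals {
  # These locals feed the for_each of the two dynamic
  # "log_delivery_configuration" blocks in the resource above.
  create_slow_log   = var.enable_log_delivery
  create_engine_log = var.enable_log_delivery
}
```

With this in place, a first `terraform apply` performs only the engine version upgrade, and a second `terraform apply -var="enable_log_delivery=true"` adds the log delivery configuration. It does not make both changes in a single apply, which is the open question in this comment.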