resource/aws_elasticache_replication_group: Migrate from availability_zones TypeSet attribute to preferred_availability_zones TypeList attribute

moose007 commented 6 years ago

This bug is very similar to https://github.com/terraform-providers/terraform-provider-aws/pull/4741, but it affects the other ElastiCache resource ("aws_elasticache_replication_group" rather than "aws_elasticache_cluster").

provider "aws" {
    version = ">= 1.23.0"
    region = "${var.region}"
    profile = "${var.profile}"
}

I'm trying to create 2 primary shards with 1 read replica each.

cache_node_type = "cache.m4.large"
cache_node_groups = 2
cache_replicas_per_node_group = 1
cache_availability_zones = ["us-west-2a", "us-west-2b"]
cache_automatic_failover_enabled = true

resource "aws_elasticache_replication_group" "default" {
  engine                                          = "redis"
  engine_version                                  = "3.2.6"
  replication_group_id                            = "cache-env"
  replication_group_description                   = "some description"
  node_type                                       = "cache.m4.large"
  port                                            = 6379
  parameter_group_name                            = "${aws_elasticache_parameter_group.default.id}"
  subnet_group_name                               = "${aws_elasticache_subnet_group.default.name}"
  availability_zones                              = ["${var.cache_availability_zones}"]
  security_group_ids                              = ["${aws_security_group.elasticache.id}"]
  transit_encryption_enabled                      = true
  automatic_failover_enabled                      = "true"

  cluster_mode {
    replicas_per_node_group                       = "${var.replicas_per_node_group}"
    num_node_groups                               = "${var.node_groups}"
  }

  tags {
    Name = "cache-${var.env}"
  }
}

resource "aws_elasticache_parameter_group" "default" {
  name   = "cache-params-${var.env}"
  family = "redis3.2"

  parameter {
    name  = "cluster-enabled"
    value = "yes"
  }
}

I got the following error when running terraform apply:

Error creating Elasticache Replication Group: InvalidParameterCombination: PreferredCacheClusterAZs can only be specified for one node group.

Note that by excluding the availability_zones attribute from the resource, you can work around this and deploy a multi-shard Redis cluster with read replicas across multiple AZs.
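The workaround described above can be sketched as follows. This is the same resource from this report with availability_zones removed and nothing else changed in substance; it assumes the variables, parameter group, subnet group, and security group referenced here are defined elsewhere, as in the original configuration.

```hcl
# Sketch of the workaround: the resource from this report, minus availability_zones.
# Per the SDK documentation, zones default to "system chosen" when none are given.
resource "aws_elasticache_replication_group" "default" {
  engine                        = "redis"
  engine_version                = "3.2.6"
  replication_group_id          = "cache-env"
  replication_group_description = "some description"
  node_type                     = "cache.m4.large"
  port                          = 6379
  parameter_group_name          = "${aws_elasticache_parameter_group.default.id}"
  subnet_group_name             = "${aws_elasticache_subnet_group.default.name}"
  security_group_ids            = ["${aws_security_group.elasticache.id}"]
  transit_encryption_enabled    = true
  automatic_failover_enabled    = true

  # availability_zones is intentionally omitted here.

  cluster_mode {
    replicas_per_node_group = "${var.replicas_per_node_group}"
    num_node_groups         = "${var.node_groups}"
  }
}
```

The trade-off is that zone placement is left to ElastiCache, within the subnets of the selected subnet group, rather than being pinned in the configuration.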

bflad commented 6 years ago

There are two separate feature requests here, so we should clarify this issue as one or the other (creating a separate issue as necessary).

This error is returned by the ElastiCache API:

Error creating Elasticache Replication Group: InvalidParameterCombination: PreferredCacheClusterAZs can only be specified for one node group.

This occurs because the availability_zones argument is not compatible with Redis Cluster Mode Enabled replication groups that have more than 1 shard. In the ElastiCache SDK, this is the full documentation for the parameter that availability_zones sets:

    // A list of EC2 Availability Zones in which the replication group's clusters
    // are created. The order of the Availability Zones in the list is the order
    // in which clusters are allocated. The primary cluster is created in the first
    // AZ in the list.
    //
    // This parameter is not used if there is more than one node group (shard).
    // You should use NodeGroupConfiguration instead.
    //
    // If you are creating your replication group in an Amazon VPC (recommended),
    // you can only locate clusters in Availability Zones associated with the subnets
    // in the selected subnet group.
    //
    // The number of Availability Zones listed must equal the value of NumCacheClusters.
    //
    // Default: system chosen Availability Zones.
    PreferredCacheClusterAZs []*string `locationNameList:"AvailabilityZone" type:"list"`

So, to explicitly list the same availability zone more than once in availability_zones for Redis Cluster Mode Disabled or single-shard replication groups, the attribute does need to be migrated from a TypeSet (which drops duplicates and does not preserve order) to a TypeList, similar to how preferred_availability_zones was introduced for the aws_elasticache_cluster resource.
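For comparison, here is a minimal sketch of the list-typed preferred_availability_zones argument on aws_elasticache_cluster. The cluster ID, node count, port, and zone names are illustrative assumptions and do not come from this thread. The point is only that a list can repeat a zone and keeps its order, which a set cannot express.

```hcl
# Illustrative values only; not taken from this issue.
resource "aws_elasticache_cluster" "example" {
  cluster_id      = "example-memcached"
  engine          = "memcached"
  node_type       = "cache.m4.large"
  num_cache_nodes = 3
  port            = 11211
  az_mode         = "cross-az"

  # One entry per cache node. The same zone may appear more than once,
  # and the order of entries is preserved.
  preferred_availability_zones = ["us-west-2a", "us-west-2a", "us-west-2b"]
}
```

With a set-typed attribute, the two "us-west-2a" entries would collapse into one, so the number of zones would no longer match the number of nodes.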

For Redis Cluster Mode Enabled replication groups (e.g. when using cluster_mode in Terraform), we currently do not have the ability to set availability zones via the NodeGroupConfiguration parameter, which would likely require changing the cluster_mode argument.

moose007 commented 6 years ago

Thanks @bflad for that insight. Apparently I borrowed the availability_zones parameter from the aws_elasticache_cluster resource, since it's not listed in the attribute reference documentation for elasticache_replication_group.

I created a second feature request (https://github.com/terraform-providers/terraform-provider-aws/issues/5118) to support setting AZs in NodeGroupConfiguration. I personally do not need it at the moment, but I imagine it would be nice to have in some situations.

IMO, how "cheap" it would be to port over what you did for aws_elasticache_cluster to this resource should determine whether it gets done. I don't personally need it, and unfortunately I don't have any extra cycles to create a PR for it myself.

HudsonAkridge commented 6 years ago

We're also running into this issue.

rinkymangal2010 commented 6 years ago

When setting up a Redis cluster (Cluster Mode Enabled) with Terraform, there is currently no way to place it in a single availability zone, even though this can be done directly in the AWS console.

It would be great if Terraform supported this: creating the cluster nodes in the same availability zone as our servers would save data transfer costs.

grjones commented 5 years ago

Is there any workaround for this?

gdavison commented 3 years ago

As @bflad noted above, this issue combines two separate feature requests. I've created #18438 to add a corrected TypeList parameter preferred_cache_cluster_azs and deprecate availability_zones. The issue #5118 covers specifying availability zones when "Cluster Mode" is enabled.

Since we now have two specific issues to track the work separately, I'm going to close this issue.

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

resource/aws_elasticache_replication_group: Migrate from availability_zones TypeSet attribute to preferred_availability_zones TypeList attribute #5104