Open nicksbrandon opened 3 years ago
@nicksbrandon Thanks for reporting this. I can confirm the behavior you describe here.
I'm investigating this.
@nicksbrandon It seems your use case surfaced a missing validation in the creation of a replication group with cluster mode enabled. The missing validation incorrectly allows creating a cluster without replicas but with auto-failover enabled. The service team is aware of the problem and is already working on a fix. Until that's resolved, please make sure to add replicas when creating an auto-failover enabled cluster.
I understand your intention was not to use replicas, but when auto-failover is enabled, this is actually a requirement.
The use case described here actually represents an illegal state that should have never been deployed. Deployment succeeded because of a missing validation on the Elasticache service API. Keeping this issue here so we can update and resolve when a fix is available.
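For illustration only, the workaround amounts to something like the sketch below, assuming the low-level CfnReplicationGroup construct from @aws-cdk/aws-elasticache is used inside a stack (the node type, parameter group, and subnet group name are placeholders, not the reporter's configuration):

```ts
import * as elasticache from '@aws-cdk/aws-elasticache';

// Workaround until the service-side validation is fixed: give every shard at
// least one read replica so that auto-failover has a node to fail over to.
const redis = new elasticache.CfnReplicationGroup(this, 'Redis', {
  replicationGroupDescription: 'Cluster-mode-enabled Redis with replicas',
  engine: 'redis',
  cacheNodeType: 'cache.t3.micro',                         // placeholder
  cacheParameterGroupName: 'default.redis6.x.cluster.on',  // cluster mode enabled
  numNodeGroups: 1,               // one shard
  replicasPerNodeGroup: 1,        // at least one replica per shard
  automaticFailoverEnabled: true, // required when cluster mode is enabled
  cacheSubnetGroupName: 'my-subnet-group',                 // placeholder
});
```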
@iliapolo,
Many thanks for your response. It's great to hear this has uncovered a tangential issue in Elasticache.
To be clear on one point: we were unable to set automaticFailoverEnabled: false when Redis cluster mode is enabled, and it's important we use Redis in cluster mode so that we can easily scale horizontally. This is a use case that is fully supported by Elasticache, so we were expecting the same from CDK.
For reference, this issue (using automaticFailoverEnabled: false) is only reported at the cdk deploy stage.
The message is:
[redis name] Redis with cluster mode enabled cannot be created with auto failover turned off
@nicksbrandon That's right, when cluster mode is enabled you must set automaticFailoverEnabled to true, which in turn means you must enable replicas as well.
I'm not really sure what you mean by:
This is a use case that is fully supported by Elasticache so we were expecting the same from CDK
It seems that your desired configuration is simply not supported by Elasticache, as evidenced by the service team's acknowledgment of the missing validation. I think CDK (and CloudFormation) surface this problem more often because of how they operate, issuing a full update request on every resource change. With a direct CLI invocation the issue may not appear, depending on exactly which parameters you pass.
But again this is all rooted in a faulty configuration that shouldn't have been allowed to be created in the first place.
@iliapolo
Thanks for your response.
To clarify: carrying out the same action manually (via the console) and via CDK leads to differing results on the same Redis cache:
Scenario
I want to add a tag to a Redis cache with a single node and no failover (automaticFailoverEnabled: false).
2 Options:
1) Manually - The node is tagged - no interruption to service. Success.
2) Via CDK - It appears to attempt to take the node offline in order to carry out this action. Given there is no failover, this action fails.
I hope it is clear how these two approaches to carrying out the same action yield different results. Many thanks.
@iliapolo
What we would like to do is deploy a Redis cluster (i.e. cluster mode enabled = true) with N nodes and failover disabled (automaticFailoverEnabled: false), all via CDK. We are able to do this via the AWS console and can subsequently make any number of changes to such a cluster (e.g. editing billing tags) without taking the cluster offline, i.e. automatic failover is not required and replicas are not required.
It sounds like you might be saying that even this manual workflow should not be possible in the AWS console? If you could confirm/elaborate that would be appreciated.
Regards, Marc
@MarcFletcher Yes, it does sound like the console should not have allowed this configuration either. I'll let @NGL321 follow up.
This issue has not received any attention in 1 year. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
Yea, this is still a known issue for us.
We are unable to make simple changes, for example editing billing tags on Redis cache nodes via CDK, when the same change is possible via the AWS Console and CLI. To allow for easy horizontal scaling of our Elasticache Redis clusters, i.e. by adding more shards, we have cluster mode enabled. As we tolerate cold caches and performance of a single node per shard is sufficient we do not deploy read replicas. When we attempt to change the billing tags via CDK we receive the error “Replication group must have at least one read replica to enable autofailover”.
Reproduction Steps
Here is the test code (I have obscured subnets etc. in the sample)
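The original snippet isn't reproduced here, but a minimal sketch of what the first part might look like, assuming the low-level CfnReplicationGroup and CfnSubnetGroup constructs from @aws-cdk/aws-elasticache (the subnet IDs, node type, and parameter group name are placeholders, not the obscured values):

```ts
import * as cdk from '@aws-cdk/core';
import * as elasticache from '@aws-cdk/aws-elasticache';

export class RedisTestStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Subnet group for the cache nodes; the subnet IDs are placeholders.
    const subnetGroup = new elasticache.CfnSubnetGroup(this, 'RedisSubnetGroup', {
      description: 'Subnet group for the test Redis cluster',
      subnetIds: ['subnet-xxxxxxxx', 'subnet-yyyyyyyy'],
    });

    // Cluster-mode-enabled replication group (cluster-on parameter group),
    // one shard, no read replicas. automaticFailoverEnabled has to be true
    // because cluster mode is enabled; setting it to false fails at deploy
    // with "Redis with cluster mode enabled cannot be created with auto
    // failover turned off".
    const redis = new elasticache.CfnReplicationGroup(this, 'Redis', {
      replicationGroupDescription: 'Test cluster-mode-enabled Redis',
      engine: 'redis',
      cacheNodeType: 'cache.t3.micro',
      cacheParameterGroupName: 'default.redis6.x.cluster.on',
      numNodeGroups: 1,          // one shard
      replicasPerNodeGroup: 0,   // no read replicas
      automaticFailoverEnabled: true,
      cacheSubnetGroupName: subnetGroup.ref,
    });
    redis.addDependsOn(subnetGroup);
  }
}
```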
I can deploy the first part without issue. However, when I add the tagging code (sketched below) and attempt to redeploy, I get an error.
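The tagging change itself would be roughly the following, assuming the Tags API from @aws-cdk/core and the redis construct from the sketch above (the tag key and value are placeholders):

```ts
import * as cdk from '@aws-cdk/core';

// Adding a tag causes CloudFormation to issue an update to the replication
// group, which is where the failover validation is re-evaluated and fails.
cdk.Tags.of(redis).add('billing-code', 'example-value');
```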
I can tag the redis cluster using the AWS CLI without any issue (for example with aws elasticache add-tags-to-resource).
I can also manually tag in the AWS console.
What did you expect to happen?
I expected the cache to be tagged without service interruption, similar to when using the CLI.
What actually happened?
I got the following error when I added the tagging code and attempted to redeploy.
This issue is not identified by cdk synth or cdk diff.
Environment
I have tried with CDK 1.73.0 and 1.90.1 - Same result.
Here is the package.json file from the test
Other
I would like to provision a Redis cache with cluster mode enabled so we can modify the number of shards later if required. That was straightforward to deploy initially as a single shard. However, I then modified the CDK code to add tags to the Redis cache. This change failed to apply, complaining that it was unable to fail over (only one node in the replication group). I could resolve this by adding read replicas to each shard, but the function of this cache does not require a failover node and the additional cost is undesirable. Tagging the node outside of CDK does not require read replicas or for the node to be taken out of service.
I experimented with a Redis cache (cluster mode disabled). I can tag that cache with a subsequent modification to the CDK project. However, I cannot then add shards and, in the event I needed to scale, I would need to destroy the cache and recreate it with cluster mode enabled. This is again not ideal.
I understand that a single shard without a read replica cannot support failover, but it is not clear why the CDK code demands failover in order to tag the cache. Within the AWS console I can easily tag the nodes without any interruption to service.
If you could please advise what CDK configuration I am missing to make this possible that would be appreciated.
This is :bug: Bug Report