aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(dynamo-db): Error when adding a new GSI with auto-scaling to an existing DynamoDB table with more than one replica regions via CDK #23217

Open samaneh-utter opened 1 year ago

samaneh-utter commented 1 year ago

Describe the bug

Creating this new issue to continue the discussion in the closed issue https://github.com/aws/aws-cdk/issues/19083. Thank you for the detailed investigation done earlier in this comment. I respectfully disagree with https://github.com/aws/aws-cdk/issues/19083#issuecomment-1081273369 because I was able to add a new GSI with auto-scaling settings to an existing DynamoDB table with one or more replica regions using "AWS::DynamoDB::GlobalTable" or CfnGlobalTable. I think the current behavior of the Table construct is a bug.

Expected Behavior

Step 1- Create a new DynamoDB table with one or more replica regions, provisioned capacity mode, and auto-scaling enabled
Step 2- Updating the existing DynamoDB table with a new GSI that has its own auto-scaling settings should not fail

Current Behavior

Scenario 1 - success
Step 1: Create a table with one or more replica regions, and also a GSI with its own auto-scaling settings

Scenario 2 - success
Step 1: Create a table with only one replica region
Step 2: Update the existing table by adding a GSI with its own auto-scaling settings

Scenario 3 - success
Step 1: Create a table with only one replica region
Step 2: Update the existing table by adding a GSI with its own auto-scaling settings
Step 3: Update the table by adding another replica region

Scenario 4 - fail
Step 1: Create a table with only one replica region
Step 2: Update the existing table by adding a GSI with its own auto-scaling settings
Step 3: Update the table by adding another replica region
Step 4: Update the existing table, which now has two replica regions, by adding a second GSI with its own auto-scaling settings
Error message: table/<table_name>/index/<new_index_name>|dynamodb:index:WriteCapacityUnits|dynamodb already exists

Scenario 5 - success
Step 1: Create a table with more than one replica region
Step 2: Update the existing table by adding a GSI without its own auto-scaling settings (inherits the auto-scaling settings of the table)

Scenario 6 - fail
Step 1: Create a table with more than one replica region
Step 2: Update the existing table by adding a GSI without its own auto-scaling settings (inherits the auto-scaling settings of the table)
Step 3: Update the existing GSI with its own auto-scaling settings
Error message: table/<table_name>/index/<new_index_name>|dynamodb:index:WriteCapacityUnits|dynamodb already exists
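
As a reading aid for the error message in Scenarios 4 and 6: the identifier before "already exists" appears to follow the shape Application Auto Scaling uses for a scalable target, that is, a resource ID, a scalable dimension, and a service namespace joined by "|". The sketch below only splits such a string into those three parts. The class name ScalableTargetId, its methods, and the table and index names in main are hypothetical and are not part of the CDK or the AWS SDK.

```java
/** Reading aid (hypothetical helper): splits "resourceId|scalableDimension|serviceNamespace". */
public final class ScalableTargetId {
  final String resourceId;
  final String scalableDimension;
  final String serviceNamespace;

  private ScalableTargetId(String resourceId, String scalableDimension, String serviceNamespace) {
    this.resourceId = resourceId;
    this.scalableDimension = scalableDimension;
    this.serviceNamespace = serviceNamespace;
  }

  /** Parses an identifier such as the one quoted in the error message. */
  static ScalableTargetId parse(String id) {
    String[] parts = id.split("\\|");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected three '|'-separated parts: " + id);
    }
    return new ScalableTargetId(parts[0], parts[1], parts[2]);
  }

  /** Returns the index name when the resource ID is table/<table>/index/<index>, else null. */
  String indexName() {
    String[] segments = resourceId.split("/");
    boolean isIndex = segments.length == 4
        && segments[0].equals("table")
        && segments[2].equals("index");
    return isIndex ? segments[3] : null;
  }

  public static void main(String[] args) {
    // Example values are assumptions taken from the reproduction code's naming.
    ScalableTargetId id = parse(
        "table/TestTable/index/TestIndex2|dynamodb:index:WriteCapacityUnits|dynamodb");
    System.out.println("resource ID:        " + id.resourceId);
    System.out.println("scalable dimension: " + id.scalableDimension);
    System.out.println("service namespace:  " + id.serviceNamespace);
    System.out.println("index name:         " + id.indexName());
  }
}
```

Read this way, the failing deployments are reporting that a write-capacity scalable target for the new index is already registered.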

Reproduction Steps

Step 1- Create a new DynamoDB table with two replica regions, provisioned capacity mode, and auto-scaling enabled using the code below
Step 2- Update the existing DynamoDB table with a new GSI which has its own auto-scaling settings by uncommenting the code for step 2

import software.amazon.awscdk.Stack;
import software.amazon.awscdk.StackProps;
import software.amazon.awscdk.services.dynamodb.*;
import software.constructs.Construct;

import java.util.List;

public class CdkStack extends Stack {
  public CdkStack(final Construct scope, final String id) {
    this(scope, id, null);
  }

  public CdkStack(final Construct scope, final String id, final StackProps props) {
    super(scope, id, props);

    // create DynamoDB table - step 1
    Table globalTable = Table.Builder.create(this, "TestTable")
        .tableName("TestTable")
        .partitionKey(Attribute.builder().name("pk").type(AttributeType.STRING).build())
        .sortKey(Attribute.builder().name("sk").type(AttributeType.STRING).build())
        // test with one or more replica regions
        .replicationRegions(List.of("eu-north-1", "eu-west-1")) 
        .billingMode(BillingMode.PROVISIONED)
        .build();
    globalTable.autoScaleWriteCapacity(EnableScalingProps.builder()
            .minCapacity(10)
            .maxCapacity(50)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(75)
            .build());
    globalTable.autoScaleReadCapacity(EnableScalingProps.builder()
            .minCapacity(10)
            .maxCapacity(50)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(70)
            .build());
    // Optional: shows that a GSI with its own auto-scaling settings can be created
    // together with the table, even when the table has one or more replica regions
    GlobalSecondaryIndexProps globalSecondaryIndexProps = GlobalSecondaryIndexProps.builder()
        .indexName("TestIndex1")
        .partitionKey(Attribute.builder().name("sk").type(AttributeType.STRING).build())
        .sortKey(Attribute.builder().name("pk").type(AttributeType.STRING).build())
        .projectionType(ProjectionType.KEYS_ONLY)
        .build();
    globalTable.addGlobalSecondaryIndex(globalSecondaryIndexProps);
    globalTable.autoScaleGlobalSecondaryIndexReadCapacity("TestIndex1", EnableScalingProps.builder()
            .minCapacity(5)
            .maxCapacity(50)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(75)
            .build());
    globalTable.autoScaleGlobalSecondaryIndexWriteCapacity("TestIndex1",EnableScalingProps.builder()
            .minCapacity(5)
            .maxCapacity(50)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(75)
            .build());

    /*
    // Step 2 - add a GSI to the existing table
    // this step fails when the table has more than one replica region
    GlobalSecondaryIndexProps globalSecondaryIndexProps2 = GlobalSecondaryIndexProps.builder()
        .indexName("TestIndex2")
        .partitionKey(Attribute.builder().name("sk").type(AttributeType.STRING).build())
        .sortKey(Attribute.builder().name("pk").type(AttributeType.STRING).build())
        .projectionType(ProjectionType.KEYS_ONLY)
        .build();
    globalTable.addGlobalSecondaryIndex(globalSecondaryIndexProps2);
    globalTable.autoScaleGlobalSecondaryIndexReadCapacity("TestIndex2", EnableScalingProps.builder()
            .minCapacity(5)
            .maxCapacity(70)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(80)
            .build());
    globalTable.autoScaleGlobalSecondaryIndexWriteCapacity("TestIndex2",EnableScalingProps.builder()
            .minCapacity(5)
            .maxCapacity(50)
            .build())
        .scaleOnUtilization(UtilizationScalingProps.builder()
            .targetUtilizationPercent(70)
            .build());
    */
  }
}

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.53.0

Framework Version

No response

Node.js Version

v19.0.1

OS

macOS

Language

Java

Language Version

No response

Other information

No response

peterwoodworth commented 1 year ago

Given that the last two comments on the linked issue both describe this as a service issue rather than a CDK issue, I'm inclined to believe this is still the case. Additionally, as seen in that issue, the same CDK code succeeding or failing depending on the current state of the stack (i.e. update or create) strongly suggests a service team bug. Templates are declarative and shouldn't rely on previous state, which makes this a Cfn bug.

> I respectfully disagree with https://github.com/aws/aws-cdk/issues/19083#issuecomment-1081273369 because I was able to add a new GSI with auto-scaling settings to an existing DynamoDB table with one or more replica regions using "AWS::DynamoDB::GlobalTable" or CfnGlobalTable.

Can you expand upon this with a full example please?

samaneh-utter commented 1 year ago

Please find the attached examples.zip:

  1. creating a GlobalTable with two replica regions and a GSI using CloudFormation (cfn-step-1.json)
  2. adding a new GSI to an existing GlobalTable with two replica regions and an existing GSI (cfn-step2.json)
  3. creating a GlobalTable with two replica regions and a GSI (CdkCfnStack.java)

From what I can see, Table uses CfnTable, and a custom resource is used to add each replica region. I suggest basing the Table construct on CfnGlobalTable instead of CfnTable. CfnGlobalTable allows creating a table with a single replica region or multiple replica regions. Also, a new GSI with its own auto-scaling settings can be added to an existing table.

I hope these examples clarify things. If there are any other questions, please let me know.

peterwoodworth commented 1 year ago

Yes, this does clear things up. Thanks for the elaboration 🙂

Since you're making use of the CfnGlobalTable, that would be considered a workaround to what is at root a CloudFormation/DynamoDB issue. This should work with our L2 Table construct; it's just that CloudFormation rejects it for some reason. Rico's comment in the previous issue describes exactly why the error is occurring, and I've confirmed that on my end as well.

Given this, I will keep this issue open for tracking and so that people know there's a workaround with CfnGlobalTable. I've reported this to CloudFormation internally (P76996364) and will provide updates when they become available.

rix0rrr commented 1 year ago

This is not a needs-cfn issue. AWS::DynamoDB::Table has a bug which has been fixed in AWS::DynamoDB::GlobalTable. We will need to add support for GlobalTable.

rix0rrr commented 11 months ago

This issue was for the existing Table construct, which used custom resources to implement table replication. We no longer recommend the use of the Table construct.

Instead, the TableV2 construct was released in 2.95.1 (#27023). It maps to the AWS::DynamoDB::GlobalTable resource, has better support for replication, and does not suffer from the issue described here.
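
For orientation, here is a minimal sketch of how the reproduction table from this issue might look with TableV2. It is not a tested migration. It assumes aws-cdk-lib 2.95.1 or later, a stack deployed to eu-north-1 with eu-west-1 as the only replica (the primary region is a guess, since the issue does not state where the stack was deployed), and it reuses the key, index, and capacity values from the reproduction code. The class name TableV2Stack and the helper autoscaled are made up for this example.

```java
import software.amazon.awscdk.App;
import software.amazon.awscdk.Environment;
import software.amazon.awscdk.Stack;
import software.amazon.awscdk.StackProps;
import software.amazon.awscdk.services.dynamodb.Attribute;
import software.amazon.awscdk.services.dynamodb.AttributeType;
import software.amazon.awscdk.services.dynamodb.AutoscaledCapacityOptions;
import software.amazon.awscdk.services.dynamodb.Billing;
import software.amazon.awscdk.services.dynamodb.Capacity;
import software.amazon.awscdk.services.dynamodb.GlobalSecondaryIndexPropsV2;
import software.amazon.awscdk.services.dynamodb.ProjectionType;
import software.amazon.awscdk.services.dynamodb.ReplicaTableProps;
import software.amazon.awscdk.services.dynamodb.TableV2;
import software.amazon.awscdk.services.dynamodb.ThroughputProps;
import software.constructs.Construct;

import java.util.List;

public class TableV2Stack extends Stack {
  public TableV2Stack(final Construct scope, final String id, final StackProps props) {
    super(scope, id, props);

    TableV2.Builder.create(this, "TestTable")
        .partitionKey(Attribute.builder().name("pk").type(AttributeType.STRING).build())
        .sortKey(Attribute.builder().name("sk").type(AttributeType.STRING).build())
        // provisioned billing with auto-scaled read and write capacity
        .billing(Billing.provisioned(ThroughputProps.builder()
            .readCapacity(autoscaled(10, 50, 70))
            .writeCapacity(autoscaled(10, 50, 75))
            .build()))
        // replicas are listed in addition to the stack's own region (assumed eu-north-1)
        .replicas(List.of(ReplicaTableProps.builder().region("eu-west-1").build()))
        // each GSI carries its own auto-scaling settings; a second index
        // would be one more entry in this list
        .globalSecondaryIndexes(List.of(
            GlobalSecondaryIndexPropsV2.builder()
                .indexName("TestIndex1")
                .partitionKey(Attribute.builder().name("sk").type(AttributeType.STRING).build())
                .sortKey(Attribute.builder().name("pk").type(AttributeType.STRING).build())
                .projectionType(ProjectionType.KEYS_ONLY)
                .readCapacity(autoscaled(5, 50, 75))
                .writeCapacity(autoscaled(5, 50, 75))
                .build()))
        .build();
  }

  // Hypothetical helper to keep the capacity settings readable.
  private static Capacity autoscaled(int min, int max, int targetPercent) {
    return Capacity.autoscaled(AutoscaledCapacityOptions.builder()
        .minCapacity(min)
        .maxCapacity(max)
        .targetUtilizationPercent(targetPercent)
        .build());
  }

  public static void main(String[] args) {
    App app = new App();
    // TableV2 replicas need a stack with a concrete region, hence the explicit env.
    new TableV2Stack(app, "TableV2Stack", StackProps.builder()
        .env(Environment.builder().region("eu-north-1").build())
        .build());
    app.synth();
  }
}
```

Synthesizing this stack should produce a single AWS::DynamoDB::GlobalTable resource rather than a table plus replica custom resources. Whether a later GSI addition deploys cleanly in a given account still needs to be checked against a real deployment.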


Be aware that there are additional deployment steps involved in a migration from Table to TableV2. You need to do a RETAIN deployment, a delete deployment, then change the code to use TableV2 and then use cdk import. A link to a full guide will be posted once it is available.

Here are some other resources to get you started (using CfnGlobalTable instead of TableV2) if you want to get going on the migration: