aws-cloudformation / cloudformation-coverage-roadmap

The AWS CloudFormation Public Coverage Roadmap
https://aws.amazon.com/cloudformation/
Creative Commons Attribution Share Alike 4.0 International

AWS::DynamoDB::Table GlobalSecondaryIndex Support for Multiple GSIs #229

Open thesilentg opened 4 years ago

thesilentg commented 4 years ago

2. Scope of request

Cloudformation cannot perform more than one DynamoDB GSI creation or deletion in a single update. Instead, users must add/delete a single GSI per update, and then manually create a new update for each GSI which needs to be added. For teams which have Cloudformation run as part of a deployment pipeline, this is even more bothersome as it necessitates waiting for multiple end-to-end runs of the entire deployment pipeline for each GSI that needs to be created/deleted.

Furthermore, once a CloudFormation stack includes a DynamoDB table with more than one GSI, it is no longer possible for someone to re-use this stack without manually editing it and going through the one-at-a-time GSI update process. This issue seems to be in direct contrast with the advertised goals of CloudFormation.

https://aws.amazon.com/cloudformation/

AWS CloudFormation provisions your resources in a safe, repeatable manner, allowing you to build and rebuild your infrastructure and applications, without having to perform manual actions or write custom scripts.

3. Expected behavior

  1. Update operations to existing DynamoDb tables succeed even when adding/deleting multiple GSIs
  2. Creating a new DynamoDb table using a CloudFormation stack that includes multiple GSIs succeeds

4. Suggest specific test cases

  1. Creating a new DynamoDb table with two GSIs is able to complete successfully
  2. Adding two GSIs in a single update to an existing table is able to complete successfully
  3. Deleting a DynamoDb table with two GSIs is able to complete successfully
  4. Removing two GSIs in a single update to an existing table is able to complete successfully
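
As an illustration of test case 1, here is a minimal sketch that builds a template containing a single table with two GSIs. The logical ID `ExampleTable`, the attribute names and the index names are assumptions made up for this example; they do not come from the thread.

```python
import json


def table_with_gsis(index_keys):
    """Build a minimal AWS::DynamoDB::Table resource with one GSI per key name.

    The attribute names passed in are illustrative placeholders.
    """
    attribute_names = ["pk"] + list(index_keys)
    return {
        "Type": "AWS::DynamoDB::Table",
        "Properties": {
            "BillingMode": "PAY_PER_REQUEST",
            # Every attribute used in a key schema must be declared here.
            "AttributeDefinitions": [
                {"AttributeName": name, "AttributeType": "S"}
                for name in attribute_names
            ],
            "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
            "GlobalSecondaryIndexes": [
                {
                    "IndexName": f"{key}-index",
                    "KeySchema": [{"AttributeName": key, "KeyType": "HASH"}],
                    "Projection": {"ProjectionType": "ALL"},
                }
                for key in index_keys
            ],
        },
    }


template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {"ExampleTable": table_with_gsis(["email", "status"])},
}
print(json.dumps(template, indent=2))
```

Per the discussion below, a template like this is expected to succeed when the table is created from scratch; the reported failure concerns updates to an existing table.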

5. Helpful Links to speed up research and evaluation

  • https://stackoverflow.com/questions/28402003/creating-multiple-gsis-by-updatetable-dynamodb
  • https://stackoverflow.com/questions/47920665/how-to-delete-more-than-one-global-secondary-index-from-the-cloudformation-at-on
  • Managing Global Secondary Indexes documentation

6. Category

DB: DynamoDb

PatMyron commented 4 years ago

Creating and deleting multiple DynamoDB tables with multiple Global Secondary Indexes in one operation is possible in CloudFormation already:

ExampleTemplate.yaml


Difficult to guarantee multiple Global Secondary Index additions in one CloudFormation update for any existing DynamoDB table because:

  1. CloudFormation uses temporary credentials to provision resources
  2. DynamoDB Global Secondary Index creation for existing tables is performed serially
  3. DynamoDB Global Secondary Index creation time has no guarantees:

Adding a Global Secondary Index to a Large Table

The time required for building a global secondary index depends on several factors, such as the following:

  • The size of the table
  • The number of items in the table that qualify for inclusion in the index
  • The number of attributes projected into the index
  • The provisioned write capacity of the index
  • Write activity on the main table during index builds

If you are adding a global secondary index to a very large table, it might take a long time for the creation process to complete.


GSI deletion should be faster, so it would be possible for CloudFormation to support multiple GSI operations on existing tables when at most one of the operations is a GSI creation.

benkehoe commented 4 years ago

Will the new resource framework help solve this, given that it has dedicated support for long-running operations?

PatMyron commented 4 years ago

New framework still has time limits for the foreseeable future, right @rjlohan? I don't think there are guarantees on GSI creation time, so that could still be problematic even with longer time limits

benbridts commented 4 years ago

if I understand it correctly, there is a PR (https://github.com/aws-cloudformation/aws-cloudformation-resource-schema/pull/61) open that would give you a maximum execution time of 12 hours (https://github.com/aws-cloudformation/aws-cloudformation-resource-schema/pull/61/files#diff-c22e96ee3efd905ffecd2b8dcad5c260R23)

rjlohan commented 4 years ago

The new framework will still have some time limit for the foreseeable future, right @rjlohan? I don't think there are any guarantees on GSI creation time, so that case could still be problematic even with a longer time limit

Correct. At some point, we are always going to be bound by the maximum session duration on credentials which CloudFormation assumes from the customer. For private types in the new framework, that maximum is 12 hours, bound by the maximum duration we can request in an sts:AssumeRole call.

For AWS-owned public types, we use a different authorization mechanism which lets us run for a maximum of 36 hours. Anything longer, or transient, would require a new authZ model inside CloudFormation itself, which is a bigger ask. There are ways to achieve this, but they will require some investment.

benkehoe commented 4 years ago

I guess when I think about the authentication part of this, which is that the user has an authenticated session that expires at some specific time, I balance that against the purpose of CloudFormation, which is to create persistent resources. It's often going to be the case that I can create resources with CloudFormation that can carry out my session's capability beyond my session expiration anyway.

luiseduardocolon commented 4 years ago

We will likely not pursue this in the short- or medium-term. The API specifically expects to perform only one single creation or deletion per update, and we believe we need to support it as the API dictates it. I'll close it for now - I believe that if we ever support this it will be if or when the underlying DynamoDB API supports it.

et304383 commented 4 years ago

This should be reopened. All AWS needs to do is create a stand-alone resource for GSIs on an existing table. Then, we could chain them together with a bunch of depends on calls.

Creating and deleting multiple DynamoDB tables with multiple Global Secondary Indexes in one operation is possible in CloudFormation already:

ExampleTemplate.yaml

I can't believe someone from the CFN team suggested this as an example. The issue is clearly about defining 2 or more GSIs on a single table and you give a template with two table resources each with one GSI.

gelpenaddict commented 4 years ago

I understand that DynamoDB's API allows only 1 GSI but isn't the purpose of CloudFormation to automate deployments, including serializing such API calls to other services?
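
To make the idea of serializing the calls concrete, here is a minimal sketch that turns a multi-GSI change into a sequence of single-operation payloads shaped like the `GlobalSecondaryIndexUpdates` parameter of DynamoDB's UpdateTable. The function name `plan_gsi_updates` and the index names are hypothetical, and nothing here describes how CloudFormation is actually implemented. It makes no AWS calls, and it ignores indexes whose definition changes under the same name.

```python
def plan_gsi_updates(current, desired):
    """Return one UpdateTable-style payload per GSI change, deletions first.

    `current` and `desired` map index name -> index definition (a dict with
    KeySchema and Projection). Each payload holds exactly one Create or Delete.
    A real caller would also pass AttributeDefinitions for new key attributes
    and wait for each index to become ACTIVE before sending the next payload.
    """
    steps = []
    for name in sorted(set(current) - set(desired)):
        steps.append(
            {"GlobalSecondaryIndexUpdates": [{"Delete": {"IndexName": name}}]}
        )
    for name in sorted(set(desired) - set(current)):
        create = {"IndexName": name, **desired[name]}
        steps.append({"GlobalSecondaryIndexUpdates": [{"Create": create}]})
    return steps


def hash_index(attribute):
    """Illustrative index definition keyed on a single attribute."""
    return {
        "KeySchema": [{"AttributeName": attribute, "KeyType": "HASH"}],
        "Projection": {"ProjectionType": "ALL"},
    }


current = {"by-email": hash_index("email")}
desired = {"by-status": hash_index("status"), "by-date": hash_index("date")}
steps = plan_gsi_updates(current, desired)
for step in steps:
    print(step)
```

One delete and two creates become three payloads, each of which respects the one-operation-per-update limit.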

thesilentg commented 3 years ago

Coming back to this almost a year later, after we got hit by the same issue again.

The API specifically expects to perform only one single creation or deletion per update, and we believe we need to support it as the API dictates it.

CloudFormation doesn't expose any API; it provides a declarative syntax for defining which resources you want to exist in your AWS account. I'm assuming you're talking about the Dynamo API. When I define an AWS::DynamoDB::Table in CloudFormation, the intent is not to say "Please call CreateTable for me"; it is instead to say "I want a DynamoDb table to exist matching this configuration, do whatever it takes on my behalf to make it so". To users of CloudFormation, the specifics of what AWS API calls need to get invoked to spin up the resources are an internal implementation detail of CloudFormation which is deliberately abstracted from the user. If I wanted to be invoking CreateTable, I wouldn't be writing CloudFormation (I'd be calling CreateTable myself). As a declarative tool, CloudFormation makes no statements about how resources come into being, merely what the end state will look like. A DynamoDb table with multiple GSIs is a completely rational end state, therefore I believe it is CloudFormation's duty to translate my intention into existence, regardless of what internal API calls are required to make it so.

Furthermore, I believe that the validity of a Cloudformation file should be path-independent. That is to say, a given Cloudformation template should be valid (syntactically correct + able to be applied) or invalid (not syntactically correct or not able to be applied) without any ambiguity. To give an example, let's consider five different Cloudformation templates (A - E).

Cloudformation Templates:

A: Empty Template (no resources)
B: Table with no GSIs
C: Table with one GSI
D: Table with two GSIs
E: Table with three GSIs

Then let's consider the possible transitions you can take when modifying the CloudFormation template and applying the CloudFormation update:

Transitions:

A -> B: Valid
A -> C: Valid
A -> D: Valid
A -> E: Valid
B -> C: Valid
B -> D: Invalid (trying to add two GSIs)
B -> E: Invalid (trying to add three GSIs)
C -> D: Valid
C -> E: Invalid (trying to add two GSIs)
D -> E: Valid

Now let me pose the following question: Is Cloudformation template D valid or invalid? The answer is of course: "It depends on what state your AWS infrastructure is in already". If you already have a table defined (B), the Cloudformation template D isn't valid. But if you don't have any table defined in your AWS account (A) or you have a table with one GSI already (C), then Cloudformation template D is perfectly valid.

If I have to know that in order to get from current state B to end state D that I first must take a detour through state C, this is a problem.
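
The path dependence described above can be modelled in a few lines. This is a sketch of the rule as reported in this thread (at most one GSI creation or deletion per update to an existing table), not CloudFormation's actual validation logic; the index names `gsi1` to `gsi3` are placeholders.

```python
# Templates A-E from the comment above. None means "no table exists".
TEMPLATES = {
    "A": None,
    "B": frozenset(),
    "C": frozenset({"gsi1"}),
    "D": frozenset({"gsi1", "gsi2"}),
    "E": frozenset({"gsi1", "gsi2", "gsi3"}),
}


def can_apply(current, target):
    """Return True if moving from `current` to `target` is a single valid update."""
    if current is None or target is None:
        # Assumption: creating or deleting the whole table is not restricted.
        return True
    # The symmetric difference counts GSIs that must be created or deleted.
    return len(current ^ target) <= 1


for src, dst in ["AD", "BC", "BD", "CD", "CE", "DE"]:
    verdict = "Valid" if can_apply(TEMPLATES[src], TEMPLATES[dst]) else "Invalid"
    print(f"{src} -> {dst}: {verdict}")
```

Template D appears as the target of a valid transition from A and C and of an invalid one from B, which is the ambiguity the comment objects to.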

thesilentg commented 3 years ago

It may be difficult to guarantee multiple Global Secondary Index additions in one CloudFormation update for any existing DynamoDB table because:

  1. CloudFormation uses temporary credentials to provision resources
  2. DynamoDB Global Secondary Index creation for existing tables must be performed serially
  3. DynamoDB Global Secondary Index creation time has no guarantees:

While I understand that creating a GSI is not a bounded operation, and the credentials have a maximum timeout, why can't CloudFormation create as many GSIs as possible within the allowed time limit? Let's say the maximum credential session is 12 hours. If I try to create 5 different GSIs on a table in a single CloudFormation update and each one takes three hours, I'm fine with this failing. But if I try to create three GSIs, why can't all three be updated successfully? What problems does "try to create as many GSIs as possible" cause? @luiseduardocolon @PatMyron please see previous comment as well.
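
The arithmetic in this comment can be sketched as follows, assuming serial index builds and the 12-hour session limit mentioned earlier in the thread. The function name `gsis_that_fit` and the three-hour build times are illustrative; real build times are not known in advance.

```python
def gsis_that_fit(build_hours, session_limit_hours=12.0):
    """Count how many serial GSI builds finish before the session expires.

    Returns (number of completed builds, hours elapsed).
    """
    elapsed = 0.0
    completed = 0
    for hours in build_hours:
        if elapsed + hours > session_limit_hours:
            break  # the next build would outlive the credentials
        elapsed += hours
        completed += 1
    return completed, elapsed


print(gsis_that_fit([3, 3, 3, 3, 3]))  # (4, 12.0): the fifth build does not fit
print(gsis_that_fit([3, 3, 3]))        # (3, 9.0): all three fit
```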

benbridts commented 3 years ago

What problems does "try to create as many GSIs as possible" cause?

Some problems:

  • Rollbacks will be even a bigger pain.
  • Applying the same change on two stacks might cause a failure on one (because it is a big table), while it works on the other. Deployments that fail unpredictably are worse than those that always fail.

If I have to know that in order to get from current state B to end state D that I first must take a detour through state C, this is a problem.

I do agree with this, but with some extra remarks:

thesilentg commented 3 years ago

Rollbacks will be even a bigger pain.

Will they? My (admittedly naive) understanding is that rollbacks (GSI deletions) will be much faster than rollouts (GSI creations). Therefore, it shouldn't be possible for a stack to get stuck in a bad state, since even if the GSI creation fails, the rollback will be able to succeed.

Applying the same change on two stacks might cause a failure on one (because it is a big table), while it works on the other. Deployments that fail unpredictably are worse than those that always fail.

I'll partially concede this point, although I kind of consider "Deployments that fail unpredictably are worse than those that always fail." to be the situation we have today. When adding additional GSIs to a CloudFormation template, a developer's first deployment will lead to an unpredicted failure until they learn about the single GSI creation restrictions.

This twitter thread by @benkehoe is very relevant (see "stage 3"):

This is useful context.

My end goal here is making it so that developers are able to reliably deploy infrastructure with the minimal amount of manual overhead. What needs to occur in order for this goal to be realized with regard to the ability to update multiple DynamoDB GSIs in a single CloudFormation template change?

Do I need to push on the Dynamo team to provide an API with support for multiple GSI updates? (that Cloudformation can later consume)

benbridts commented 3 years ago

[...] rollbacks (GSI deletions) will be much faster than rollouts (GSI creations) [...]

I forgot about the asymmetry there. It will still take a long time before a rollback starts, and it will be non-obvious that the solution is doing the change with one GSI at a time. But it is less of a pain than what I thought. I'd still be wary of possible edge cases (what happens if something else fails together with a delete of an index? ...)

Do I need to push on the Dynamo team to provide an API with support for multiple GSI updates? (that Cloudformation can later consume)

I don't work for AWS, but from https://github.com/aws-cloudformation/aws-cloudformation-coverage-roadmap/issues/229#issuecomment-573243450, that seems to be the most effective way to get this changed.

nubpro commented 3 years ago

We will likely not pursue this in the short- or medium-term. The API specifically expects to perform only one single creation or deletion per update, and we believe we need to support it as the API dictates it. I'll close it for now - I believe that if we ever support this it will be if or when the underlying DynamoDB API supports it.

Now how do we exactly push DynamoDB team to support this API?

gelpenaddict commented 3 years ago

https://aws.amazon.com/cloudformation/ Should probably also update this

CloudFormation template describes your desired resources and their dependencies so you can launch and configure them together as a stack

accordingly to something like

CloudFormation template is like a script calling other AWS APIs; it does not describe desired resources and their dependencies so they can be launched as a stack

IsaiahJTurner commented 3 years ago

I have proposed an improved solution to this issue over at https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/881. If you are experiencing this issue with DynamoDB + CloudFormation (as the 50+ people who upvoted this post are), take a look at that proposal and give it some love if it would help you too, so it can be appropriately prioritized.

paul-uz commented 2 years ago

Is there any update on this being supported?

kuda1992 commented 2 years ago

I can't believe this is not fixed yet, this is certainly a deal breaker.

clouddev-code commented 2 years ago

In the case of updating multiple GSIs, the Cloudformation stack needs to be run with separate change sets, so I'd like to be able to change them all at once.

ahurlburt commented 1 year ago

Just bumping this. Lack of multiple GSI updates really hurts dev time. Current workaround is to update one, deploy, wait (takes 15-20 minutes for our build to build and deploy), repeat.

Then repeat this per environment again

The multiple-environment case is particularly problematic: if we have a working staging build, then when we go to production we need to modify the CloudFormation template (that was tested in stage) to start deleting indices and adding them back in one by one after the deploy...

It really is cumbersome. Technically it can be worked around but I started using CloudFormation assuming it was going to speed development not hinder it. It wouldn't be as big of a deal if the deploys didn't take so long.
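
The workaround described above can be sketched as a generator that yields the set of GSI names to put in the template for each pipeline run, one change per run. The function name `staged_index_sets`, the index names, the count of three environments and the 20-minute run time are assumptions for illustration only.

```python
def staged_index_sets(deployed, target):
    """Yield the GSI name set to deploy at each pipeline run, one change per run.

    Deletions are staged first, then creations, in alphabetical order.
    """
    state = set(deployed)
    for name in sorted(state - set(target)):
        state = state - {name}
        yield frozenset(state)
    for name in sorted(set(target) - state):
        state = state | {name}
        yield frozenset(state)


stages = list(staged_index_sets({"old-index"}, {"by-email", "by-status"}))
for run, names in enumerate(stages, start=1):
    print(f"run {run}: {sorted(names)}")

# Rough cost estimate under the stated assumptions.
environments = 3
minutes_per_run = 20
total_minutes = len(stages) * environments * minutes_per_run
print(f"{len(stages)} runs x {environments} environments = {total_minutes} minutes")
```

Replacing one index with two others takes three runs per environment, so roughly three hours of pipeline time across three environments under these assumptions.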

paul-uz commented 1 year ago

I still fail to see the technical reasoning for why this can't be done? Surely there isn't one?

I'd happily forfeit a bit of additional deployment time if it meant it worked.

bfbenf commented 1 year ago

I am also facing the same issue and, like @paul-uz, fail to see why this cannot be handled by CloudFormation. I am using CDK to produce our CloudFormation, and commenting out GSIs makes our tests fail in the pipeline and makes deploying tables really cumbersome.

PeterBaker0 commented 1 year ago

Adding another voice to the consensus above - this completely violates the normal CDK -> CloudFormation -> Pipeline workflow. I think the illustrative and detailed example above regarding the definition of "valid" and "invalid" is spot on - unless absolutely unavoidable, the validity of a template should not depend on the current state of the infrastructure.

wz2b commented 1 year ago

We will likely not pursue this in the short- or medium-term. The API specifically expects to perform only one single creation or deletion per update, and we believe we need to support it as the API dictates it.

You're right that UpdateTable supports only one GSI creation or deletion per update, but CreateTable supports up to 20. So I think something might be awry in CloudFormation.

ADanielLitify commented 1 year ago

Three and a half years later and this is still an issue. My team is using CDK to deploy our application stack, and due to the size of the team and rapid development, there are often many updates between when a single developer deploys the stack to their personal dev environment. Adding them one at a time often isn't realistic.

christophermichaelthomasmillar commented 1 year ago

The lack of support for something so fundamental really speaks against using DynamoDB with CloudFormation in general. I feel like in all the cloud hype we forgot the difference between good software and bad and now we're left SOL. Like a Branch Davidian with a stomach ache I'm beginning to question my life choices in joining this cargo cult.

paul-uz commented 11 months ago

@mobob you can create a table with 5 GSIs, you just can't add 4 GSIs to a table that already has one, in one go. Utter nonsense

mobob commented 11 months ago

@mobob you can create a table with 5 GSIs, you just can't add 4 GSIs to a table that already has one, in one go. Utter nonsense

Right - thanks for the clarification.

+1 to getting some solution that allows incremental GSI changes to dynamo db.

ahurlburt commented 11 months ago

yea this is a huge pain to do a bunch of deployments to add 1 at a time, especially since deployments take so long.

This thread has a huge amount of likes, maybe eventually a PM will take notice.

Ccz-Chen commented 10 months ago

Any update/progress on this issue? We have to run N deployments in order to push a feature to the public if the feature uses N GSIs, because of this error message:

"Resource handler returned message: "Cannot perform more than one GSI creation or deletion in a single update" (RequestToken: 4c2ffec4-a684-dac1-0478-6cc98d09c044, HandlerErrorCode: InvalidRequest)"

BryanCrotaz commented 10 months ago

The best way for CF to fix this is to add a new resource type for Dynamo GSI that can be attached to a { Ref: DynamoTable }. Then CF can know to run them sequentially, and if that will take too long for the credentials, start a step function behind the scenes that runs each one, then the rest of the script.
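
A sketch of what this proposal might look like in a template follows. The resource type `Hypothetical::DynamoDB::GlobalSecondaryIndex` and its properties are invented for illustration and do not exist in CloudFormation; only `Ref` and the `DependsOn` attribute are real template features. Chaining each index to the previous one with `DependsOn` would force the indexes to be created one after another.

```python
import json


def chained_gsi_resources(table_logical_id, index_names):
    """Build one hypothetical GSI resource per index, chained with DependsOn."""
    resources = {}
    previous = None
    for name in index_names:
        logical_id = f"{table_logical_id}{name}Index"
        resource = {
            # Not a real resource type: a placeholder for the proposal.
            "Type": "Hypothetical::DynamoDB::GlobalSecondaryIndex",
            "Properties": {
                "TableName": {"Ref": table_logical_id},
                "IndexName": name,
            },
        }
        if previous is not None:
            # Each index waits for the one before it.
            resource["DependsOn"] = previous
        resources[logical_id] = resource
        previous = logical_id
    return resources


resources = chained_gsi_resources("DynamoTable", ["ByEmail", "ByStatus", "ByDate"])
print(json.dumps(resources, indent=2))
```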

soplan commented 6 months ago

Yes feeling the same pain here. We have 3 different stages. Usually on Dev we deploy multiple times a day so we are not hindered by this issue. But once we are done testing and ready to merge to staging we are stuck because of this issue. We want to do CI/CD without headaches.

leonardogazdek commented 5 months ago

Ditched CloudFormation entirely because of this. For me this pretty much defeats the whole purpose of CloudFormation with DynamoDB.

paul-uz commented 5 months ago

Ditched CloudFormation entirely because of this. For me this pretty much defeats the whole purpose of CloudFormation with DynamoDB.

What did you replace CFN with?