dougmoscrop / serverless-plugin-split-stacks

A plugin to generate nested stacks to get around CloudFormation resource/parameter/output limits

Discussions on custom strategy #86

Open vicary opened 4 years ago

vicary commented 4 years ago

Not sure if this is the right place for discussions; please let me know if there is a better venue.

We have been testing a few custom strategies since hitting the 200-resource and 60-parameter/output limits in various places. We built simple counting logic that splits off sub-stacks with a sequential suffix for every 50 items, e.g. LogGroup0, LogGroup1, etc.

But it appears that existing stack items, in our case one of the log groups, cannot be migrated between sub-stacks: we get errors saying the same log group already exists in an earlier sub-stack.

It appears that a good strategy should do one of the following:

  1. Always assign the same logical ID to the same destination sub-stack, though that may still hit the 60-parameter limit.
  2. Somehow know whether a stack item already exists in a sub-stack, so it can assign new items to other sub-stacks while leaving old ones untouched.

Just want to know if I am heading in the right direction; I could really use some advice here.

dougmoscrop commented 4 years ago

Yes, that is how this plugin is supposed to work: it first checks whether a resource has already been migrated, and if it has, it keeps that migration. You can use force: true on a custom migration, but that can still fail, cause data loss, etc., so it has to be done carefully.

dougmoscrop commented 4 years ago

Oh, it does so using logical IDs, so if this is happening:

first deploy:

someResourceABCDEFKAJSDKAJZ:
  name: 'foo'

second deploy:

someResourceA1234718237128:
  name: 'foo'

That would fail if name has to be unique, but it would fail with plain CloudFormation too, not just this plugin.

vicary commented 4 years ago

Yes, diving into the core logic showed that the code tries really hard to do what you described.

My team failed to figure out which edge case moved the log group to another sub-stack, so we ended up doing a full backup-restore-redeploy process that took two whole days of downtime.

We are now sticking with the default splitting logic, and every time we add more resources, we pray.

dougmoscrop commented 4 years ago

I'm sorry to hear about your downtime! :(

I really wish there were an easier way to work with all this; in fact, I believe this should be entirely internal logic in CloudFormation and not something AWS users have to put up with.

I recommend using my bootstrap plugin to manage your data tier separately from your app tier. We try to make it so we can destroy/recreate our application layer in minutes (it would still be down, which would suck, but it's easy to recover with no data loss).

vicary commented 4 years ago

True, the fact that the SAM model and CloudFormation are loosely coupled to their "supporting" products creates quite a few headaches.

I must thank you for your effort in making our lives easier; I will definitely look into the bootstrap plugin shortly!

purefan commented 4 years ago

Hope I'm not hijacking the thread, but I am also working on a custom strategy for an already deployed stack. I started with a very simple { destination: 'testing_custom_stack' }, ran DEBUG=* serverless package, and got something like this:

Serverless: [serverless-plugin-split-stacks]: Summary: 317 resources migrated in to 125 nested stacks
Serverless: [serverless-plugin-split-stacks]:    Resources per stack:
Serverless: [serverless-plugin-split-stacks]:    - (root): 125
Serverless: [serverless-plugin-split-stacks]:    - ApiGatewayMethodChatRoomGetNestedStack: 1
Serverless: [serverless-plugin-split-stacks]:    - ApiGatewayMethodChatRoomOptionsNestedStack: 1
Serverless: [serverless-plugin-split-stacks]:    - ApiGatewayMethodGitlabQueueOptionsNestedStack: 1
Serverless: [serverless-plugin-split-stacks]:    - ApiGatewayMethodGitlabQueuePostNestedStack: 1
...
Serverless: [serverless-plugin-split-stacks]:    - testing_custom_stackNestedStack: 49

I don't really understand the (root) stack. Is it the same situation as with the other stacks, i.e. that since those resources were already migrated, they retain their current stack?

dougmoscrop commented 4 years ago

Root is just the default, so either the resources were new and did not have a migration applied, or they were preexisting.