Open sturdiva opened 5 years ago
Hit this exact problem right now. Had to do an emergency volume growth on a running system, but now I can't retroactively update the cloudformation stack to match the new volume config since it wants to trigger an instance replacement.
Not that it resolves the issue above, and it is a bit involved, but you could use the recently released import feature to a) change the retention policy of the instance to Retain first, b) delete the instance from the stack (while retaining it), and c) reimport it into the stack. It should have the effect of syncing up with your change (or, said another way, remediating the configuration drift). More info on the import feature here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resource-import.html - and you could use https://former2.com/ to help generate the required template snippet. Worth experimenting, but do it outside of production first :)
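The three steps above can be sketched as template edits. This is only a rough illustration, not a tested procedure: the logical IDs `AppServer` and `AppLogs`, the AMI ID, the instance ID, and the helper names are all hypothetical. The entry built in step c follows the shape that an IMPORT change set expects in its `ResourcesToImport` list.

```python
import copy
import json


def retain_resource(template, logical_id):
    """Step a: copy the template and set DeletionPolicy: Retain on the resource."""
    updated = copy.deepcopy(template)
    updated["Resources"][logical_id]["DeletionPolicy"] = "Retain"
    return updated


def remove_resource(template, logical_id):
    """Step b: copy the template without the resource (the instance itself is retained)."""
    updated = copy.deepcopy(template)
    del updated["Resources"][logical_id]
    return updated


def import_entry(template, logical_id, instance_id):
    """Step c: describe the existing instance for an IMPORT change set."""
    return {
        "ResourceType": template["Resources"][logical_id]["Type"],
        "LogicalResourceId": logical_id,
        "ResourceIdentifier": {"InstanceId": instance_id},
    }


# Hypothetical starting template; a second resource keeps the stack non-empty in step b.
template = {
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-00000000",
                "BlockDeviceMappings": [
                    {"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 30}}
                ],
            },
        },
        "AppLogs": {"Type": "AWS::Logs::LogGroup"},
    }
}

retained = retain_resource(template, "AppServer")   # deploy this first
removed = remove_resource(retained, "AppServer")    # then deploy this

# The template used for the import carries the size the volume actually has now.
import_template = copy.deepcopy(retained)
ebs = import_template["Resources"]["AppServer"]["Properties"]["BlockDeviceMappings"][0]["Ebs"]
ebs["VolumeSize"] = 100
resources_to_import = [import_entry(import_template, "AppServer", "i-0123456789abcdef0")]
print(json.dumps(resources_to_import, indent=2))
```

Each intermediate template would be deployed as its own stack update, with the import done last through a change set of type IMPORT.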
Great suggestion @luiseduardocolon I’ll try it out in a separate account
Based on initial testing I can confirm that the workaround from @luiseduardocolon using a resource import works.
@PatMyron @luiseduardocolon - Do you have a chance to look at this? - Our team has been constantly challenged and overworked by this limitation of BlockDeviceMapping: VolumeSize via CFN-change triggers an EC2 instance replacement.
Could you possibly already classify this on the CloudFormation Roadmap? If you would like more details, please let me know. We would love to share our experience. Thanks!
I just stumbled upon this while looking for something else, and I can't believe this is how it works. There also seems to be no indication of this behaviour in the docs, where Ebs.VolumeSize (for example) says "Update requires: No interruption".
Is there a workaround for this using just CloudFormation? e.g. Will creating a Volume and attaching it to the Instance create a similar setup but without the recreate-on-resize behaviour?
The docs are unclear here: if you look at BlockDeviceMappings it will show "Some interruptions", but the paragraph above has:
After the instance is running, you can modify only the DeleteOnTermination settings of the attached EBS volumes.
Changing Iops should be included as well.
We need this feature, too. +1
Having the same issue with the root volume. CFT is trying to replace the EC2 instance. Any timelines on fix please.
Changing Throughput (for gp3) should not trigger a replacement either. But that only applies once Throughput is supported on AWS::EC2::Instance Ebs (https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-blockdev-template.html) in the first place. See #824
This has been waiting for a fix since 2019. Could I please know if there is any update or timeline? Thanks.
It would be very helpful if the docs were updated to reflect the current behavior until this is implemented. Even if we include the "Some Interruptions" annotation on BlockDeviceMappings, it still doesn't cover what actually happens, which is instance replacement.
I read "Some Interruptions" on BlockDeviceMappings as "adding or removing mappings may temporarily interrupt the instance", which is not the case.
Absolutely unacceptable that this is not at least documented in the CloudFormation spec. This could lead to live production instances being deleted if users don't bother to run the CFN update in a pre-production environment beforehand. And it forces other users to use the retain/import workaround to be able to use CFN after the update.
We really need this fixed, we are trying to push our users to use infrastructure-as-code to make managing our servers easier. When they tell us that updating the root volume causes complete redeploys of machines, it makes all the sense in the world that they just want to use the console and ignore the infrastructure as code.
If a user submits a ticket to our IT system to increase their root volume size, and someone else trying to be helpful grabs the ticket and updates the cloudformation stack without knowing this "gotcha" -- boom, angry user
Our workaround has been to deploy everything with cloudformation, but root volumes aren't resized with cfn - we have about 4 cli commands that run after the cfn script to decide whether the root volume should be increased or not based on some variables. That works for us because we have a deployment pipeline, but for someone who runs pure cfn in the console that's not a viable workaround.
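A post-deploy check like the one described above might look roughly like the sketch below. The function name, the volume ID, and the sizes are hypothetical, and the actual CLI commands used by the poster are not shown in the thread. The returned dictionary uses the parameter names of the EC2 ModifyVolume API, so with boto3 it could be passed as `ec2.modify_volume(**params)`.

```python
def plan_root_resize(volume_id, current_gib, desired_gib):
    """Return ModifyVolume parameters, or None when the volume is already the right size."""
    if desired_gib < current_gib:
        # EBS volumes can be grown in place but not shrunk.
        raise ValueError(
            f"cannot shrink {volume_id} from {current_gib} GiB to {desired_gib} GiB"
        )
    if desired_gib == current_gib:
        return None
    return {"VolumeId": volume_id, "Size": desired_gib}


# Hypothetical values; in a pipeline these would come from the deployment variables
# and from a lookup of the instance's current root volume.
params = plan_root_resize("vol-0123456789abcdef0", current_gib=30, desired_gib=100)
print(params)
```

Keeping the decision separate from the API call makes it easy to run the check on every deployment and only issue a resize when the size actually changed.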
I reported this issue to support more than 2 years ago. I can't believe this still hasn't been fixed.
Sometimes it makes me wonder.
Amazon, for God's sake, could you please slow down new feature and product development and instead use that time to fix the stuff we really need that doesn't work? Over the years, I have probably reported more than 25 problems where something isn't supported or has issues in CloudFormation. It makes us waste time. I absolutely love AWS, but I hate it every time I need to do a workaround, do stuff in the console instead of CloudFormation, or get told "It doesn't work in CloudFormation, but you can write a Lambda function in CloudFormation that will do what you want"... I just want CloudFormation to work. Is that too much to ask when we pay thousands of dollars per month for the service?
Please fix these things
Any updates?
Hi there,
Any update on this, please? It's been nearly 3 years since this case was opened.
This one really hurts a lot when we try to show people the benefits of the Cloud, IaC and good practises.
+1 we also need this feature
+1 Waiting for this too.
+1 CF and console consistency would be great!
Any updates or workarounds on this issue? We would really like to be able to resize an EBS volume through CloudFormation without a replacement.
@wcoleman You can create a separate AWS::EC2::Volume resource and then an AWS::EC2::VolumeAttachment resource instead of embedding the volume definitions in the instance definition. The Volume type supports resizing without interruption. I've only used this for non-root volumes, so I'm not sure if there are any gotchas when trying to use this approach for root volumes.
This will work for non-root volumes (and is exactly the process we make use of), but does not work for root volumes, which must be specified (and managed) via the BlockDeviceMappings section of the AWS::EC2::Instance.
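For non-root volumes, the separate-resource layout discussed above could look something like this. It is a sketch only: the logical IDs `DataVolume`, `DataVolumeAttachment`, and `AppServer`, the device name, and the sizes are assumptions, and the template is built as a Python dictionary purely so it can be printed as JSON.

```python
import json


def data_volume_resources(instance_logical_id, size_gib, device="/dev/sdf"):
    """Build a standalone volume plus its attachment for an instance in the same template."""
    return {
        "DataVolume": {
            "Type": "AWS::EC2::Volume",
            "Properties": {
                # The volume must live in the same Availability Zone as the instance.
                "AvailabilityZone": {"Fn::GetAtt": [instance_logical_id, "AvailabilityZone"]},
                "Size": size_gib,
                "VolumeType": "gp3",
            },
        },
        "DataVolumeAttachment": {
            "Type": "AWS::EC2::VolumeAttachment",
            "Properties": {
                "InstanceId": {"Ref": instance_logical_id},
                "VolumeId": {"Ref": "DataVolume"},
                "Device": device,
            },
        },
    }


resources = data_volume_resources("AppServer", size_gib=100)
print(json.dumps(resources, indent=2))
```

With this layout a size change only touches the `Size` property of the `AWS::EC2::Volume` resource, leaving the instance definition unchanged. As noted in the thread, it does not cover the root volume.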
Was just looking at the docs today, and apparently they have added a warning about the replacement, which is definitely an improvement from the first time I ran into this 🤷♀️.
Terraform manages to resize root volumes without terminating the instance or having to create a new volume. I'm trying to migrate to CloudFormation, but issues like this are surprising considering this is the native IaC solution for AWS.
I have discovered that even if an instance is stopped, updating the VolumeSize will cause the entire EC2 instance to be updated. This needs to be addressed.
yeah, my team also moved to terraform. good bye cfn.
AWS is sometimes pathetic... 2 years and counting... for such an important issue. It's ironic that Terraform handles this well when AWS cannot even get their own things working
I'm hitting this now as well using CDK. I too had to do an emergency resize of the volume - CDK can't do it so it was done manually. Now CDK is out of sync. I'm trying to test out the import procedure noted above after creating a whole new test stack.
We recently started to lean in to CDK instead of Terraform - not a good experience in this particular case.
The fun part about this is that CDK's story for how to resolve drift is... very bad.
+1 just ran into this and had to debug one delta at a time in the CFT until I was left with just the root volume resize which can be easily done outside of CFTs without taking the system down :(
Never worked for AWS. So this is only my speculation.
The fundamental problem here is that logical id only exists for the top-level resources (i.e. keys for the "Resources"). https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resources-section-structure.html
Whenever CloudFormation successfully creates a stack, it keeps track of the physical IDs (ARNs) created and the associated logical IDs in its own database. This mapping is required for the engine to calculate the difference in subsequent updates as well as to detect any drift. When an EBS volume is defined inside the BlockDeviceMappings, that EBS volume doesn't have a logical ID anymore, simply because it's not a direct child of "Resources", hence there is no mapping available. Consequently, there is no way in future updates for the engine to tell which EBS volume is associated with what is defined in the CFN template. (BlockDeviceMappings is a list, so there is no way to tell the exact changes that have been made without some assumptions. For example, if the engine sees two EBS volumes in the template but only one actually exists on the system, which one is it?)
A sensible approach to fix this problem is to make BlockDeviceMappings a key/value dictionary instead of a list. Each volume can then be logically identified. But that's probably not a completely backward-compatible change. Also, I suspect the assumption that logical IDs are the top-level "Resources" keys is deeply ingrained, and nobody wants to take the responsibility for making such a large change internally.
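To illustrate the list-versus-dictionary point, here is a small sketch of the two ways of comparing mappings. It only mirrors the speculation above and says nothing about how CloudFormation works internally; the function names and sample volumes are made up, and using the device name as the dictionary key is an assumption.

```python
def diff_by_position(old, new):
    """Compare two mapping lists index by index and return the indexes that differ."""
    longest = max(len(old), len(new))
    return [
        i for i in range(longest)
        if i >= len(old) or i >= len(new) or old[i] != new[i]
    ]


def diff_by_key(old, new):
    """Compare two {key: settings} dictionaries and name the change for each key."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            changes[key] = "removed"
        elif key not in old:
            changes[key] = "added"
        elif old[key] != new[key]:
            changes[key] = "modified"
    return changes


old_list = [
    {"DeviceName": "/dev/xvda", "VolumeSize": 30},
    {"DeviceName": "/dev/sdf", "VolumeSize": 100},
]
# Same two volumes, reordered, with the first one grown to 50 GiB.
new_list = [
    {"DeviceName": "/dev/sdf", "VolumeSize": 100},
    {"DeviceName": "/dev/xvda", "VolumeSize": 50},
]

positional = diff_by_position(old_list, new_list)
keyed = diff_by_key(
    {m["DeviceName"]: m["VolumeSize"] for m in old_list},
    {m["DeviceName"]: m["VolumeSize"] for m in new_list},
)
print(positional)  # every position looks changed
print(keyed)       # only the resized volume is reported
```

The positional comparison flags both entries even though only one volume changed, while the keyed comparison pins the change to a single identified volume.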
This is a major problem.
Yes, there are AWS::EC2::Volume and AWS::EC2::VolumeAttachment. But
On a "positive" note, it seems Terraform has exactly the same problem. From Terraform's documentation:
Currently, changes to the ebs_block_device configuration of existing resources cannot be automatically detected by Terraform. To manage changes and attachments of an EBS block to an instance, https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/instance#ebs-ephemeral-and-root-block-devices
The difference is that Terraform chooses to ignore any difference rather than replace the instance. So well, if your biggest competitor has the same problem then...
+1 Would really like this feature rather than a manual change.
+1 really need this..
We need this too
+1 need this too
+1. It's an important feature.
Upsizing the root EBS volume in the AWS console doesn't replace the EC2 instance at all. I guess AWS could do the same magic behind the scenes when we upsize the root volume through CloudFormation or CDK.
It's absurd that this flawed, poor design hasn't been addressed in almost 4 years. People blindly trusting CDK to accomplish stuff via pipelines are in for a rude awakening. The CDK drift correction process is risky and overly complicated. Everything about this is bad.
+1 need this too
+1 Please urgently address this issue; it is unexpectedly breaking entire application deployments!
1. AWS::EC2::Instance -BlockDeviceMapping-Ebs should allow volume changes without replacing the instance
2. Scope of request
It should be possible to change EBS volume attributes (such as VolumeSize or VolumeType) for volumes specified in the BlockDeviceMappings property of an AWS::EC2::Instance resource without re-creating the instance. For example, being able to re-size (via CloudFormation) the root volume of an instance.
This is currently possible via the API/Console, but not via CloudFormation.
3. Expected behavior
On modification of (at least) the VolumeSize and VolumeType properties, the AWS::EC2::Instance resource should not be re-created, just the underlying volume properties modified.
4. Suggest specific test cases
Common use case: pass VolumeSize and VolumeType parameters as a string, or as a !Ref
5. Helpful Links to speed up research and evaluation
https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_ModifyVolume.html
6. Category (required) - Will help with tagging and be easier to find by other users to +1
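The test case from section 4 could be set up with a template along these lines. This is a minimal sketch with assumed names: the parameters `RootVolumeSize` and `RootVolumeType`, the logical ID `AppServer`, the AMI ID, the instance type, and the device name are all placeholders. It is written as a Python dictionary and dumped to JSON so that `{"Ref": ...}` plays the role of `!Ref`.

```python
import json


def build_template():
    """Build a template whose root volume size and type come from stack parameters."""
    return {
        "Parameters": {
            "RootVolumeSize": {"Type": "Number", "Default": 30},
            "RootVolumeType": {"Type": "String", "Default": "gp3"},
        },
        "Resources": {
            "AppServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": "ami-00000000",
                    "InstanceType": "t3.micro",
                    "BlockDeviceMappings": [
                        {
                            "DeviceName": "/dev/xvda",
                            "Ebs": {
                                "VolumeSize": {"Ref": "RootVolumeSize"},
                                "VolumeType": {"Ref": "RootVolumeType"},
                            },
                        }
                    ],
                },
            }
        },
    }


template = build_template()
print(json.dumps(template, indent=2))
```

Updating the stack with a larger `RootVolumeSize` is the change that, per this request, should modify the volume in place rather than replace the instance.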