lserman / capistrano-elbas

Deploy Rails apps to AWS AutoScale groups
MIT License

can aws_no_reboot_on_create_ami be set to false? #22

Open augustosamame opened 7 years ago

augustosamame commented 7 years ago

I am about to start using elbas as it looks like a great way to automate our autoscaling deployments. However, I notice that you recommend this setting:

set :aws_no_reboot_on_create_ami, true

Creating an AMI without reboot does not guarantee that the AMI will be in a consistent state. There are warnings about this throughout the AWS docs.

I would think that since this is an autoscaling group, it's really not a big deal for a single instance to go down while creating the AMI since the ELB will route traffic to the other instances in the meantime.

Any particular reason for not rebooting the instance to create the AMI with elbas?

augustosamame commented 7 years ago

As a follow-up to this question, I am indeed having issues with the no_reboot setting set to true. The first deploy went fine, but the second deploy is failing with:

      02 git remote update --prune
      02 Fetching origin
      02 error: object file ./objects/0e/7793dd051ee208ff891f36729d95a4c193376b is empty
      02 error: object file ./objects/0e/7793dd051ee208ff891f36729d95a4c193376b is empty
      02 fatal: loose object 0e7793dd051ee208ff891f36729d95a4c193376b (stored in ./objects/0e/7793dd051ee208ff891f36729d95a4c193376b) is corrupt
      02 error: Could not fetch origin

It seems this error happens when git directories get corrupted, usually during crashes. I think in this case no_reboot = true is causing these errors.

So I'm going back to my previous AMI to see if I can reproduce / fix the issue with no_reboot = false.

UPDATE: setting aws_no_reboot_on_create_ami to false fixed the git corruption issue, and everything else seems to work fine.
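
For anyone landing here, the change described in the update comes down to one line of Capistrano configuration. A minimal sketch, assuming the setting sits in config/deploy.rb next to the other elbas options and that the gem version in use honors it:

```ruby
# config/deploy.rb (assumed location)
# Let EC2 reboot the instance while the AMI is created, so the filesystem
# is in a consistent state when the snapshot is taken. The instance is
# briefly unavailable; the other instances behind the ELB keep serving.
set :aws_no_reboot_on_create_ami, false
```

The trade-off is a short outage of the instance being imaged on every deploy.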

DiegoRBaquero commented 6 years ago

Thank you for this issue! My git repository was also getting corrupted.

twohlix commented 5 years ago

You are correct @augustosamame, it should be a straightforward PR to enable that if you would like.

We have seen that delaying for a few seconds before the elbas:deploy task can also help with that.
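
A delay like this could be wired in as a small task hooked before elbas:deploy. This is only a sketch: the `elbas_prep` namespace, the `pause` task name, and the 5-second value are made up for illustration, and a fixed delay reduces the risk rather than guaranteeing that pending writes have reached the disk.

```ruby
# config/deploy.rb (assumed location)
namespace :elbas_prep do
  desc "Wait a few seconds before the AMI is created (hypothetical task)"
  task :pause do
    # Illustrative value only; tune it for your environment.
    sleep 5
  end
end

# Run the pause right before elbas creates the AMI.
before "elbas:deploy", "elbas_prep:pause"
```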

dgarwood commented 4 years ago

@augustosamame I found that this was caused by data not yet being written to disk when the snapshot used for the AMI was created.

I found that a simple execute "sync" resolved the issue when added as a task and run before the elbas:deploy task, since it forces flushing all changes to disk.
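
A rough sketch of what such a task might look like in Capistrano 3. The `elbas_prep` namespace and `sync_disks` task name are placeholders, and `roles(:all)` is an assumption; it could be narrowed to whichever server the AMI is created from.

```ruby
# config/deploy.rb (assumed location)
namespace :elbas_prep do
  desc "Flush pending filesystem writes to disk (hypothetical task)"
  task :sync_disks do
    on roles(:all) do
      # Runs the standard Unix `sync` command on each remote server.
      execute :sync
    end
  end
end

# Flush to disk right before elbas snapshots the instance.
before "elbas:deploy", "elbas_prep:sync_disks"
```

This assumes `sync` is on the remote PATH, which holds on typical Linux hosts.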

twohlix commented 4 years ago

@dgarwood excellent info. The bigger question is whether it would be appropriate to bring something like that into capistrano-elbas. Different Capistrano deployment environments may or may not have sync available (Windows, for example), and it may be too heavy-handed for a gem to call sync on people's behalf.

I should probably put something about this in the setup section of the README.

dgarwood commented 4 years ago

@twohlix looks like there's something similar for Windows: https://docs.microsoft.com/en-us/sysinternals/downloads/sync

I don't look at it from a "heavy-handedness" perspective so much as from an "I want to pick up this tool and have it save me time" perspective. Anything I have to troubleshoot just to learn how it should be configured fails that test. The first step towards winning that battle is good docs.

Ideally, I would want this to run sync by default, since we already know that no_reboot = true causes corrupt snapshots. That lowers the barrier to entry for folks while we look for a better solution. Then we can write clear docs on what the gem's options are and what they do, and allow setting them differently if and when needed.

yashdave00 commented 2 years ago

We also tried adding a delay and had to patch the gem a little for our use case. I would really like to know what causes this delay, since EBS snapshots are said to be incremental and should not take 30-40 seconds for one extra release folder (around 15 MB).