alestic / ec2-consistent-snapshot

[SUNSET] Initiate consistent EBS snapshots in Amazon EC2
http://alestic.com/2009/09/ec2-consistent-snapshot

Support for AWS nvme devices in volume discovery #98

Closed nikolai-derzhak-distillery closed 4 years ago

nikolai-derzhak-distillery commented 6 years ago

EBS discovery fails on m5.large with the error below.

I suspect we need to add support for /dev/nvme* block device names here:

https://github.com/alestic/ec2-consistent-snapshot/blob/master/ec2-consistent-snapshot#L527-L529

Once I figure out a fix in the code, can I propose a PR? I have never coded in Perl, though :)

So I will most probably hardcode it.

ec2-consistent-snapshot.bin: ERROR: Unable to determine volume id for device nvme1n1 in mount /dat

nikolai-derzhak-distillery commented 6 years ago

It seems that with NVMe the block device names can be anything and are simply converted to nvme${i}n${j}.

It looks like the order of the devices defines the indexes i and j there. Hmmm.

ec2-consistent-snapshot.bin: Thu May 17 23:42:25 2018: Found EBS block devices for i-060ba628ed66c0ff9: 
ec2-consistent-snapshot.bin: Thu May 17 23:42:25 2018:     vol-0c603032c23d6ce66 /dev/sda1
ec2-consistent-snapshot.bin: Thu May 17 23:42:25 2018:     vol-0f7b406482deeec9e /dev/sdm
ec2-consistent-snapshot.bin: Thu May 17 23:42:25 2018:     vol-0b68c200e6040046c /dev/sdn
ec2-consistent-snapshot.bin: Thu May 17 23:42:25 2018: Seems to be a regular block device

nikolai-derzhak-distillery commented 6 years ago

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nvme-ebs-volumes.html#identify-nvme-ebs-device

lsblk could do it:

# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nvme1n1     259:0    0   30G  0 disk /data
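
The AWS page linked above describes the EBS volume ID as the serial number of the NVMe device, written without the hyphen. A minimal sketch of reading it back, assuming the serial is exposed at /sys/block/<name>/device/serial (the sysfs layout can differ between kernels, and the function names here are hypothetical, not part of ec2-consistent-snapshot):

```python
from pathlib import Path
from typing import Optional


def serial_to_volume_id(serial: str) -> Optional[str]:
    """Turn an NVMe serial like 'vol0c603032c23d6ce66' into 'vol-0c603032c23d6ce66'."""
    serial = serial.strip()  # sysfs values may carry padding and a newline
    if serial.startswith("vol-"):
        return serial
    if serial.startswith("vol") and len(serial) > 3:
        return "vol-" + serial[3:]
    return None  # not an EBS volume serial (for example, instance store)


def volume_id_for_device(name: str, sysfs_root: str = "/sys/block") -> Optional[str]:
    """Look up the EBS volume ID for a kernel device name such as 'nvme1n1'.

    Assumption: the serial lives at <sysfs_root>/<name>/device/serial.
    """
    serial_file = Path(sysfs_root) / name / "device" / "serial"
    try:
        return serial_to_volume_id(serial_file.read_text())
    except OSError:
        return None  # device missing or serial not readable
```

On an instance this might be called as `volume_id_for_device("nvme1n1")`; the result can then be compared with the volume IDs the script already prints in its debug log.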
nikolai-derzhak-distillery commented 6 years ago

# We cannot freeze the FS in a container (unless it is privileged, which is insecure),
# but that is fine: we use the MongoDB fsyncLock call, which makes sure the data is consistent since v3.2.
# https://docs.mongodb.com/manual/tutorial/backup-with-filesystem-snapshots/
#
# So we use the --no-freeze-filesystem option and point it to /etc/hostname.
# In containers this gives the snapshot script some valid device (mount check),
# which we replace with the patch anyway.

/ec2-consistent-snapshot.bin --use-iam-role --no-freeze-filesystem /etc/hostname --region=us-west-2 --debug --signature-version 4 --noaction

  for my $device (@devices) {
      my ($ec2_device, $ec2_device_2, $ec2_device_3, $volume_id);
      $ec2_device = $ec2_device_2 = $ec2_device_3 = "/dev/$device";

      warn "$ec2_device $ec2_device_2 $ec2_device_3";

      # ugly hardcode of the device name we always use, as it is hard to map NVMe devices by name
      $ec2_device = "/dev/sdm";
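
An alternative to hardcoding /dev/sdm would be to go through the volume ID: once the volume ID behind the kernel device is known (the AWS page linked earlier describes reading it from the NVMe serial number), the EC2 device name can be looked up in the instance's block device mappings. A rough sketch in Python rather than Perl, with a hypothetical function name and the mappings copied from the debug log above:

```python
from typing import Dict, Optional


def ec2_device_for_volume(volume_id: str, mappings: Dict[str, str]) -> Optional[str]:
    """Return the EC2 device name (e.g. '/dev/sdm') that volume_id is attached as."""
    for ec2_device, mapped_volume in mappings.items():
        if mapped_volume == volume_id:
            return ec2_device
    return None  # volume is not attached to this instance


# Block device mappings as printed in the debug log earlier in this thread.
mappings = {
    "/dev/sda1": "vol-0c603032c23d6ce66",
    "/dev/sdm": "vol-0f7b406482deeec9e",
    "/dev/sdn": "vol-0b68c200e6040046c",
}
```

With these mappings, `ec2_device_for_volume("vol-0f7b406482deeec9e", mappings)` returns "/dev/sdm" without the name being fixed in the code.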
nikolai-derzhak-distillery commented 6 years ago

We obviously lack support here for Docker containers and for the new AWS NVMe devices.

markstos commented 6 years ago

A pull request is welcome for NVME support. Please open a separate issue if something needs to be updated related to Docker.

tavisma commented 6 years ago

I have a patch which enables NVMe support; I will clean it up a bit and submit it.

tavisma commented 6 years ago

It's quick and dirty, but it works reliably in all of our environments: https://github.com/alestic/ec2-consistent-snapshot/pull/100

nikolai-derzhak-distillery commented 6 years ago

Looks good! It just needs the nvme tool to be installed.

nikolai-derzhak-distillery commented 6 years ago

I will give it a try this weekend, as I am going to set up a couple of backups.

macropin commented 5 years ago

@tavisma this does not seem to handle NVME backed LVM filesystems.


ec2-consistent-snapshot: Tue Oct 16 02:35:01 2018: Seems to be a regular block device
ec2-consistent-snapshot: ERROR: Unable to determine volume id for device dm-0 in mount /mnt/data00
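
For context, dm-0 is a device-mapper node, so it has no volume ID of its own; the disks underneath it are listed in /sys/block/dm-0/slaves. A minimal sketch of walking that directory down to the underlying devices (an illustration only, not how the PR does it; the function name is hypothetical):

```python
import os
from typing import List


def underlying_devices(name: str, sysfs_root: str = "/sys/block") -> List[str]:
    """Resolve a block device such as 'dm-0' to the devices it is built on.

    Devices with no entries under <sysfs_root>/<name>/slaves are returned as-is.
    A partition (e.g. 'nvme1n1p1') has no top-level sysfs entry, so it is also
    returned unchanged.
    """
    slaves_dir = os.path.join(sysfs_root, name, "slaves")
    try:
        slaves = sorted(os.listdir(slaves_dir))
    except OSError:
        slaves = []
    if not slaves:
        return [name]
    result: List[str] = []
    for slave in slaves:
        # Recurse, because a slave can itself be a device-mapper or md device.
        for dev in underlying_devices(slave, sysfs_root):
            if dev not in result:
                result.append(dev)
    return result
```

Each name returned by `underlying_devices("dm-0")` can then go through the normal volume ID lookup.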
tavisma commented 5 years ago

Hmm, we use LVM on all our systems without issue. Can you attach a debug log of the run?

tavisma commented 5 years ago

I may have missed a check for NVMe device types; I'll update my PR tomorrow.

tavisma commented 5 years ago

This is quick and dirty and I'll clean it up tomorrow, but see if this works for now: patch.txt

tavisma commented 5 years ago

I merged a more complete patch into my PR: https://github.com/alestic/ec2-consistent-snapshot/pull/100

macropin commented 5 years ago

@tavisma thanks. The PR is working for me now with LVM volumes. However, I now have another issue: Btrfs volumes are not working.

+ /opt/bin/ec2-consistent-snapshot --debug --region ap-southeast-2 --use-iam-role --freeze-filesystem /exports/nfs00
ec2-consistent-snapshot: Authenticating with IAM role
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: No volume ids specified; discovering volume ids
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: Discovering volume ids for: /exports/nfs00
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: Determining instance id
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: create EC2 object
ec2-consistent-snapshot: Endpoint: https://ec2.ap-southeast-2.amazonaws.com
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: Fetching instance description for i-07232556dbfcf69bb
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: Found EBS block devices for i-07232556dbfcf69bb: 
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018:     vol-0aab33ee6ed28422a /dev/sda1
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018:     vol-04b422a61d8c2c3cc /dev/sdf
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018:     vol-0e2c0bb356bfc1160 /dev/sdg
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: Seems to be a regular block device
Use of uninitialized value $_[0] in substitution (s///) at /usr/share/perl5/File/Basename.pm line 341.
fileparse(): need a valid pathname at /opt/bin/ec2-consistent-snapshot line 515.
ec2-consistent-snapshot: Thu Oct 18 10:30:38 2018: done
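
The fileparse() failure suggests that a device lookup returned nothing and the empty value was passed on; the log does not show why that happens for the Btrfs mount. As a sketch of the kind of guard that would turn this into an explicit error, assuming the device is looked up in /proc/mounts-style text (hypothetical function names, not the script's actual code):

```python
from typing import Optional


def device_for_mount(mount_point: str, mounts_text: str) -> Optional[str]:
    """Find the device mounted at mount_point in /proc/mounts-style text.

    Note: /proc/mounts escapes spaces in paths as \\040; this sketch ignores that.
    """
    wanted = mount_point.rstrip("/") or "/"
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == wanted:
            device = fields[0]  # a later mount shadows an earlier one
    return device


def require_device(mount_point: str, mounts_text: str) -> str:
    """Like device_for_mount, but fail loudly instead of returning nothing."""
    device = device_for_mount(mount_point, mounts_text)
    if device is None:
        raise LookupError(f"no device found for mount point {mount_point}")
    return device
```

Reading the real table would be `require_device("/exports/nfs00", open("/proc/mounts").read())`.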
macropin commented 5 years ago

Why is LVM device discovery limited to xvd[a-f]? https://github.com/alestic/ec2-consistent-snapshot/pull/100/files#diff-89297657477d72aef45312d550a3c63bR508
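
To illustrate what a less restrictive match could look like, here is a hypothetical pattern (not the one in the PR) that accepts sd*, xvd*, and NVMe names and strips the partition suffix:

```python
import re
from typing import Optional

# Hypothetical pattern: sdX / xvdX with optional partition digits,
# or nvme<i>n<j> with an optional p<k> partition suffix.
DEVICE_RE = re.compile(
    r"^(?:/dev/)?"
    r"(?:(?P<scsi>(?:sd|xvd)[a-z]+)\d*"
    r"|(?P<nvme>nvme\d+n\d+)(?:p\d+)?)$"
)


def base_disk(device: str) -> Optional[str]:
    """Return the whole-disk name for a device or partition, or None if unrecognised."""
    match = DEVICE_RE.match(device)
    if match is None:
        return None
    return match.group("scsi") or match.group("nvme")
```

For example, `base_disk("/dev/xvdz")` gives "xvdz" and `base_disk("/dev/nvme1n1p2")` gives "nvme1n1", while a device-mapper name such as "dm-0" is rejected and would still need separate handling.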

ajjl commented 5 years ago

@macropin related issue: https://github.com/alestic/ec2-consistent-snapshot/issues/108

markstos commented 5 years ago

@macropin, @ajjl The project long predates the existence of NVMe. Yes, fixing the regex might fix it.

This project is no longer being maintained, though. The primary author finds that modern filesystems provide enough crash resistance to not require a "consistent" snapshot. As co-maintainer, I tend to agree, but still like the idea of taking consistent snapshots for extra peace of mind. However, AWS does not support the Perl stack this is built on, and we've had related debugging and packaging problems, making this project expensive to maintain for marginal benefit.

I created a version in bash that's inspired by this, but is simpler and uses a supported stack. One day I may also follow in @ehammond's footsteps and decide that taking 'consistent' snapshots is no longer necessary.

https://github.com/RideAmigosCorp/ec2-consistent-snapshot.sh

markstos commented 5 years ago

@ehammond Perhaps we should update the top of the README to note the support status has changed.

ehammond commented 5 years ago

@markstos Excellent idea.

markstos commented 5 years ago

@ehammond Here's my proposed note: https://github.com/alestic/ec2-consistent-snapshot/blob/master/README.md I meant to make a Pull Request, but it looks like I made a direct commit instead. Either way, it's easily revised.

markstos commented 5 years ago

@ehammond Here's perhaps a nail in the coffin of this project: AWS Backup is an official AWS backup solution. It shows no concern for requiring a tool like this. The service is essentially free, as the only cost is the cost of the snapshots. It handles expiration and rotation as well.

tavisma commented 5 years ago

We've moved away from both this project and the rewrite, towards the AWS Backup service.

ehammond commented 4 years ago

Thanks for submitting this. Unfortunately, this project is no longer under development in this repo. Anybody is welcome to fork the project and continue development if there is interest.