alestic / ec2-expire-snapshots

Delete expired EBS snapshots in Amazon EC2. Install on Ubuntu with: sudo add-apt-repository -y ppa:alestic && sudo apt-get update && sudo apt-get install -y ec2-expire-snapshots

wish: AWS Lambda compatibility #26

Closed markstos closed 8 years ago

markstos commented 8 years ago

Now that AWS Lambda exists and has the Scheduled Events trigger, it seems like a natural fit for running a "cron job" that expires snapshots. I see a few possible ways to do this:

I need a solution to expire EBS snapshots soon, so I may embark on one of these paths shortly. @ehammond do you have a recommendation or advice on the best way to expire snapshots these days before I spend time on one of these projects? My intent is to publish the result as open source.

Sure, I would put ec2-expire-snapshots on a t1.micro and find a way to spin it up periodically and run a cron job like this, but I loathe the idea of supporting yet-another full OS install, even if it's rarely going to be powered on.
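To make the Scheduled Events idea concrete, here is a minimal sketch of a Python Lambda handler that expires snapshots by age only. It is not how ec2-expire-snapshots works and it has none of that tool's retention rules. The `max_age_days` and `dry_run` event keys are hypothetical names chosen for illustration, and the handler assumes `boto3` is importable in the Lambda runtime.

```python
from datetime import datetime, timedelta, timezone


def expire_old_snapshots(ec2, max_age_days, now=None, dry_run=True):
    """Return IDs of self-owned snapshots older than max_age_days.

    `ec2` is expected to behave like a boto3 EC2 client. Snapshots are
    only deleted when dry_run is False.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    expired = []
    paginator = ec2.get_paginator("describe_snapshots")
    for page in paginator.paginate(OwnerIds=["self"]):
        for snap in page["Snapshots"]:
            # StartTime is a timezone-aware datetime in boto3 responses.
            if snap["StartTime"] < cutoff:
                if not dry_run:
                    ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
                expired.append(snap["SnapshotId"])
    return expired


def lambda_handler(event, context):
    # Imported lazily so the function above can be tested without boto3.
    import boto3

    ec2 = boto3.client("ec2")
    # "max_age_days" and "dry_run" are hypothetical event keys (assumption).
    max_age_days = int(event.get("max_age_days", 30))
    dry_run = bool(event.get("dry_run", True))
    return {"expired": expire_old_snapshots(ec2, max_age_days, dry_run=dry_run)}
```

Defaulting to a dry run is a deliberate choice here: a scheduled job that deletes data should have to be switched on explicitly.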

mpdude commented 8 years ago

Can't you somehow wrap up ec2-expire-snapshots directly to run under Lambda? I mean, Perl is one of the supported languages...

markstos commented 8 years ago

@mpdude I don't think Perl is one of the officially supported languages, but @ehammond reports that it /is/ installed in the environment. It could be possible to create a Lambda .zip file that contains the entire Perl dependency chain needed for ec2-expire-snapshots. I believe a thin wrapper in Node.js, Python or Java would be required to kick off the execution and pass arguments to it. That's one of the options I'm considering.

The downside to that plan is that Perl's unofficial status could mean that it's dropped later.

The upside is that it might be fast to implement for now.
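For illustration, the thin-wrapper idea could look roughly like the sketch below, written in Python. It assumes a `perl` binary is on the PATH in the Lambda environment and that the script was bundled at `./ec2-expire-snapshots` inside the .zip; that path and the `args` event key are hypothetical, not anything this project defines.

```python
import subprocess

# Hypothetical location of the bundled script inside the Lambda .zip.
SCRIPT_PATH = "./ec2-expire-snapshots"


def build_command(event):
    """Turn the Lambda event into an argv list for the Perl script."""
    # "args" is a hypothetical event key holding command-line arguments.
    args = event.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise ValueError("event['args'] must be a list of strings")
    return ["perl", SCRIPT_PATH] + args


def lambda_handler(event, context):
    # Run the Perl script as a child process and capture its output.
    result = subprocess.run(
        build_command(event),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return {
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
```

Passing the arguments as a list rather than a shell string avoids quoting problems and shell injection from the event payload.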

mpdude commented 8 years ago

Oops, disregard my stupid comment please – I mixed up Perl and Python (shame on me!)

ehammond commented 8 years ago

Yep, you can run Perl commands from node in AWS Lambda.

ehammond commented 8 years ago

...but, since this isn't a feature that will be supported inside this software project directly, I'm going to close the issue. Feel free to add a note pointing to the discussion or new project elsewhere.

markstos commented 8 years ago

I'm following up my notes on exploring using AWS Lambda for snapshot expiration.

First, I tried to package up ec2-expire-snapshots as a Lambda function, despite it not being officially supported. This involved packaging up every required Perl dependency into a .zip file. I was able to gather all the dependencies, but the result included over 3,000 files, or about 48 MB. Some of them included compiled ".xs" extensions that could be specific to the architecture and Perl version.

Then I realized I'd built this with Perl 5.18 on my laptop, while the Lambda environment has Perl 5.16 right now.

Then I gave up. With the lack of official support, it just didn't seem worthwhile to keep trying to wrangle over 3,000 files as things inevitably change over time.
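A bundle like the one described above can be sized up before uploading with a short script. This is only a sketch: it counts files, sums their sizes, and lists files that look like compiled extensions. The suffix list is an assumption (XS modules typically build to `.so` on Linux, `.bundle` on macOS, and `.dll` on Windows), and the directory you pass in is whatever you staged the dependencies into.

```python
import os

# Assumed suffixes for compiled XS output on common platforms.
COMPILED_SUFFIXES = (".so", ".bundle", ".dll")


def audit_bundle(root):
    """Summarize a staged dependency directory before zipping it."""
    file_count = 0
    total_bytes = 0
    compiled = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_count += 1
            total_bytes += os.path.getsize(path)
            if name.endswith(COMPILED_SUFFIXES):
                # Record compiled files relative to the bundle root.
                compiled.append(os.path.relpath(path, root))
    return {"files": file_count, "bytes": total_bytes, "compiled": sorted(compiled)}
```

Any entry in `compiled` is a file that would need to be rebuilt against the Perl version and architecture of the target environment.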

So I switched to looking at a solution that was native in Node.js. I found the brand new rotate-snapshot, which is built to be a Lambda function that does this. Because AWS provides the "aws-sdk" package as part of the environment, merely 190 files were needed in the Node.js package, and the resulting .zip file was only about 100k (compressed).

There were significant gaps in the documentation and no complex retention rules are supported yet. I submitted pull requests for a number of improvements, but better retention rule support is a work in progress.

Having considered both approaches, the lighter weight, Lambda-centric approach of rotate-snapshot is the one I'll be moving forward with. I hope someone ports the more robust retention rules from this project to that one.

ehammond commented 8 years ago

Thanks for following up. I wasn't able to find any snapshot expiration tool that supported the flexible rules I needed, which is why I wrote ec2-expire-snapshots. There is a lot of code reuse in Perl, so I'm not surprised it pulls in a lot of modules. Most of them would not be required because of features not used, but selective requirements based on usage is a tough problem to solve.

markstos commented 8 years ago

I think having 3,000 files in a dependency tree is reasonable in the current day and age. I just checked, and our production Node.js app has over 14,000 files in the node_modules directory.

The more vexing issue here is the presence of compiled .xs code, which creates binaries tied to a particular Perl version on a particular architecture. That adds considerable extra time to compile correctly up front, and then to maintain over time as AWS changes the provided Perl and Linux versions. Considering that Perl support is not official, I didn't want to go down that path.

Sometimes there are flags you can set to produce Pure-Perl versions of modules that also have ".xs" versions, but that's also time-consuming to figure out and not always possible.

ec2-expire-snapshots is a good fit for something like an Ubuntu PPA, which is made for distributing compiled code, but for AWS Lambda, it's not such a good fit due to the .xs code in the dependency tree and the current lack of official Perl support.

markstos commented 8 years ago

I took a look at this again today and found that I might have been up in the weeds trying to bundle all the dependencies of ec2-expire-snapshots. AWS is clear that they are running the Node.js code in an Amazon Linux container, and Amazon Linux in turn publishes a list of all the packages that are installed. The list not only includes Perl, but all the dependencies of ec2-expire-snapshots as well:

https://aws.amazon.com/amazon-linux-ami/2016.09-packages/

markstos commented 8 years ago

Following up on my latest attempt to run ec2-expire-snapshots via Lambda: I found that the "Date::Manip" module wasn't actually there. That finding is consistent with the review of the Lambda environment that @ehammond did in 2014 where he posted a list of every file on the filesystem.

While it would be possible to bundle and ship an entire Perl dependency tree, it also strikes me as painful to maintain. I'm returning my focus to finding a solution that works natively in Python or Node.js, since those are the languages that have official AWS support for Lambda now.

markstos commented 8 years ago

I made progress on having similar functionality to this project available in AWS Lambda by releasing this Node.js module which implements the backup rotation algorithm:

https://www.npmjs.com/package/grandfatherson

It still needs to be integrated with Lambda snapshot expiration logic. This project might be a good one to merge it with:

https://github.com/szkkentaro/rotate-snapshot
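For readers unfamiliar with grandfather-father-son style rotation, the general idea can be sketched as follows. This is a simplified illustration in Python, not the API of the grandfatherson module and not the exact rules of ec2-expire-snapshots: for each period type it keeps the first snapshot in each of the most recent N periods that contain a snapshot. The function name and the `keep` dictionary shape are assumptions made for the example.

```python
from datetime import datetime


def _period_key(ts, period):
    """Map a timestamp to a sortable key identifying its period."""
    if period == "daily":
        return (ts.year, ts.month, ts.day)
    if period == "weekly":
        iso = ts.isocalendar()
        return (iso[0], iso[1])  # ISO year, ISO week number
    if period == "monthly":
        return (ts.year, ts.month)
    raise ValueError("unknown period: %r" % (period,))


def snapshots_to_keep(timestamps, keep):
    """Return the set of timestamps to retain.

    `keep` maps a period name to how many recent periods to retain,
    e.g. {"daily": 7, "weekly": 4, "monthly": 12}.
    """
    kept = set()
    for period, count in keep.items():
        if count <= 0:
            continue
        first_in_period = {}
        for ts in sorted(timestamps):
            # setdefault keeps the earliest timestamp seen in each period.
            first_in_period.setdefault(_period_key(ts, period), ts)
        for key in sorted(first_in_period)[-count:]:
            kept.add(first_in_period[key])
    return kept


snaps = [
    datetime(2016, 1, 1, 0), datetime(2016, 1, 1, 12),
    datetime(2016, 1, 2, 0),
    datetime(2016, 1, 3, 6), datetime(2016, 1, 3, 18),
]
keep = snapshots_to_keep(snaps, {"daily": 2, "monthly": 1})
expired = sorted(set(snaps) - keep)
```

Anything not in the returned set is a candidate for deletion. In the small example, `expired` holds the two later-in-the-day snapshots from January 1 and January 3.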

markstos commented 8 years ago

I've now published my Lambda-based snapshot expiration tool:

https://github.com/RideAmigosCorp/lambda-expire-snapshots