rackerlabs / lambda-uploader

Helps package and upload Python lambda functions to AWS
Apache License 2.0

Allow ignoring of library files too #112

Open declension opened 7 years ago

declension commented 7 years ago

First, thanks for a great project, been very useful.

Proposal

For small lambdas in particular, it would be useful to either:

Naturally this should be optional as virtualenv is necessary if you're not running on the standard Amazon setup, etc.

Example

A small Python lambda function, consisting of a few source files, was uploading very quickly.

By adding a single pip requirement (also, as it happens, very small), and rebuilding, the size of the upload suddenly shot up by ~1000% (many megabytes) - on further examination this was due to the inclusion in the zip of:

Thanks!

martinb3 commented 7 years ago

@declension I'm surprised by all the items you're listing as having been included in your zip file. How is, e.g., Python itself getting into the virtualenv's site-packages directory? Maybe you could show us an example?

(I should note that you can also always skip the automatic virtualenv and maintain the necessary dependency files yourself, to control exactly what goes into the zip file.)

declension commented 7 years ago

Thanks @martinb3. Perhaps it's a misconfiguration on my part (trying to find an older example now) - yes, the Python itself was a mistake (will update above), sorry.

And agree, could go for manual maintenance of dependencies, but I like pip a lot :smile: ... so I guess it'd be nice to keep using this whilst shedding upload weight especially on my slow connection.

Either way I patched in my fork and it's proving useful (for me)...

declension commented 7 years ago

To follow up (and I may still be doing something wrong), I recreated a cut-down version of that project (no virtualenv), with one dependency (pylms FWIW) and ran:

$ lambda-uploader --no-upload --no-virtualenv

Here is a listing of the lambda_function.zip contents.

HTH
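
For anyone wanting to produce a similar listing, here is a minimal sketch that groups the contents of a built zip by top-level entry and totals the uncompressed size, which makes it easy to see whether packages such as pip or setuptools dominate the upload. The function names are hypothetical and not part of lambda-uploader; only the standard-library zipfile module is used.

```python
import zipfile
from collections import defaultdict


def size_by_top_level(zip_path):
    """Return {top-level entry: total uncompressed bytes} for a zip archive."""
    totals = defaultdict(int)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # "pip/_vendor/x.py" is counted under "pip";
            # a file at the archive root is counted under its own name.
            top = info.filename.split("/", 1)[0]
            totals[top] += info.file_size
    return dict(totals)


def print_report(zip_path):
    """Print entries largest first, to show what dominates the package."""
    totals = size_by_top_level(zip_path)
    for name, size in sorted(totals.items(), key=lambda item: item[1], reverse=True):
        print("{:>10d}  {}".format(size, name))
```

Calling print_report("lambda_function.zip") after a --no-upload build prints one line per top-level package or file.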

martinb3 commented 7 years ago

@declension Could you also share your lambda.json? I'm surprised you need packages like pip and wheel, but perhaps there's a dependency chain somewhere that doesn't make sense?

declension commented 7 years ago

@martinb3 interesting, thanks. Maybe PyLMS itself is the problem then - I guess those entries are not normally there?

Maybe I could try a test with a "less unusual" package, but meanwhile here's the latest (anonymised) lambda.conf:

{
  "name": "lambda-tester",
  "description": "Test for lambda-uploader",
  "region": "eu-west-1",
  "handler": "handler.lambda_handler",
  "role": "arn:aws:iam::900000000000:role/service-role/lambda-uploader-test",
  "ignore": [
    ".git",
    ".idea/",
    "metadata/"
  ],
  "timeout": 7,
  "memory": 128
}

martinb3 commented 7 years ago

@declension given what you've told me so far, I tried to reproduce what you're seeing by creating a lambda.json file with your snippet above:

$ cat requirements.txt
pylms

$ cat lambda.json
{
  "name": "lambda-tester",
  "description": "Test for lambda-uploader",
  "region": "eu-west-1",
  "handler": "handler.lambda_handler",
  "role": "arn:aws:iam::900000000000:role/service-role/lambda-uploader-test",
  "ignore": [
    ".git",
    ".idea/",
    "metadata/"
  ],
  "timeout": 7,
  "memory": 128
}

$ lambda-uploader --no-upload -c lambda.json
λ Building Package
λ Fin

$ du -sh lambda_function.zip
3.3M    lambda_function.zip

As you can see, the resulting zip file is only about 3 MB. Do you have a fully fledged example we could clone from GitHub and try to reproduce? Otherwise I'm unable to reproduce what you're seeing.

declension commented 7 years ago

@martinb3 - great, that's actually exactly what I'm seeing too - a 3.3MB ZIP (~9MB uncompressed).

The trimmed one I was using came in at just a few KB (uncompressed: ~28k of pylms and a few more for the test source itself), as it didn't need any Python 2.7, pip, setuptools, etc. to work on Lambda.

martinb3 commented 7 years ago

@declension I didn't realize this at first, but virtualenv itself requires those packages apparently.

Sometimes people do depend on those, so I'm not sure there's an obvious fix to filter those out (without breaking others' use of this tool). --no-site-packages still keeps those few there.

@jarosser06 thoughts? Or should @declension just ignore them manually if he doesn't want them?

jarosser06 commented 7 years ago

We just ignore the package size, since a few megabytes isn't a big issue for us. It is a pretty trivial change to use the ignore on the site-packages copy as well; however, I want to consider the potential problems before making a PR.
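
To illustrate what applying the ignore list to the site-packages copy could look like, here is a minimal sketch. It is not lambda-uploader's actual implementation: copy_tree_filtered is a hypothetical helper, and it assumes the ignore entries are regular expressions matched against each path relative to the directory being copied. It needs Python 3 for os.makedirs(..., exist_ok=True).

```python
import os
import re
import shutil


def copy_tree_filtered(src, dst, ignore_patterns):
    """Copy src into dst, skipping paths whose relative name matches a pattern.

    Returns the sorted list of relative file paths that were copied.
    """
    compiled = [re.compile(p) for p in ignore_patterns]
    copied = []
    for root, dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        if rel_root == ".":
            rel_root = ""

        def ignored(name):
            # Match against a forward-slash path relative to src.
            rel = os.path.join(rel_root, name).replace(os.sep, "/")
            return any(rx.search(rel) for rx in compiled)

        # Prune ignored directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if not ignored(d)]
        for name in files:
            if ignored(name):
                continue
            target_dir = os.path.join(dst, rel_root)
            os.makedirs(target_dir, exist_ok=True)
            shutil.copy2(os.path.join(root, name), target_dir)
            copied.append(os.path.join(rel_root, name).replace(os.sep, "/"))
    return sorted(copied)
```

With a pattern such as r"^(pip|setuptools|wheel)(/|-|$)", the pip package directory and its dist-info directory would both be skipped, while an unrelated package whose name merely starts with "pip" would still be copied. The risk mentioned above remains: anything that imports those packages at runtime would break.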

If the short term goal is to just have pylms packaged and you have no other need for anything else, then I guess you can call lambda-uploader with --no-virtualenv and then add pylms using the extra files flag (-x). This should give you your basic lambda package with the pylms library in it.

For example:

lambda-uploader --no-virtualenv -x ~/.virtualenvs/<virtenv>/lib/python2.7/site-packages/pylms

declension commented 7 years ago

Thanks; my short-term goal was fixed a while ago (I patched my fork locally and am using that).

Especially given virtualenv's behaviour and pip's size, I assumed this would be useful for other users. Interestingly, I notice that the exact change I made (apart from the separate, earlier changes I made for #114) already sits on SimpleHQ's fork...

redblacktree commented 6 years ago

+1

AWS Lambda size limits can become a problem pretty quickly. For the vast majority of uses, pip, setuptools, and wheel are going to be cruft. My contention is that if your lambda function requires any of those (why?), you should explicitly include them in requirements.txt or lambda.conf.
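
One way to read that rule is sketched below: treat pip, setuptools, and wheel as droppable unless requirements.txt names them explicitly. This is only an illustration of the idea, not behaviour lambda-uploader has; requirement_names and packages_to_drop are hypothetical helpers, and the parsing handles only plain "name" or "name<specifier>" lines.

```python
import re

# Packages that virtualenv installs by default, per the discussion above.
DEFAULT_CRUFT = ("pip", "setuptools", "wheel")


def requirement_names(requirements_text):
    """Extract lower-cased project names from requirements.txt-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as "-r other.txt"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def packages_to_drop(requirements_text, cruft=DEFAULT_CRUFT):
    """Return the cruft packages that are NOT explicitly required."""
    required = requirement_names(requirements_text)
    return [name for name in cruft if name not in required]
```

The returned names could then be turned into ignore patterns for the packaging step, so a function that genuinely needs setuptools keeps it simply by listing it.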