aws / chalice

Python Serverless Microframework for AWS
Apache License 2.0

Deploying with layers and dependents over-provisions packages in .zip file #1208

Open markalexander opened 5 years ago

markalexander commented 5 years ago

I'm trying to deploy with the official AWS SciPy layer, but another of the packages in my requirements.txt has scipy as a dependency. This seems to cause SciPy to get packaged up in the final .zip file, negating the point of using the layer in the first place (and pushing me over the size limit).


Reproduction steps:

app.py:

from chalice import Chalice
import scipy

app = Chalice(app_name='layer-test')

@app.route('/')
def index():
    return {'version': scipy.__version__}

.chalice/config.json:

{
  "version": "2.0",
  "app_name": "layer-test",
  "stages": {
    "prod": {
      "api_gateway_stage": "api",
      "lambda_functions": {
        "api_handler": {
          "layers": [
            "arn:aws:lambda:eu-west-1:399891621064:layer:AWSLambda-Python36-SciPy1x:2"
          ]
        }
      }
    }
  }
}

(note you can also use arn:aws:lambda:us-east-1:668099181075:layer:AWSLambda-Python36-SciPy1x:2 for the same layer, if you prefer).

Leave requirements.txt blank and run:

chalice package --stage prod /tmp/packaged

If you check the .zip generated, you can see that scipy is not in there (as expected).

However, if we add e.g. scikit-criteria to requirements.txt and then run chalice package again, the .zip now contains the (quite large) SciPy package :(


I suppose I can work around this by making my own layer for scikit-criteria, but is there any way to fix it in Chalice? As far as I know, requirements.txt doesn't support any kind of 'no dependencies' flag, so maybe it's not so simple.

stealthycoin commented 5 years ago

Ah, interesting. I guess in a magical ideal world it would detect that it isn't needed and not include it. I'm not sure if that's possible though, since we won't actually be able to inspect what is in the layer until runtime, so finding what's in it and what should be omitted is probably not feasible.

Another option would be to maintain a set of known layers along with the deps each one provides, and omit those. That relies on us keeping the list up to date, though, and isn't really forwards or backwards compatible if anything changes.

The only reasonable solution I can think of would be to let the user control it with a "trust me, this is going to be in the runtime" list that you can provide either through configuration or a command line argument. I imagine it would need to have the same resolution as layers, so that you can specify it alongside layers, for example:

{
  "version": "2.0",
  "app_name": "layer-test",
  "stages": {
    "prod": {
      "api_gateway_stage": "api",
      "lambda_functions": {
        "api_handler": {
          "layers": [
            "arn:aws:lambda:eu-west-1:399891621064:layer:AWSLambda-Python36-SciPy1x:2"
          ],
          "ignore_deps": ["scipy"]
        }
      }
    }
  }
}
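
As a rough sketch of that resolution (not something Chalice implements today; `ignore_deps` and `resolve_ignore_deps` are hypothetical names), the lookup could check the function level first, then the stage, then the top level, assuming the most specific setting wins:

```python
def resolve_ignore_deps(config, stage, function_name):
    """Return the hypothetical ignore_deps list for one Lambda function.

    Assumption: the most specific level that defines the key wins
    (function, then stage, then top level).
    """
    stage_cfg = config.get("stages", {}).get(stage, {})
    function_cfg = stage_cfg.get("lambda_functions", {}).get(function_name, {})
    for level in (function_cfg, stage_cfg, config):
        if "ignore_deps" in level:
            return list(level["ignore_deps"])
    # Nothing configured anywhere: package every dependency as usual.
    return []
```

With the config above, `resolve_ignore_deps(config, "prod", "api_handler")` would give `["scipy"]`, and the packager could then skip those names when building the .zip.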

Any better ideas?

markalexander commented 5 years ago

I think it makes sense for it to be at the same level as layers as long as they're not coupled to them, because they're not intrinsically linked. E.g. the Lambda runtime comes with boto(core/3) by default, no layers involved. It would be nice to minimize deploys by ditching that too.

So what you suggested I think is fine, but not:

{
  ...
  "layers": [
    {
      "arn": "arn:aws:lambda:eu-west-1:399891621064:layer:AWSLambda-Python36-SciPy1x:2",
      "satisfies": [
        "scipy"
      ]
    }
  ]
  ...
}

(which would be a breaking change anyway, but you get the idea)

kapilt commented 5 years ago

If you want deps not to be packaged by Chalice, the easiest resolution would be to put them in a separate requirements file and then configure the additional layer in config.json.

markalexander commented 4 years ago

@kapilt I'm not sure this works. In this case, scipy is a dependency of scikit-criteria. So moving scipy to another requirements file (because I have a scipy layer) won't stop it from being packaged, because scikit-criteria will install it and cause it to get packaged up regardless.

Please let me know if I'm misunderstanding; my current workaround is to just remove the scipy folder from the resulting .zip via a command in my CI/CD pipeline.
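
For anyone wanting the same workaround, here is a minimal sketch of that post-processing step in Python. It assumes the dependencies sit at the top level of the generated .zip; `strip_packages` and the file names in the usage note are illustrative and not part of Chalice:

```python
import zipfile


def _is_ignored(top_level, names):
    """True if a top-level zip entry belongs to one of the named packages."""
    for name in names:
        if top_level == name:
            return True
        # Also drop metadata folders such as scipy-1.2.0.dist-info.
        if top_level.startswith(name + "-") and top_level.endswith(
            (".dist-info", ".egg-info")
        ):
            return True
    return False


def strip_packages(src_zip, dst_zip, package_names):
    """Copy src_zip to dst_zip, skipping entries of the named packages.

    Returns the list of archive paths that were left out.
    """
    names = [n.lower() for n in package_names]
    removed = []
    with zipfile.ZipFile(src_zip) as src, zipfile.ZipFile(
        dst_zip, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            top_level = info.filename.split("/", 1)[0].lower()
            if _is_ignored(top_level, names):
                removed.append(info.filename)
                continue
            # Reuse the original ZipInfo so names and metadata carry over.
            dst.writestr(info, src.read(info.filename))
    return removed
```

Something like `strip_packages("deployment.zip", "deployment-slim.zip", ["scipy"])` after `chalice package` would then produce the smaller archive, leaving the layer to supply scipy at runtime.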

y0m1g commented 4 years ago

Hello, any update on this issue? I find myself in a similar situation (having a dependency in my requirements.txt that needs NumPy+SciPy, which then get packaged with my app, defeating the purpose of having a layer with those 2 preinstalled). 😕