aws-amplify / amplify-cli

Pushing lambda with added lambda layer can incorrectly exceed lambda size quota #7559

Open chriskinzel opened 3 years ago

chriskinzel commented 3 years ago

How did you install the Amplify CLI?

npm

If applicable, what version of Node.js are you using?

v14.8.0

Amplify CLI Version

5.0.1

What operating system are you using?

Mac

Amplify Categories

function

Amplify Commands

push

Describe the bug

I'm adding a lambda layer to an existing lambda function. The lambda layer simply references a single large npm package shared among our lambdas, which was previously listed in the lambda's package.json. Together, the lambda layer and the lambda do not exceed the quotas described in https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html. When running amplify push, the following error occurs:

Resource Name: EssDynamoStreamFn (AWS::Lambda::Function)
Event Type: update
Reason: Resource handler returned message: "Function code combined with layers exceeds the maximum allowed size of 262144000 bytes. The actual size is 347844280 bytes. (Service: Lambda, Status Code: 400, Request ID: d84d70a1-4881-4c52-9e0d-39b95da556c3, Extended Request ID: null)" (RequestToken: b8cd564e-945e-eb9a-feb0-8507814343a7, HandlerErrorCode: InvalidRequest)

I believe this occurs because the combined size of the lambda layer + size of the old lambda (with package installed, before it was removed) exceeds quota limits. This is a bug because the lambda no longer references the package that was moved to the lambda layer, and node_modules was cleared.

As a workaround, I removed the lambda layer from the lambda and ran amplify push to push a version of the lambda without the installed package. Then I added back the lambda layer and ran amplify push again, and the push succeeds.
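
The hypothesis and the workaround above can be sketched as simple arithmetic. This is a minimal Python sketch, not confirmed service behavior: it assumes the size check during the update sees the previously deployed function code together with the new layer. The limit and the reported size are copied from the error message; the component sizes and all names (`SHARED_PACKAGE`, `OWN_CODE`, `within_quota`) are hypothetical and chosen only for illustration.

```python
# Values copied from the error message in this issue.
LIMIT_BYTES = 262_144_000      # 250 MiB
REPORTED_BYTES = 347_844_280


def combined_size(function_bytes: int, layer_bytes: int) -> int:
    """Unpacked function code plus the attached layer."""
    return function_bytes + layer_bytes


def within_quota(function_bytes: int, layer_bytes: int,
                 limit: int = LIMIT_BYTES) -> bool:
    """True if function code combined with the layer fits the limit."""
    return combined_size(function_bytes, layer_bytes) <= limit


# Hypothetical sizes, NOT measured values from the reporter's project.
SHARED_PACKAGE = 150_000_000   # large npm package moved into the layer
OWN_CODE = 20_000_000          # everything else in the function

old_function = OWN_CODE + SHARED_PACKAGE  # deployed before the change
new_function = OWN_CODE                   # package removed from package.json
layer = SHARED_PACKAGE

# Intended end state: slim function + layer fits.
print(within_quota(new_function, layer))   # True

# Assumed transient state: old function code + new layer does not fit.
print(within_quota(old_function, layer))   # False

# Workaround, push 1: slim function without the layer fits.
print(within_quota(new_function, 0))       # True
# Workaround, push 2: attach the layer to the already slim function.
print(within_quota(new_function, layer))   # True

# How far over the limit the reported size was.
print(REPORTED_BYTES - LIMIT_BYTES)        # 85700280
```

Under that assumption, every state in the two-push workaround stays below the limit, while the single push passes through a state that does not.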

Expected behavior

amplify push should update the lambda function without error as long as the new lambda plus the lambda layer does not exceed the size quota.

Reproduction steps

  1. Create a lambda function with large npm dependencies whose unpacked size is below the quota limit but at least half of it
  2. Push lambda function
  3. Create lambda layer and move all dependencies from lambda function to lambda layer (dependencies in the lambda function package.json should be removed and node_modules deleted)
  4. Run amplify push
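
To confirm that a reproduction really stays under the quota before and after step 3, the unpacked sizes can be measured locally. The following is a rough Python sketch under stated assumptions: on-disk size is used as an approximation of the unpacked deployment size (packaging may exclude files), the helper names `dir_size` and `check_quota` are hypothetical, and the paths in the usage comment are placeholders rather than real Amplify project paths.

```python
import os

# Limit copied from the error message in this issue (250 MiB).
LIMIT_BYTES = 262_144_000


def dir_size(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def check_quota(function_dir: str, layer_dirs: list[str],
                limit: int = LIMIT_BYTES) -> tuple[int, bool]:
    """Return (combined size in bytes, whether it fits within limit)."""
    total = dir_size(function_dir) + sum(dir_size(d) for d in layer_dirs)
    return total, total <= limit


# Example usage (placeholder paths, adjust to your project layout):
#   total, ok = check_quota("path/to/function/src", ["path/to/layer"])
#   print(f"{total} bytes, within quota: {ok}")
```

Running this once before step 3 (function only) and once after (function plus layer) shows whether both states are individually within the limit, which is the precondition for the bug described here.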

hisham commented 3 years ago

+1. This is very annoying: currently the only fix is to push the lambda without the package and then do a second push with the lambda attached to the layer, which causes downtime in prod (the alternative is to find a way to trim packages without breaking stuff). We also have to repeat this multi-phase deployment for each env in a multi-env setup, which is really annoying.

attilah commented 3 years ago

@chriskinzel @hisham Unfortunately that error is coming from the service, not generated by the CLI.

In V5 of the CLI, layers went through a big overhaul. My suspicion is that the layer was associated with the function using a previous V4 CLI? In that case, when you push an update to the layer and create a new layer version, you have to run update function to point the function to the new version. That should succeed because you made the package.json modification, and this must have been 2 separate push operations.

In V5 we added an option to always use the latest version of a layer, which would enable such cases to be solved in 1 push.

chriskinzel commented 3 years ago

Hi @attilah, no, the layer was only added once we upgraded to V5. We are able to reproduce by deleting the layer, adding the package back to the lambda package.json, pushing, and then creating a new layer and making the package.json changes as described in the original comment (move the dependency from the lambda package.json to the layer package.json), all using CLI V5. We also get this issue when switching environments and attempting a push to update the environment (from a "fixed" environment).

Unless you are saying that lambda functions created with < CLI V5 would have this issue?

attilah commented 3 years ago

@chriskinzel Is it a public npm package that we could use for repro?

hisham commented 3 years ago

(@chriskinzel and I are on same team)

No, it's not a public npm package, but if you want access to it I can send an auth token to the amplify cli team email.

You can also probably make your own version of it. What makes the package a little big is its dependencies, not our own code. I just sent the package's dependencies from package.json to amplify-cli@amazon.com. But if you need access to the package itself, I could possibly share an auth token.

josefaidt commented 3 years ago

Hey @hisham and @chriskinzel are y'all still experiencing this issue? It appears y'all have identified a workaround despite it being a tedious one. Given the constraints from the service, I'll mark this as a feature request for the team to review further.

hisham commented 3 years ago

Yes, it's still an issue. The workaround causes system downtime (we have to remove the big package from the lambda, so critical code can't run), so we haven't applied it to critical Lambdas where we don't want any downtime.

mewtlu commented 1 year ago

Sorry to necro an old issue, but is there any chance of a resolution or a more reliable workaround? Currently it seems the only way to resolve it requires manual intervention in each environment as well as downtime of anything that depends on the layer, which doesn't feel like a viable solution.

jeffski commented 4 months ago

Have run into this a couple of times now and wanted to share the workaround that worked for us.

We are deploying using the Serverless Framework and essentially what we do is rename the Lambda in the config file. This creates a brand new Lambda, instead of trying to modify the existing Lambda. We are running the Lambda in a Step Function so we update that to use the new Lambda name.

This all seems to work, although with minor disruption while the changeover happens, I think due to the way things align in Step Functions. It is preferable to removing and re-adding layers or doing a remove/redeploy, as our deployments take several minutes and would result in considerable downtime.

Anyway, this might be an option for anyone in this situation and might work with other triggers.