aws-amplify / amplify-cli

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development.
Apache License 2.0

Python lambda is 2.6mb when there is 0 dependencies #5566

Open michaelbrewer opened 4 years ago

michaelbrewer commented 4 years ago

Describe the bug When creating a function with a very simple Python lambda, the generated lambda is over 2 MB.

Amplify CLI Version

4.30.0

To Reproduce

Take a trivial lambda like this and build it.

def handler(event, context):
    print("event:", event)

Expected behavior The deployed lambda should be only a few bytes, not 2.6 MB.
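
As a rough check of that expectation, here is a minimal sketch that zips the handler above in memory and reports the archive size. The file name `index.py` and the helper `zipped_size` are assumptions for illustration, not something the Amplify CLI provides. A zip always carries some header overhead, so the result is more than a few bytes, but it stays well under a kilobyte.

```python
import io
import zipfile

# The handler from the report, stored under an assumed file name.
HANDLER_SOURCE = '''\
def handler(event, context):
    print("event:", event)
'''


def zipped_size(files):
    """Return the size in bytes of a deflate-compressed zip holding `files`.

    `files` maps archive names to their text content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    # The archive is complete once the context manager has closed it.
    return len(buffer.getvalue())


if __name__ == "__main__":
    print(zipped_size({"index.py": HANDLER_SOURCE}), "bytes")
```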

yuth commented 4 years ago

Amplify CLI uses pipenv to bundle function resources for deployment to the cloud. Pipenv brings in a virtual env, which adds size to the bundle.
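
To see where the size goes, one option is to break a built bundle down by top-level entry. The sketch below assumes the bundle is an ordinary zip file; the helper name `size_by_top_level` and the path `bundle.zip` are hypothetical, and the location of the archive Amplify actually produces is not stated in this thread.

```python
import zipfile
from collections import Counter


def size_by_top_level(zip_path):
    """Sum compressed sizes per top-level entry of a zip archive.

    Returns a list of (name, bytes) pairs, largest first.
    """
    totals = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # "a/b/c.py" is counted under "a"; "index.py" under itself.
            top = info.filename.split("/", 1)[0]
            totals[top] += info.compress_size
    return totals.most_common()


if __name__ == "__main__":
    # Hypothetical path: point this at the zip you want to inspect.
    for name, size in size_by_top_level("bundle.zip"):
        print(f"{size:>10}  {name}")
```

If the virtual env is what inflates the bundle, its directory should show up at the top of this listing.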

michaelbrewer commented 4 years ago

@yuth
What is the benefit of this? Why not just use AWS SAM to handle the bundling of the lambda? Even CDK does not inflate the lambda by 2.6 MB with zero benefit.

And FYI, Amplify does not even support Python 3.8.

edwardfoyle commented 4 years ago

@michaelbrewer You're right that pipenv is overkill for small functions that only have a couple of dependencies, but it is good for managing functions with lots of dependencies. It is also a well-adopted package management tool in the Python ecosystem: https://packaging.python.org/tutorials/managing-dependencies/. We don't use AWS SAM because many of our customers don't use SAM or don't have it installed.

We do support Python 3.8, so if you are having trouble there please open an issue with details of the problem.

michaelbrewer commented 4 years ago

@edwardfoyle -

Amplify Console does not support Python 3.8: https://github.com/aws-amplify/amplify-console/issues/595

Most lambdas don't need many dependencies (and some have none at all outside of the AWS SDK).

So in most cases, just having a requirements.txt should be all that you need.

Larger projects can use poetry.

michaelbrewer commented 4 years ago

@edwardfoyle
I don't quite understand why the deployed lambda needs to be 2.6 MB. Aren't these build-time dependencies rather than runtime dependencies?

Here is an example Python lambda function for a Cognito auth challenge:

def handler(event: dict, _):
    print("event", event)

    # Only issue the challenge on the first round (no prior session entries).
    if len(event["request"]["session"]) == 0:
        # Cognito reads the challenge definition from the response object.
        event["response"]["challengeMetadata"] = "COOKIE_CHALLENGE"
        event["response"]["publicChallengeParameters"] = {"cookieName": "source"}
        event["response"]["privateChallengeParameters"] = {}

    return event

I would expect the deployed lambda to be tiny.

michaelbrewer commented 4 years ago

Note how AWS CDK does this without bloating the lambda with unused libraries: https://github.com/aws/aws-cdk/tree/master/packages/%40aws-cdk/aws-lambda-python

As a developer you just need to have Docker installed.

michaelbrewer commented 3 years ago

@edwardfoyle @yuth - will anyone look into fixing this?

I can see that AWS does care about the cost of cold starts (https://aws.amazon.com/blogs/developer/modular-aws-sdk-for-javascript-release-candidate/), and this seems to be inconsistent.

kaustavghosh06 commented 3 years ago

@michaelbrewer We'll look into this. Sorry for the late response.

michaelbrewer commented 3 years ago

Maybe @heitorlessa has some input on how to support Python lambdas in a clean way. It would be nice to support SAM or do what CDK does when building Lambda functions (while keeping the size down)

michaelbrewer commented 3 years ago

Hopefully the tooling improves soon, as we are planning to build all of our lambdas in Python.

heitorlessa commented 3 years ago

@kaustavghosh06 while I don't have the bandwidth to contribute code I'm happy to review or chat

You can keep pipenv but only extract the dependencies themselves instead of bringing the whole virtual env - it's not necessary.

pipenv here will also break when customers bring dependencies that rely on C extensions, since those need to be built within a Linux env. Hence the Docker suggestion by Michael (a flag would do).
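
A minimal sketch of the "only extract the dependencies" idea, assuming you already know the function's source directory and the virtual env's site-packages directory. The names `bundle`, `BUILD_ONLY`, and the paths in the usage note are hypothetical, and the list of packaging tools to skip is an assumption, not what the Amplify CLI does. It does not address the C-extension problem, which still needs a Linux build environment.

```python
import os
import zipfile

# Packaging tools commonly present in a virtual env but not needed at
# runtime (assumed list; adjust to your environment).
BUILD_ONLY = {"pip", "setuptools", "wheel", "pkg_resources", "_distutils_hack"}


def _is_build_only(name):
    # Matches "pip" as well as metadata dirs such as "pip-20.0.dist-info",
    # but not unrelated packages such as "pipenv".
    return name.split("-", 1)[0] in BUILD_ONLY


def _add_tree(archive, root, skip_top_level=lambda name: False):
    """Add every file under `root` to `archive`, relative to `root`."""
    root = os.fspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        at_top = dirpath == root
        # Prune in place so os.walk does not descend into skipped dirs.
        dirnames[:] = [
            d for d in dirnames
            if d != "__pycache__" and not (at_top and skip_top_level(d))
        ]
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            archive.write(full_path, os.path.relpath(full_path, root))


def bundle(src_dir, site_packages_dir, out_zip):
    """Zip the function source plus installed dependencies, nothing else.

    `out_zip` should live outside both input directories.
    """
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        _add_tree(archive, src_dir)
        _add_tree(archive, site_packages_dir, skip_top_level=_is_build_only)
    return out_zip
```

Usage would look like `bundle("src", ".venv/lib/python3.8/site-packages", "dist/function.zip")`, with all three paths standing in for your own layout.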

michaelbrewer commented 3 years ago

Thanks @heitorlessa for offering an ear for this.

I do like how SAM can bootstrap a lambda with the build tools and sample code:

sam init --location https://github.com/aws-samples/cookiecutter-aws-sam-python

Maybe this can be an option, @kaustavghosh06. When can we expect to at least not bring the whole virtual env with the lambda?

michaelbrewer commented 3 years ago

@kaustavghosh06 what is the timeline on this? Apart from that, amplify function build does not seem to work; it just hangs without completing.

michaelbrewer commented 3 years ago

RE: amplify function build failing

Oh, I see that this is a separate issue (which is NOT being fixed):

michaelbrewer commented 3 years ago

@eddiekeller @kaustavghosh06 - is anyone looking into this? How easy would it be to fix this ourselves in the CLI?

michaelbrewer commented 3 years ago

How about doing something along the lines of:

# Create a requirements.txt in src/ (which can be gitignored)
pipenv lock -r > src/requirements.txt

Then build/download the dependencies using the Docker image lambci/lambda:build-python3.8

# Download the vendored deps
pip install -r requirements.txt -t /vendored && cp -au . /vendored
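
The same two steps can be sketched in Python, assuming `requirements.txt` has already been generated in the source directory. The function names `pip_target_command` and `vendor` are hypothetical. This version runs pip from the current interpreter rather than inside the Docker image, so it is only suitable for pure-Python dependencies; packages with C extensions would still need to be built in a Linux environment such as that image.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def pip_target_command(requirements, target):
    """Build the pip invocation that installs dependencies into `target`."""
    return [
        sys.executable, "-m", "pip", "install",
        "-r", str(requirements),
        "--target", str(target),
    ]


def vendor(src_dir, target_dir, run=subprocess.check_call):
    """Install dependencies into `target_dir`, then copy the source next to them.

    `target_dir` must be outside `src_dir`. `run` is injectable so the
    command can be inspected without actually invoking pip.
    """
    src, target = Path(src_dir), Path(target_dir)
    requirements = src / "requirements.txt"
    if requirements.exists():
        run(pip_target_command(requirements, target))
    # Equivalent in spirit to `cp -au . /vendored` (requires Python 3.8+).
    shutil.copytree(src, target, dirs_exist_ok=True)
    return target
```

The resulting directory can then be zipped as the deployment package, without any virtual env in it.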

samjett247 commented 3 years ago

Found a hack to make amplify stop building all these nondependent-dependencies into my Python functions. This only works for functions whose dependencies are all already provided by lambda layers, because it completely skips the function build process. There is an amplify.state file inside each function. If you don't want amplify to build your function, modify the amplify.state file to make amplify "think" it's a nodejs function.

amplify.state (original, will build virtualenv into the deployed package)

{
  "pluginId": "amplify-python-function-runtime-provider",
  "functionRuntime": "python",
  "useLegacyBuild": false,
  "defaultEditorFile": "src/index.py"
}

amplify.state (new - will NOT build any python packages but otherwise the Lambda function will work the same in AWS)

{
  "pluginId": "amplify-nodejs-function-runtime-provider",
  "functionRuntime": "nodejs",
  "useLegacyBuild": false,
  "defaultEditorFile": "src/index.py"
}
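
The edit above can also be scripted. This is a minimal sketch of the workaround as described in this comment, not a supported Amplify feature; the function name `disable_python_build` is hypothetical, and it keeps a `.bak` copy so the change can be reverted.

```python
import json
from pathlib import Path

# Values taken from the modified amplify.state shown above.
NODE_PROVIDER = {
    "pluginId": "amplify-nodejs-function-runtime-provider",
    "functionRuntime": "nodejs",
}


def disable_python_build(state_path):
    """Rewrite an amplify.state file so the Python build step is skipped.

    Writes the original content to "<name>.bak" first and returns the
    new state as a dict. All other keys are left untouched.
    """
    path = Path(state_path)
    state = json.loads(path.read_text())
    backup = path.with_name(path.name + ".bak")
    backup.write_text(json.dumps(state, indent=2) + "\n")
    state.update(NODE_PROVIDER)
    path.write_text(json.dumps(state, indent=2) + "\n")
    return state
```

Pass it the path of the amplify.state file inside the function you want to exclude from the Python build.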

tmirun commented 2 years ago

Same Issue Here

joekiller commented 2 years ago

The following comment describes an alternative way to run pipenv that installs dependencies without copying the virtualenv. Ideally the Amplify project would use this method instead:

https://github.com/pypa/pipenv/issues/746#issuecomment-416475131

speedhawk21 commented 1 year ago

Amplify is still including the pipenv venv bundle in function deployments? Is this a joke?