aws / aws-sam-cli

CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM
https://aws.amazon.com/serverless/sam/
Apache License 2.0

Bug: Sam Build very slow and ignoring cache, recopying unchanged files and dependencies #4828

Open ahurlburt opened 1 year ago

ahurlburt commented 1 year ago

Description:

For context, the project has approximately 40 Lambda functions, all on Python 3.8, with a dependency layer for requirements.

Running sam build has become incredibly slow (10-15 min on my MacBook Pro) as the project has grown, and it is really slowing down the development process. Running sam build --parallel --debug --cached shows that most of the time is spent in the "Copying source file" steps.

My Lambda code is only about 1.5MB, while the dependency layer is close to 256MB. It appears that the Copy Source step recopies .aws-sam/deps/{path to dependency cache file} to .aws-sam/build/DependencyLayer/{path} even when the dependency layer is unchanged.

This happens even if I only change the template and make no code changes; I still have to wait for the entire 10+ min build.
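To put a number on how much data the copy step has to move, the directories mentioned above can be totalled with a few lines of Python. This is only a rough sketch for diagnosis: the helper name dir_stats is made up for illustration, and the two paths are simply the ones named in this issue.

```python
import os
from typing import Tuple


def dir_stats(root: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are not counted as copied data
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes


if __name__ == "__main__":
    # Paths taken from the issue text; adjust for your own project.
    for label in (".aws-sam/deps", ".aws-sam/build"):
        if os.path.isdir(label):
            count, size = dir_stats(label)
            print(f"{label}: {count} files, {size / 1_000_000:.1f} MB")
```

A large file count matters as much as the byte total here, since copying many small files tends to be dominated by per-file overhead.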

Steps to reproduce:

Here is the relevant bit of my template for the dependency layer

  PipDependencyLayer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      ContentUri: dependencies
      CompatibleRuntimes:
        - python3.8
    Metadata:
      BuildMethod: python3.8

I have a dependencies folder in my working directory root (same level as my project root)

dependencies
  - python
  - requirements.txt
project_root
  - lambda_handler_1.py
  - ... other project files

Requirements.txt content is below

-i https://pypi.org/simple
arabic-reshaper==2.1.4
asn1crypto==1.5.1
attrs==22.1.0; python_version >= '3.5'
aws-encryption-sdk==3.1.1
backports.zoneinfo==0.2.1; python_version < '3.9'
boto3==1.26.18; python_version >= '3.7'
botocore==1.29.18; python_version >= '3.7'
certifi==2022.9.24; python_version >= '3.6'
cffi==1.15.1
charset-normalizer==2.1.1; python_version >= '3.6'
click==8.1.3; python_version >= '3.7'
cryptography==38.0.4; python_version >= '3.6'
cssselect2==0.7.0; python_version >= '3.7'
future==0.18.2; python_version >= '2.6' and python_version not in '3.0, 3.1, 3.2, 3.3'
html5lib==1.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4'
idna==3.4; python_version >= '3.5'
jmespath==1.0.1; python_version >= '3.7'
lxml==4.9.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4'
numpy==1.23.5
oscrypto==1.3.0
pandas==1.5.2
pillow==9.3.0
pycparser==2.21
pyhanko==0.15.1
pyhanko-certvalidator==0.19.6
pyjwt==2.6.0
pypdf3==1.0.6
python-bidi==0.4.2
python-dateutil==2.8.2; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
python-http-client==3.3.7; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
python-liquid==1.4.7
pytz==2022.6
pytz-deprecation-shim==0.1.0.post0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4, 3.5'
pyyaml==6.0; python_version >= '3.6'
qrcode==7.3.1; python_version >= '3.6'
ratelimit==2.2.1
reportlab==3.6.12; python_version >= '3.7' and python_version < '4'
requests==2.28.1
s3transfer==0.6.0; python_version >= '3.7'
sendgrid==6.9.7
simplejson==3.18.0
six==1.16.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
starkbank-ecdsa==2.2.0
stripe==5.0.0
svglib==1.4.1; python_version >= '3.7'
tinycss2==1.2.1; python_version >= '3.7'
tqdm==4.64.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
typing-extensions==4.4.0; python_version >= '3.7'
tzdata==2022.6; python_version >= '3.6'
tzlocal==4.2; python_version >= '3.6'
uritools==4.0.0; python_version ~= '3.7'
urllib3==1.26.13; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4, 3.5'
webencodings==0.5.1
wrapt==1.14.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4'
xhtml2pdf==0.2.8

Observed result:

As described above, the build is very slow.

Expected result:

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

Output of sam --info:

{
  "version": "1.76.0",
  "system": {
    "python": "3.8.16",
    "os": "macOS-12.5.1-x86_64-i386-64bit"
  },
  "additional_dependencies": {
    "docker_engine": "Not available",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  }
}

mndeveci commented 1 year ago

Thanks for raising this issue!

Today sam build deletes the .aws-sam/build folder before starting a build, so that it always starts from a clean state. For that reason, it copies the previous build output from either the cache or the deps folder to recreate the built artifacts from scratch, and that is causing the delay in your use case.

I think we can make some improvements here (like not removing the folders of functions/layers that haven't changed since the last build when you run with the --cached flag).

I will take this issue and discuss with the team.
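For readers following along, the kind of improvement discussed above (leaving unchanged files in place instead of recopying everything) could look roughly like the sketch below. It is not SAM CLI code; sync_tree and _is_up_to_date are hypothetical names, and comparing size plus modification time is an assumed change check, not something the SAM team has confirmed.

```python
import os
import shutil
from typing import List


def _is_up_to_date(src_file: str, dst_file: str) -> bool:
    """Treat dst as current when it exists with the same size and mtime (whole seconds)."""
    if not os.path.isfile(dst_file):
        return False
    src_stat, dst_stat = os.stat(src_file), os.stat(dst_file)
    return (
        src_stat.st_size == dst_stat.st_size
        and int(src_stat.st_mtime) == int(dst_stat.st_mtime)
    )


def sync_tree(src: str, dst: str) -> List[str]:
    """Copy only missing or changed files from src to dst.

    Returns the relative paths that were actually copied.
    Files that exist only in dst are left alone in this sketch.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src):
        rel_dir = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src_file = os.path.join(dirpath, name)
            dst_file = os.path.join(target_dir, name)
            if _is_up_to_date(src_file, dst_file):
                continue
            shutil.copy2(src_file, dst_file)  # copy2 also preserves the mtime
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

With a check like this, a second run over an unchanged 256MB layer would only stat the files rather than rewrite them. A content hash would be a stricter test than size and mtime, at the cost of reading every file.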

ashismo commented 1 year ago

Using SAM to deploy on Docker is painfully slow. It is wasting my days. Why does every SAM build copy all files (from cache) instead of only the changed files? On Windows, mounting succeeds 95% of the time after 5-7 minutes of waiting, but on Mac mounting fails 95% of the time and succeeds only 5%. I wanted to test communication between two Lambdas while running on Docker, but I spend (waste) most of my time figuring out what's going wrong in the build and sam local start-api phases. Can you please look into the issue?

ahurlburt commented 1 year ago

Upvote the original issue if you want it to get more traction. I agree that the copy source is really slowing down development

ahurlburt commented 1 year ago

Thank you! Any idea the likelihood of this getting released in near future? This issue is really slowing down dev on a large project.

ahurlburt commented 1 year ago

Hi @mndeveci, any updates on this? This is really causing a lot of problems, and we may have to abandon sam build because of it.

@praneetap - is this something your team is aware of? Any way to get this on the backlog?

sam build takes 5-10 min for me on every change.

ashismo commented 1 year ago

Same here. I reported this as well, and no action has been taken so far. This should be given priority because it wastes many hours of developers' effort, energy and enthusiasm. Incremental builds are a basic feature of any build tool, e.g. Maven or Gradle, and SAM build should follow them. Builds should stay fast until we add new dependencies.

garrettks commented 1 year ago

I agree this is a very annoying issue. What is the purpose of a cache if it doesn't actually cache!?

AffiTheCreator commented 1 year ago

In my case it can take up to 50 min sometimes, and it has become impossible to do any productive work. I have only 6 lambdas and 3 dependencies in my package.json, and 1 of them is aws-sdk.

The worst part is I'm running a sam app from aws aws-marketplace-serverless-saas-integration with some minor modifications.

Currently, I'm not using a layer for dependencies since my zip file has a total of 10MB (I will implement the layer eventually), but can we all agree that 50 min to build 6 lambdas is way too much? It is clear evidence that there is a problem. I'm considering other solutions to manage my app.

I have tried --cached, I tried --parallel, I tried both. So far I have seen no improvement; if anything, it took longer with the --parallel option.

Please advise @mndeveci

UPDATE

With further development inside the lambdas things started to get worse, 2h+ for a simple build. I had enough and with some help I managed to isolate the issue. So here is what I found, it might help someone or further help isolate this problem.

Environment

This is where the issue starts, at least for me. I was running SAM commands from a WSL2 environment and, adding to the problem, I was working under the /mnt/c/.... drive inside WSL2. When you launch WSL, you can access the files on your C: drive from inside it by going to this directory.

All my codebase is on my C: drive, and so far this method has worked for me. I do have an issue with Docker volumes (a topic for another place), but I have build scripts that run inside WSL2 while still interacting with the /mnt/c/..... directory.

Anyway, I asked a friend to pull the git project I was running and try a build on his side. To my surprise, it took him 5s (FIVE freaking SECONDS) to build the damn project.

He has a MacBook with the same SAM version and AWS CLI; the only difference is the OS and how he accesses the files: natively.

Solution – Run natively

I decided to install SAM, the AWS CLI and Node (I'm using Node.js for the Lambda runtime) natively and stop using the WSL workflow. Guess what: it now takes me 1 min to build the project. Still not the 5s of the MacBook, but I also have more dependencies, and I can live with these values for now.

WHY ?

I think it's related to how SAM uses the mounted volume of the C: drive, which is probably not optimized for WSL. But I don't know why, where, or how.

Summary

If you're having this issue in WSL, you most likely have a similar setup. Ditch WSL when building your SAM app.
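If you want a build script to warn about the setup described above, a small check along these lines might help. It is a sketch under two assumptions that are not from this thread: that the WSL kernel version string in /proc/version contains "microsoft", and that Windows drives are mounted at the default /mnt/<letter> location (this can be changed in WSL configuration). The function names are made up for illustration.

```python
import os
import re

# Default WSL mount points look like /mnt/c, /mnt/d, ... (assumed default layout).
_WINDOWS_MOUNT = re.compile(r"^/mnt/[a-z](/|$)")


def is_wsl(version_text: str) -> bool:
    """Guess whether a kernel version string comes from WSL."""
    return "microsoft" in version_text.lower()


def on_windows_mount(path: str) -> bool:
    """Return True when path sits under a default /mnt/<drive> mount."""
    return bool(_WINDOWS_MOUNT.match(path))


def read_kernel_version(proc_version: str = "/proc/version") -> str:
    """Read the kernel version string, or return '' when it is unavailable."""
    try:
        with open(proc_version, encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    cwd = os.path.abspath(os.getcwd())
    if is_wsl(read_kernel_version()) and on_windows_mount(cwd):
        print(f"Warning: {cwd} is on a Windows drive mounted into WSL; "
              "builds from here may be much slower than from a native path.")
```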

mrpetem commented 1 year ago

I have to also say this is a horrific experience. I have spent the past several weeks trying to get a fully functional deployment and because it has over 100 endpoints with 30+ parameters, it is making me regret listening to the advice of AWS to adopt their serverless deployment approach.

I did deploy smaller projects fine before, but you only hit this problem with a larger project after coming out of local development to do a staging/production deployment.

Also not great that local and build deploy use different parameter configuration files. It's just a mess.

SKLC1 commented 11 months ago

Same here, very frustrating.

mndeveci commented 11 months ago

With v1.104.0 we've released the build-in-source feature, which eliminates some of the copy operations (it builds inside the source folder and symlinks to the target folder at the end). As of now it only supports nodejs runtimes and functions that use Makefile builds.

You can enable build in source by providing the --build-in-source option; see the documentation for more details: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-build.html

Please let us know if it improves the performance of your builds, and if you find any issues please create a new GH issue in this repository.

Thanks!
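To illustrate the general idea mentioned above (exposing the built output through a symlink instead of copying it), here is a minimal sketch. It is not the SAM CLI implementation; link_artifact is a hypothetical helper, and it assumes a POSIX system where creating symlinks needs no special privileges.

```python
import os
import shutil


def link_artifact(source_dir: str, target_path: str) -> None:
    """Expose source_dir at target_path via a symlink, replacing whatever is there."""
    if os.path.islink(target_path) or os.path.isfile(target_path):
        os.remove(target_path)  # drop a stale link or file
    elif os.path.isdir(target_path):
        shutil.rmtree(target_path)  # drop a previously copied directory
    os.makedirs(os.path.dirname(os.path.abspath(target_path)), exist_ok=True)
    os.symlink(os.path.abspath(source_dir), target_path, target_is_directory=True)
```

Creating the link takes roughly the same time regardless of how large the directory is, which is why this approach avoids the copy cost discussed in this issue.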

ahurlburt commented 11 months ago

@mndeveci why should a new GH issue be created? Even if build-in-source improves performance for node not everyone uses node and sam is supposed to support multiple platforms. This issue has a lot of history, upvotes and complaints that should be used to prioritize a fix for all platforms.

mndeveci commented 11 months ago

@ahurlburt I meant creating new issue if anyone faces any problem with build in source feature.

ahurlburt commented 11 months ago

Ah got it, apologies i thought you were planning on closing this issue. Thanks for clarifying.

AlfredFu commented 8 months ago

Encountered the same issue a few weeks ago. The workaround I used was to remove requirements.txt, package an AWS Lambda layer, and define the function to rely on the packaged layer.

siosphere commented 5 months ago

I'm running into this when I need to use an Image package with a custom Dockerfile build; it is extremely slow to build locally on a MacBook.

I can manually build the Dockerfile and it takes ~30s. When it is built as part of sam, it sits there for ~5-10min. It's incredibly annoying.

gspeare commented 1 month ago

Using python, so build-in-source is not applicable. 40-50 minute builds, all CopySource...