aws / aws-sam-cli

CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM
https://aws.amazon.com/serverless/sam/
Apache License 2.0

`sam deploy` for deduped build checks presence of identical build artifact many times #4053

Open yskkin opened 2 years ago

yskkin commented 2 years ago

Describe your idea/feature/enhancement

I ran `sam deploy` for 17 Lambdas that share the same CodeUri and Runtime (so deduped build is enabled) and got the following output:

Uploading to my-stack-name/eb7b8e83[...snip...]  17287115 / 17287115  (100.00%)
File with same data already exists at my-stack-name/eb7b8e83[...snip...], skipping upload
[... another identical 15 lines ...]

This output implies that `sam deploy` unnecessarily uploads the identical artifact.

Proposal

Skip the identical uploads. Each skipped upload still takes a few seconds, so skipping them entirely would improve the performance of `sam deploy`.

Additional Details

jfuss commented 2 years ago

@yskkin Not sure I follow what you are proposing here. The output states "skipping upload", so I'm not sure what is indicating unnecessary uploads.

yskkin commented 2 years ago

https://github.com/aws/aws-sam-cli/blob/bfbcc166709dba330310088e5a5e520da1ef2444/samcli/lib/package/s3_uploader.py#L87-L89 This `file_exists` check takes a few seconds for each Lambda and slows down deployment once more than roughly 10 deduped Lambdas are involved. Since we know the build artifact is identical, I thought even the `file_exists` check could be skipped.
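One way to skip the repeated check, sketched below as a hypothetical wrapper (the class and names are illustrative, not actual SAM CLI code): remember which object keys have already been confirmed during this deploy, so repeated deduped artifacts avoid the per-function S3 round trip.

```python
class CachedExistenceChecker:
    """Hypothetical sketch: memoize positive existence checks so that
    N functions sharing one artifact trigger at most one real S3 call."""

    def __init__(self, head_object):
        # head_object: callable(key) -> bool that performs the real S3 check
        self._head_object = head_object
        self._known_keys = set()

    def file_exists(self, key):
        if key in self._known_keys:
            return True  # already confirmed this deploy; no network call
        if self._head_object(key):
            self._known_keys.add(key)
            return True
        return False
```

The cache lives only for the duration of one deploy, which sidesteps the reproducibility concern raised later in the thread: nothing is persisted between runs.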

zfael commented 2 years ago

Currently facing the same issue, SAM stack is taking sooooo long to get deployed, and I'm seeing a bunch of these messages being triggered.

Maybe these checks could be done in parallel or something similar?

jfuss commented 2 years ago

So currently we check against the source (s3) to ensure the artifacts are indeed there. So as it is today, we only know the file already exists by calling s3.head_object (which shouldn't take much time at all) (https://github.com/aws/aws-sam-cli/blob/bfbcc166709dba330310088e5a5e520da1ef2444/samcli/lib/package/s3_uploader.py#L196).

We could probably optimize this further somehow by caching some additional info.

Maybe these checks could be done in parallel or something similar?

Parallel probably isn't the best idea, as you can run into race conditions in uploading and checking if the file exists.

Will update the labels. We can look at a PR if anyone is interested here, but this is not something currently on our roadmap.

If you are interested in submitting a fix, we will want to make sure that however this is cached, the result is reproducible and we avoid skipping an upload that should have happened.

konarskis commented 2 years ago

Thanks for looking into this guys. This is a real blocker for us, as it's taking nearly 50 minutes to deploy the stack at times... Any way this can be prioritized?

We'll have to switch over to CDK otherwise I'm afraid, as the release times are just too long.

Thanks again

jfuss commented 2 years ago

@konarskis If this is blocking you, you are free to submit a PR to patch.

mbklein commented 1 year ago

I started looking into this issue because I was experiencing the same thing @konarskis was. It turns out it's not the head_object check that's taking so long – it's the fact that sam build puts each function into its own directory within .aws-sam/build, and then sam deploy zips up each of those directories in turn and then checks to see if the resulting file already exists in s3.

So for two functions, functionA and functionB, with the same CodeUri and Runtime, it has to:

  1. Zip up .aws-sam/build/functionA and get the md5 of the result
  2. Check to see if a file with that md5 has already been uploaded
  3. It hasn't, so upload it
  4. Zip up .aws-sam/build/functionB and get the md5 of the result
  5. Check to see if a file with that md5 has already been uploaded
  6. It has (because it's identical to functionA), so skip it

It's Step 4 that's the time-consuming one for each function, not Step 5.

The solution would seem to be a more optimized build process where the target directory is .aws-sam/[artifact_hash], where artifact_hash is the md5 hash of the function's CodeUri + Runtime. That would allow some deduping of the build as well as the upload. But it's a pretty major change to how things work, and I don't know the codebase well enough to predict what kind of knock-on effects it might produce.
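The proposed `artifact_hash` could be derived as sketched below (a hypothetical helper, not SAM CLI code; the exact hash input and separator are assumptions):

```python
import hashlib

def artifact_hash(code_uri, runtime):
    """Hypothetical sketch: derive a stable build-directory name from the
    properties that determine the build output, so functions sharing
    CodeUri and Runtime collapse into one directory, e.g.
    .aws-sam/build/<artifact_hash>."""
    return hashlib.md5(f"{code_uri}:{runtime}".encode()).hexdigest()
```

With this scheme, `functionA` and `functionB` above would map to the same directory, so the build, the zip, and the upload would each happen once.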

@jfuss, if you think this approach is worth pursuing, I'd be willing to take a run at it, but I didn't want to forge ahead without some idea that such a refactor would be welcome/considered.

mbklein commented 1 year ago

I guess a less invasive version of that would be to create a dict that maps each function's resource ID to a hash of their CodeUri + Runtime and use that to decide which ones to upload. Maybe I'll give that a go and submit a PR.
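The less invasive version could look something like this hypothetical sketch (function name and input shape are assumptions, not SAM CLI code): group resource IDs by a hash of their CodeUri + Runtime, then zip and upload once per group.

```python
import hashlib

def plan_uploads(functions):
    """Hypothetical sketch: map each unique (CodeUri, Runtime) hash to the
    list of resource IDs that share it, so each artifact is zipped and
    uploaded only once per group.

    `functions` maps resource IDs to {"CodeUri": ..., "Runtime": ...}.
    """
    groups = {}
    for resource_id, props in functions.items():
        key = hashlib.md5(
            f"{props['CodeUri']}:{props['Runtime']}".encode()
        ).hexdigest()
        groups.setdefault(key, []).append(resource_id)
    return groups
```

For the functionA/functionB example above, this would yield a single group of two resource IDs, turning two zip-and-check cycles into one.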

tburkehitbox commented 3 months ago

Hey all, just checking in to see if there's been any progress or workarounds for this. As our code grows, we've been experiencing 25min waits for these checks and deployment. Thank you very much!

mbklein commented 3 months ago

I started working on the idea I posted about two comments up and never got anywhere with it. (I can't remember now if it was a technical issue or just me having to reprioritize.)

gspeare commented 2 weeks ago

This dominates our deploy time (30+ minutes at present); it would be a big help to improve this.