Closed sanathkr closed 2 years ago
For Python builds, our internal build tool creates an MD5 of the requirements.txt file and stores it in a hidden file, which can be used next time to check whether the file has changed.
Here are the steps that our build tool takes to avoid running pip every time:
1) Create the deps and build folders if not already present.
2) […] deps folder.
3) pip install -r requirements.txt -t deps/
4) […] build folder.
5) Copy deps and function source code into build.
6) […] build.
You could maybe store the requirements.txt MD5 under .aws-sam/build/.MyFunction.manifest.md5 or something...
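A minimal Python sketch of the manifest check described in this comment, for illustration only. It is not the commenter's build tool and not SAM CLI code; the function names and the stamp file location are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def manifest_md5(manifest: Path) -> str:
    """Return the hex MD5 digest of the manifest file's bytes."""
    return hashlib.md5(manifest.read_bytes()).hexdigest()


def deps_are_stale(manifest: Path, stamp: Path) -> bool:
    """True when no stamp exists or the stored digest differs from the manifest's."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != manifest_md5(manifest)


def record_manifest(manifest: Path, stamp: Path) -> None:
    """Store the current digest so the next build can skip pip."""
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(manifest_md5(manifest))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        manifest = Path(tmp) / "requirements.txt"
        stamp = Path(tmp) / "deps" / ".requirements.md5"  # hypothetical location
        manifest.write_text("requests==2.31.0\n")
        print(deps_are_stale(manifest, stamp))  # first build: no stamp yet
        record_manifest(manifest, stamp)        # would follow a successful pip install
        print(deps_are_stale(manifest, stamp))  # manifest unchanged since the stamp
```

A build script would call deps_are_stale before pip install and record_manifest only after the install succeeds, so a failed install is retried on the next run.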
Could you maybe post this source code (even though the instructions are helpful)?
Where do we stand on this one? Would love to see this supported for rapid development and iteration.
ping @sanathkr
@ranman This is a good feature to have, but hard to support across all programming languages. We could probably start with
@billyshambrook's suggestion is a good one and can be implemented fairly easily in the Python build workflow (https://github.com/awslabs/aws-lambda-builders/blob/develop/aws_lambda_builders/workflows/python_pip/workflow.py#L67).
It would be more valuable if we could get some design thinking around supporting this across programming languages. There are several levels of incremental build we can support:
For each of the cases above, we need a different mechanism to know when to rebuild. It would be great if someone can do the thinking and submit a design document PR. This will really help move this forward.
The other way I have done this is by utilizing Docker and creating a Docker image for the build. This way you can use the Docker layer cache. I have found it's also a workaround if you don't have the ability to mount volumes, as in some CI tools, because you are just using COPY.
However, this is a bigger change, and requires always using Docker to build.
This would then make it easier to support cross language.
@sanathkr - for interpreted languages I think #3 is the most common thing for me - I just change one line of code but end up rebuilding all the deps as well. Hopefully someone has the bandwidth to figure this out.
I've created a Makefile with a .PHONY target, package, that will sam build and sam package only on updates.
Seems to work for me. I'm happy to hear feedback.
https://gist.github.com/jghaines/5b5b2530bf21b0a1adab3f6683aa7af5
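The gist itself is not reproduced here. As a rough illustration of the general make-style rule it relies on (rebuild only when a source is newer than the last output), here is a small Python sketch. The function name and the idea of a stamp file are assumptions for the example, not taken from the gist.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def needs_rebuild(sources: Iterable[Path], target: Path) -> bool:
    """Make-style rule: rebuild if the target is missing or older than any source."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime_ns
    return any(src.stat().st_mtime_ns > target_mtime for src in sources)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "app.py"
        stamp = Path(tmp) / ".packaged"  # hypothetical marker written after packaging
        source.write_text("print('hello')\n")
        stamp.write_text("")
        # Force the source to look newer than the stamp.
        os.utime(stamp, ns=(1_000_000_000, 1_000_000_000))
        os.utime(source, ns=(2_000_000_000, 2_000_000_000))
        if needs_rebuild([source], stamp):
            print("would run: sam build && sam package")
```

Modification times are cheap to check but can be fooled by checkouts or clock changes; a content hash is the sturdier alternative.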
We're currently getting frustrated with the sam build process; it only gets worse over time. We have to wait, wasting 10-15 minutes per build. So if we're going to have 5-6 builds for each change we make (even if it's a single line), we have to go through the entire sam build process and re-download all Python dependencies every time.
(SAMPIC attempts to do something similar to this.)
Hey guys,
So, we created a small npm package based on nodemon (samwatch):
https://www.npmjs.com/package/samwatch
https://github.com/mxitgo/samwatch
The way it works is, it copies your js/json files as you save them from the source folder to the corresponding .aws-sam/build folder (as long as the names are the same for your lambda and the code uri in template.yaml).
That is useful to see your changes reflected on the fly if you are running sam local start-api, so you don't have to run sam build after every change.
On the other hand, if the corresponding file copy is not within the .aws-sam/build folder already, the package will trigger a sam build (unless you use n parameter).
The package has helped us get a hot reload feeling for our local lambda function development with sam.
I hope anyone can find it useful.
Thanks!
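The copy-or-build decision described in this comment could be sketched roughly as follows. This is an illustration in Python, not samwatch's actual code (samwatch is a Node.js package); the function name and return values are made up for the example, and it assumes the function folder has the same name under the source root and under .aws-sam/build.

```python
import shutil
from pathlib import Path


def sync_or_request_build(changed: Path, source_root: Path, build_root: Path) -> str:
    """Copy a saved file over its existing build copy, or report that a build is needed.

    Returns "copied" when a matching file already existed under build_root,
    otherwise "build" so the caller can trigger a full sam build.
    Raises ValueError if `changed` is not located under source_root.
    """
    relative = changed.resolve().relative_to(source_root.resolve())
    target = build_root / relative
    if target.exists():
        shutil.copy2(changed, target)
        return "copied"
    return "build"
```

A file watcher would call this on every save and run sam build only when it returns "build".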
Here's another perspective. I'm using node.js, but the issue appears to be the same. The key issue is how long it takes every time a developer makes a single code change and wants to run and test it.
Here is a simple way to reproduce the issue:
1) run 'sam init'
2) choose 1,1 (sets up from a node.js template)
3) choose a project name
4) choose 8 - Quick Start: Web Backend
5) at the command prompt, run sam build
The sample project is set up. Each build takes over 30 seconds. There are 3 lambda functions in this sample project. It builds and copies everything every time a developer makes a change and wants to test it. This build time only continues to increase as you add more to the project.
When it builds, it copies everything into each build every time for every lambda function whether it was changed or not:
1) the entire project gets copied into each lambda function
2) any dependencies (and their dependencies) also get put into each build folder
So for a simple node.js 'hello world' function, over 50 MB is copied into each function. When the project is deployed to aws lambda, each lambda function is so large that you can't see the code for the lambda function once it's deployed. It also appears to be slower to run as the deployment is so large and needs to be loaded first.
In the process above, these are just the defaults that ship with sam (which could possibly be improved to reduce the deployment size). Each project has different needs, and in an effort to improve this for my project I wasn't able to find any documentation that shows how to control sam build to include only the required files.
Here are some observations:
1) if the dependencies are completely removed from the package.json file, the build speeds up significantly: "dependencies": { "aws-sdk": "^2.437.0" }, "devDependencies": { "jest": "^24.7.1" }
2) if all the files are deleted from the build directory after a build except for the actual node.js lambda functions, the project runs fine. Because sam build still copies everything into every lambda function
3) with all the node modules and extra files deleted, there is one file left: the lambda function code itself. If this is deployed to aws lambda, the code runs. It loads and runs much faster. It is also viewable (and editable) in aws lambda
Here are suggestions:
0) only update what has changed (or provide a build option for this)
1) thoughtful defaults in the sample templates. The templates are used by people to learn how to use sam. The package.json file seems to include dependencies (such as the aws-sdk) that aren't necessary for development and testing. Set up the initial sam template build process so that it is fast out of the box
2) sam build could inspect the lambda handlers to see what is actually required to be imported and only copy the required files into each lambda build folder
3) offer more control (or better documentation) on what gets copied into the build directory after a code change. Code changes should be fast to test. Right now they are slow.
4) offer more control (or better documentation) over what gets deployed to AWS lambda. By default it's sending everything in the project plus any dependencies whether they are needed or not
I hope this helps! I believe that a faster development cycle would help reduce developer frustration and increase adoption for aws sam.
I'm writing a Ruby lambda and wanted to test it via sam local start-api, then discovered it doesn't watch for changes, and then when I went to do a sam build it appeared to pull the base image every time?
I'm going to give @eamarce's plugin a try. [update] I guess that is only for JavaScript-based projects... I'm going to give @alexdilley's suggestion a try. [update] This project appears out of date, and does more than just sync files. I'm going to try and implement my own syncing fix.
I only have a single function right now. As far as I understand, I just need to move code changes into the .aws-sam/build/<MyLogicalIDLambda> folder.
If I'm running sam local start-api and I update my code in .aws-sam/build, will I need to stop and start the web-server or will it pick up the changes?
It will pick up the code changes immediately. That means I just need to write a file to sync over my changes.
My first attempt is just to create a script called ./bin/local which will copy over the file and run the local web-server. This works, but I'll have to stop and start the script to see changes.
    #!/usr/bin/env bash
    set -e

    # Copy the latest function source into the SAM build folder.
    cp function/function.rb .aws-sam/build/EvaluatorFunction/function.rb
    sh ./bin/build

    echo "== SAM local start-api..."
    sam local start-api
I was looking for a bash-only solution, but everything pointed me to install something, so if that's the case I'm just going to use Ruby or Node.js. Since Ruby is my go-to, I can use either listen or this lightweight watcher called filewatcher. Let's try the latter.
So I wrote a small Ruby script called .bin/watch and this works really well. Just had to run gem install filewatcher. I'm on a Mac so Ruby is preinstalled.
    #!/usr/bin/env ruby
    # Mirror changes under function/ into the SAM build folder.
    require 'fileutils'
    require 'filewatcher'

    puts "== watch ..."
    root  = File.expand_path('function')
    build = File.expand_path('.aws-sam/build/EvaluatorFunction')
    path  = File.expand_path('function/*')
    puts "watching: #{path}"

    # Here the block is given the changed file's path and the event type.
    Filewatcher.new([path]).watch do |changes, event|
      puts "#{event}: #{changes}"
      # Path relative to function/, keeping the leading slash.
      filename = changes.sub(root, '')
      case event
      when :updated, :created
        FileUtils.cp(changes, build + filename)
      # when :deleted (not handled)
      end
    end
sam build now allows for both parallel and cached builds. Cached builds are faster when the source code has not changed, because previous artifacts are reused; parallel builds let multiple functions from the same template be built at the same time.
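To make the two ideas concrete, here is a rough Python sketch of a cached, parallel build loop. It only illustrates the general technique and is not how SAM CLI implements its cached and parallel builds; source_digest, build_all, and the in-memory cache dictionary are all hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List


def source_digest(folder: Path) -> str:
    """Hash the relative path and contents of every file under folder, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        digest.update(path.relative_to(folder).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def build_all(
    functions: Dict[str, Path],
    cache: Dict[str, str],
    build_one: Callable[[str, Path], None],
) -> List[str]:
    """Build only the functions whose source digest changed, running those builds concurrently.

    Returns the sorted names that were rebuilt. The cache is updated only
    after every build succeeded, so a failure leaves stale entries to retry.
    """
    digests = {name: source_digest(folder) for name, folder in functions.items()}
    stale = sorted(name for name, d in digests.items() if cache.get(name) != d)
    with ThreadPoolExecutor() as pool:
        # list() forces completion and re-raises any exception from a build.
        list(pool.map(lambda name: build_one(name, functions[name]), stale))
    cache.update({name: digests[name] for name in stale})
    return stale
```

In a real tool the cache would be persisted between runs, and build_one would invoke the per-runtime build for that function.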
SAM incremental build feature is out in Beta! Run sam build --cached --beta-features to use it. Currently, we support Python, NodeJS and Ruby runtimes.
To learn more about Sam Accelerate, check out the blog here and the video here.
@praneetap that's awesome that the team has launched this feature. I think the sam experience has been great but the build times have been a major pain point.
By the way, the blog article you linked doesn't mention you have to pass the --beta-features flag to take advantage of the incremental builds, and it looks like it is still required to get the output described in the post.
Anything on the roadmap to speed up sam deploy?
I would love to hear the specific issues you have with deploy! Is it just the time taken? We recommend using SAM sync during the development phase, which has faster deployment times, and using SAM deploy in your pipelines.
Thank you @praneetap. I was not aware of the sam sync command; I just read about it in the post and it seems like it will definitely help. Previously I actually wrote my own script (based on another example I found) to update lambdas directly since deploy took so long. Seems like the --code flag may achieve this, though it's unclear if I can just build a single lambda and then use --code, which is what I can do with my script and which greatly speeds things up (i.e. I do sam build ResourceName && ./updateLambda.sh ResourceName, which is a lot faster than build --parallel over the whole project).
Still I will check this out. My main issue with sam deploy is the speed. Deploying a new table or something takes like 15-20 minutes on the whole (including build) and my project isn't even that large. Seems like most time is spent uploading (mostly unchanged) artifacts.
Maybe sync will help by ignoring the cloudformation changeset, I'll have to see.
Two items I'd like to see that I think would be really helpful for local development:
1 - Some sort of linting to fail right away if I've done something in the template that is not supported (like if I add two GSIs or something). Right now it uploads all the code (takes like 5-10 minutes), generates the changeset, then fails and tears everything down. Seems like it would be possible to determine the failure before uploading all the code and artifacts just based on the template.yml.
2 - I can't build a single resource and have it work locally with sam local start-api. Maybe this is supposed to work and it is something specific to my configuration, but let's say I change a single lambda and I have 20 lambdas in my project: I believe when I run sam build resourcename it actually deletes all the other lambdas locally, so then when I use the local api I start getting module not found errors for some of the deleted containers. It would be nice if it didn't remove all the unchanged containers.
I'm happy to go into detail on any of the items mentioned or forward other suggestions. I didn't want to derail so I stuck to sam build / deploy items above but I also have issues with testing changes locally which require events triggered from Dynamo Streams/SQS.
Thank you and team for sam accelerate. I think this is a great thing for aws development.
Description:
sam build does a clean build every time. This can slow down the local dev-test loop because it downloads and installs dependencies on each build.
An incremental build will only reinstall dependencies when your dependency manifest changes and copy only the relevant pieces of code that changed. So you can get a really fast dev-test cycle locally without compromising on the build capabilities. This could be implemented as sam build --watch, which will watch files for changes and perform an incremental build.
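As an illustration of the "copy only what changed" half of this proposal, a minimal Python sketch might look like the following. It is a sketch under stated assumptions, not SAM CLI behavior: sync_changed is a hypothetical helper, it compares file contents directly, and it does not remove files that were deleted from the source.

```python
import shutil
from pathlib import Path
from typing import List


def sync_changed(source: Path, build: Path) -> List[str]:
    """Copy files from source into build only when missing or different.

    Returns the relative paths (POSIX style) that were copied.
    Files deleted from source are left in build; that case is not handled here.
    """
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src.relative_to(source)
        dst = build / relative
        if dst.exists() and dst.read_bytes() == src.read_bytes():
            continue  # identical content, nothing to do
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(relative.as_posix())
    return copied
```

A watch mode could call this after each file-change event and fall back to a full dependency install only when the manifest itself appears in the returned list.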