aws / aws-sam-cli

CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM
https://aws.amazon.com/serverless/sam/
Apache License 2.0

clean up the aws-sam-cli-managed-default bucket artifacts #2980

Open aprilmintacpineda opened 3 years ago

aprilmintacpineda commented 3 years ago

Description:

Steps to reproduce:

Create a project and deploy to a stage multiple times.

Observed result:

You'll see multiple artifacts, but they don't seem to be needed anymore.

Expected result:

The S3 bucket should be kept clean.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: N/A
  2. sam --version: Latest
  3. AWS region: N/A

Add --debug flag to command you are running

After months of using AWS SAM, my bucket has grown to 15 GB, so I wondered why AWS SAM simply leaves the default bucket bloated with unneeded files.

jfuss commented 3 years ago

@aprilmintacpineda The short answer is that we don't know which artifacts to keep and which to remove. The CLI uses this bucket as a place to put artifacts and doesn't have a way to tell which ones are still needed. You can add a lifecycle policy appropriate to your use case to remove old artifacts. The managed stack is created under deploy right now, so I think we would need to consider bringing up the sam bootstrap command (a command to help manage the managed stack). We could add a way to define a lifecycle policy so this doesn't happen, or at least limit it.

For now, it's up to the customer to keep it clean. I know that isn't the best answer :(.
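For reference, a minimal sketch of the kind of lifecycle policy mentioned above (the bucket name is a placeholder for your own managed default bucket, and the 30-day window is only an example, not a recommendation) could be saved as lifecycle.json:

```json
{
  "Rules": [
    {
      "ID": "expire-old-sam-artifacts",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 30 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
```

and applied with aws s3api put-bucket-lifecycle-configuration --bucket <your-managed-bucket> --lifecycle-configuration file://lifecycle.json. One caveat: age alone doesn't guarantee an artifact is unused, since the newest objects may be exactly the ones the currently deployed stack still references.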

wayne-folkes commented 2 years ago

Being able to define a lifecycle policy for the buckets would be great. This may be appropriate for another ticket, but being able to define a lifecycle policy for the ECR repos holding the container images that my container-based Lambdas use would be great too.

rovellipaolo commented 1 year ago

I would like to "revive" this issue...

You can add a lifecycle policy appropriate to your use case to remove old artifacts.

As far as I've understood it (I might be very wrong on this), at present there is no way to write such a lifecycle rule. At least not for generic use cases.

The point is that sam deploy changes the artifact names on every execution (see: https://github.com/aws/serverless-application-model/issues/557). So, at most, one could create a lifecycle rule to delete all artifacts older than X days. But being "older than X days" does not by itself guarantee that an artifact is actually "old" / "unused". Indeed, if for whatever reason one hasn't deployed the CloudFormation stack in the last X days, then those artifacts are the currently used ones, and deleting them might break a rollback on the next deploy (see: https://github.com/awsdocs/aws-sam-developer-guide/pull/3#issuecomment-462993286).
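To make that distinction concrete, here is a purely illustrative sketch (not part of SAM; all names and dates are made up) of why a safe cleanup would have to combine object age with "is this key referenced by the currently deployed template", assuming one could enumerate the referenced keys:

```python
from datetime import datetime, timedelta


def safe_to_delete(artifacts, referenced_keys, max_age_days, now):
    """Return keys that are BOTH older than max_age_days AND not
    referenced by the currently deployed template.

    Age alone is not enough: if the stack hasn't been deployed in a
    while, the newest artifacts in the bucket are still the live ones.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [
        key
        for key, last_modified in artifacts.items()
        if last_modified < cutoff and key not in referenced_keys
    ]


now = datetime(2021, 6, 1)
artifacts = {
    "a1b2c3": datetime(2021, 1, 1),   # old and unreferenced -> deletable
    "d4e5f6": datetime(2021, 1, 2),   # old but still deployed -> must keep
    "778899": datetime(2021, 5, 30),  # recent -> keep
}
referenced = {"d4e5f6"}
print(safe_to_delete(artifacts, referenced, max_age_days=30, now=now))
# → ['a1b2c3']
```

An age-only lifecycle rule would also delete "d4e5f6" here, which is exactly the rollback-breaking case described above.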

POSSIBLE WORKAROUND: If I'm not mistaken, doing a sam deploy with the previous version of the code (i.e., the one already deployed) before doing it with the new version (i.e., the one to be deployed) should result in an "empty changeset" but still re-upload the original artifacts. It's hacky and far from ideal, but it should, in theory, fix the previously mentioned issue with rollbacks.

The managed stack is created under deploy right now, so I think we would need to consider bringing up the sam bootstrap command (a command to help manage the managed stack). We could add a way to define a lifecycle policy so this doesn't happen, or at least limit it.

That would be awesome! :)

Just thinking out loud (without knowing the actual complexity of this on your side) but, alternatively, you could add an option to sam deploy to use static or user-defined artifact names. By always using the same artifact names, one could set up an S3 bucket with versioning and a lifecycle rule that automatically deletes noncurrent versions of the artifacts while always retaining the current ones (see: https://aws.amazon.com/blogs/storage/reduce-storage-costs-with-fewer-noncurrent-versions-using-amazon-s3-lifecycle/).
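As an illustration of that versioning-based setup (assuming stable artifact names and a bucket with versioning already enabled; the rule ID and the 30-day window are made up for the example), the lifecycle rule could look like:

```json
{
  "Rules": [
    {
      "ID": "expire-noncurrent-artifact-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
```

Because only noncurrent versions expire, the version currently referenced by the deployed stack is never touched, which avoids the rollback problem described earlier.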

POSSIBLE WORKAROUND: I know this can already be achieved (at least partially) by packaging and uploading the Lambda function manually, and then referencing the uploaded artifact version in the SAM template.

Something like:

CodeUri:
  Bucket: "my-versioned-s3-bucket"
  Key: "my-lambda-function.zip"
  Version: !Ref MyLambdaFunctionS3ObjectVersion

Where the MyLambdaFunctionS3ObjectVersion value can be retrieved, for example, while uploading the object via:

$ aws s3api put-object --body my-lambda-function.zip --bucket my-versioned-s3-bucket --key my-lambda-function.zip | grep -oP '(?<="VersionId":\s").+(?=")'

Or by querying the current version of the artifact via:

$ aws s3api head-object --bucket my-versioned-s3-bucket --key my-lambda-function.zip | grep -oP '(?<="VersionId":\s").+(?=")'

But it's far from ideal and, above all, one would lose the ability to test the CloudFormation stack locally, as sam local invoke and sam local start-api do not currently support a CodeUri pointing to an S3 location (see: https://github.com/aws/aws-sam-cli/issues/2802).