aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(s3-deployment): Support large s3 bucket deployments #7950

Open iliapolo opened 4 years ago

iliapolo commented 4 years ago

We want to enable users to deploy large amounts of data to S3 buckets. Currently, such deployments frequently fail because we are limited by the Lambda execution environment:

See https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html

Use Case

Users use the s3-deployment module to deploy static websites to an S3 bucket. The contents of the website directory can often be quite large and contain many files.

See:
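For concreteness, a typical setup that runs into these limits looks something like the sketch below (using the v2 aws-cdk-lib import paths; the bucket and asset path are placeholders):

import * as path from 'path';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';

const websiteBucket = new s3.Bucket(this, 'WebsiteBucket');

// The asset is zipped and staged, then a Lambda-backed custom resource
// downloads it to /tmp, unzips it, and copies the contents to the
// destination bucket -- so the handler is subject to the Lambda memory,
// /tmp size, and 15-minute timeout limits.
new s3deploy.BucketDeployment(this, 'DeployWebsite', {
  sources: [s3deploy.Source.asset(path.join(__dirname, 'website-dist'))],
  destinationBucket: websiteBucket,
});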

Proposed Solution

There is no clear solution to this problem yet; this issue was created so we can discuss various approaches.


This is a :rocket: Feature Request

bitfrost commented 3 years ago

I have forked this package locally and added support for an EFS mount attached to the Lambda function to get around the 500 MB limit. Would the maintainers be open to a PR that includes that as an opt-in option via a prop? This would not solve copy-time issues around the 15-minute Lambda execution limit, but it would increase the deployment size limit considerably. If so, please let me know.

iliapolo commented 3 years ago

@bitfrost Yes, a PR would be greatly appreciated, thanks! :)

bitfrost commented 3 years ago

https://github.com/aws/aws-cdk/pull/12361 I will need some help understanding the CodeBuild error (I cannot get to the actual logs) and a suggestion on how to add an appropriate test for this. See my comments; I have built the package locally and am using it internally.

iliapolo commented 3 years ago

Some work has been done to solve the 500 MB limitation by allowing an EFS volume to be attached. See https://github.com/aws/aws-cdk/pull/12361.

That PR has gone stale and is now closed, but if anyone wants to pick it up, I'll be happy to merge it.

iDVB commented 3 years ago

Anyone get to the bottom of why we could be getting Signals.SIGKILL: 9 when our dataset is less than 100MB?

keshav0891 commented 3 years ago

> Anyone get to the bottom of why we could be getting Signals.SIGKILL: 9 when our dataset is less than 100MB?

Try increasing the memory allocation of your Lambda function. The default is 128 MB, which might be too low for your use case.

automartin5000 commented 2 years ago

Edit: Whoops, wrong link: https://twitter.com/julian_wood/status/1465842874457763840. Problem solved 😄

redrobotdev commented 1 year ago

> Anyone get to the bottom of why we could be getting Signals.SIGKILL: 9 when our dataset is less than 100MB?

> Try increasing the memory allocation of your Lambda function. The default is 128 MB, which might be too low for your use case.

@keshav0891 do you mean the CloudFrontWebDistribution construct?

aiden-sobey commented 1 year ago

@ababakanian the memory limit needs to be increased on the custom resource that deploys to the S3 bucket behind your CloudFront distribution.

import * as deployment from 'aws-cdk-lib/aws-s3-deployment';

new deployment.BucketDeployment(this, 'deploy', {
  destinationBucket: this.bucket,
  distribution: this.distribution,
  distributionPaths: ['/*'], // Invalidate the CloudFront cache on upload
  sources: [
    deployment.Source.asset(buildPath),
  ],
  memoryLimit: 1024,
});

You want the last line there: memoryLimit.

github-actions[bot] commented 1 year ago

This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.

ttais2017 commented 1 year ago

I'm currently working with KinesisAnalyticsApplication and IaC in Java. In such cases, the JAR files/applications must be fat JARs (with all dependencies), and those JARs are usually at least 90 MB. My use case includes three simple applications whose JAR files add up to around 320 MB. Additionally, I want to deploy some other resources which must be available in S3.

The stack I'm using to deploy my full use case breaks as soon as the first JAR application (106 MB) is deployed to S3.

Kind Regards,

keshav0891 commented 1 year ago

@ttais2017 Have you tried using EFS support as explained in the readme?

If simply using EFS support doesn't work, you might also want to increase the memoryLimit of your Lambda handler for better performance. Please refer to the comment by @aiden-sobey above.

ttais2017 commented 1 year ago

Hi, thanks for your comments. I increased the memoryLimit, but it does not help; with only that value increased, I got the error:

Received response status [FAILED] from custom resource. Message returned: [Errno 28] No space left on device (RequestId: 735065ff-9af8-4634-ade0-b8e5e61c1364)

I could not find the link called "readme". Could you provide more info on how I can define EFS for the deployment bucket? I have used EFS for other Lambdas, but how do I use it for the deployment bucket?

keshav0891 commented 1 year ago

@ttais2017 My bad, I have updated the readme link. The changes will look something like this:

new s3deploy.BucketDeployment(this, 'DeployMeWithEfsStorage', {
  sources: [s3deploy.Source.asset(path.join(__dirname, 'my-website'))],
  destinationBucket,
  destinationKeyPrefix: 'efs/',
  useEfs: true, // This is the flag to enable EFS storage.
  vpc,
  retainOnDelete: false,
});

ttais2017 commented 1 year ago

Thanks a lot for your fix :)
I tried that and it works, good! While trying it out, I also found another deployment option, "ephemeral storage". It's great to learn the deeper details of BucketDeployment!
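For anyone else hitting the "No space left on device" error, the option looks something like the sketch below (a sketch assuming the ephemeralStorageSize prop; the names are placeholders from my test):

import * as path from 'path';
import { Size } from 'aws-cdk-lib';
import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';

new s3deploy.BucketDeployment(this, 'DeployWithBiggerTmp', {
  sources: [s3deploy.Source.asset(path.join(__dirname, 'my-app'))],
  destinationBucket,
  // Raise the handler's /tmp above the 512 MiB default so large
  // assets can be unpacked without running out of disk space.
  ephemeralStorageSize: Size.mebibytes(2048),
  memoryLimit: 1024,
});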

Kind Regards,

-- Miguel

m1n9o commented 1 week ago

I have a similar issue, but my case is using CDK to deploy a Lambda function with the zip package deployment approach.

So CDK creates a new Lambda called CustomCDKBucketDeployment.. to upload the zip package. However, my zip package is pretty large, around 100 MB, and I got an error like:

[ERROR] 2024-10-09T05:06:13.743Z    df2124e8-4d6b-43c1-88f1| cfn_error: Command '['/opt/awscli/aws', 's3', 'cp', 's3://cdk-bs-phoenix-assets-88-ap-southeast-2-810b3b957.zip', '/tmp/tmpilunpaw4/97e851cd-8806-625f7c8250a5']' died with <Signals.SIGKILL: 9>.

This cannot be reproduced consistently, but I understand it's because of the memory limit. The CustomCDKBucketDeployment... Lambda is created by CDK, and I couldn't change its memory. So I wonder how we can fix this problem?

Should I create another issue for this?

gshpychka commented 1 week ago

> I have a similar issue, but my case is using CDK to deploy a Lambda function with the zip package deployment approach.
>
> So CDK creates a new Lambda called CustomCDKBucketDeployment.. to upload the zip package. However, my zip package is pretty large, around 100 MB, and I got an error like:
>
> [ERROR]   2024-10-09T05:06:13.743Z    df2124e8-4d6b-43c1-88f1| cfn_error: Command '['/opt/awscli/aws', 's3', 'cp', 's3://cdk-bs-phoenix-assets-88-ap-southeast-2-810b3b957.zip', '/tmp/tmpilunpaw4/97e851cd-8806-625f7c8250a5']' died with <Signals.SIGKILL: 9>.
>
> This cannot be reproduced consistently, but I understand it's because of the memory limit. The CustomCDKBucketDeployment... Lambda is created by CDK, and I couldn't change its memory. So I wonder how we can fix this problem?
>
> Should I create another issue for this?

Are you using BucketDeployment? If so, it has a memoryLimit prop.
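For example, a minimal sketch (the bucket and asset path are placeholders):

import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';

new s3deploy.BucketDeployment(this, 'DeployLargeAsset', {
  sources: [s3deploy.Source.asset('path/to/large-package')],
  destinationBucket: bucket,
  // Give the CustomCDKBucketDeployment handler enough memory to copy
  // a ~100 MB asset without being killed (the default is 128 MiB).
  memoryLimit: 1024,
});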