ronaldtse opened this issue 3 years ago
Ping @skalee since this is relevant to you too.
This is neat!
I’m running tests with S3 uploads to acceleration-enabled and normal S3 buckets, but no strong results yet.
And when the Lambda function uploads to S3, we should use parallel processes for maximum speed (https://github.com/riboseinc/terraform-aws-s3-cloudfront-website/issues/29#issuecomment-789722054).
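A minimal sketch of what "parallel processes" could look like inside the Lambda function — here using a thread pool, since S3 uploads are I/O-bound. `upload_one` is a placeholder for whatever performs a single PUT (e.g. boto3's `upload_file`); the helper name is illustrative, not from any existing module:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_all(paths, upload_one, max_workers=8):
    """Upload files concurrently. `upload_one(path)` performs a single
    upload (e.g. a boto3 upload_file call); worker failures propagate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces iteration so any exception raised in a worker
        # surfaces here instead of being silently dropped
        list(pool.map(upload_one, paths))
```

With boto3 this would be called as something like `upload_all(paths, lambda p: s3.upload_file(p, bucket, key_for(p)))`.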
Module creation now located here: https://github.com/riboseinc/terraform-aws-lambda-s3-archive-extract-upload/issues/1
> And when the Lambda function uploads to S3, we should use parallel processes for maximum speed (https://github.com/riboseinc/terraform-aws-s3-cloudfront-website/issues/29#issuecomment-789722054).
Just to clarify, the parallel uploads I mentioned only apply in the case where we upload individual files, not in the zipped case. As I mentioned, I believe those are mutually exclusive.
Yes, terraform-aws-lambda-s3-archive-extract-upload#1 Step 3 describes using concurrency when the Lambda function uploads to S3 after unzipping.
@ronaldtse should I start implementing the Lambda / finding a solution for this one?
@phuonghuynh any solution is fine, as long as it works. Thanks!
This post gives a good introduction to Lambda-based zip file extraction based on seek, so you don't need to expand the whole archive at once (a Lambda only has about 500 MB of disk space):
https://alexwlchan.net/2019/02/working-with-large-s3-objects/
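A minimal sketch of the idea from that post: wrap ranged reads in a seekable file-like object, so `zipfile` can seek to the central directory and individual members without fetching the whole archive. `fetch_range(start, end)` is a placeholder I've introduced; for S3 it would issue a GET with a `Range: bytes={start}-{end}` header:

```python
import io
import zipfile

class RangedFile(io.RawIOBase):
    """Seekable file-like object that reads via ranged requests
    (e.g. S3 GETs with a Range header), so zipfile can seek around
    the archive without downloading all of it."""

    def __init__(self, fetch_range, size):
        self._fetch = fetch_range  # fetch_range(start, end) -> bytes, inclusive end
        self._size = size
        self._pos = 0

    def seekable(self):
        return True

    def readable(self):
        return True

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        else:  # io.SEEK_END
            self._pos = self._size + offset
        return self._pos

    def tell(self):
        return self._pos

    def read(self, n=-1):
        if n < 0:
            n = self._size - self._pos
        data = self._fetch(self._pos, self._pos + n - 1)
        self._pos += len(data)
        return data
```

For S3, `fetch_range` would be roughly `lambda s, e: s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={s}-{e}")["Body"].read()`.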
The idea here builds on top of #30 — we have a bucket with this structure:

- `/uploads/` — incoming zip archives
- `/{name}/` — extracted content for each archive

This work involves:

1. A Lambda function that monitors the `/uploads/` directory (path) for new archives (via SNS, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html#notification-how-to-event-types-and-destinations).
2. When `/uploads/{name}.zip` arrives, extract its content locally, then push its contents to `/{name}/`.
3. Create a `/{name}/.done` file so that the Lambda function in #30 knows to update the latest valid copy.

Suppose we wish to upload a `20210401000000.zip`. The process goes:

1. Upload `20210401000000.zip` to `/uploads/`.
2. Once `20210401000000.zip` is uploaded, the Lambda zip extraction function is triggered: it extracts `20210401000000.zip` into a local disk directory, then uploads the content to S3 with correct MIME types.
3. Create a `/{name}/.done` file to mark that the upload is complete.

Then I am not sure who should do the CloudFront invalidation. There are two choices:

1. The GHA flow waits (polling via `aws s3 ls`) until the archive extraction and upload from Lambda is complete (via monitoring the presence of the `/{name}/.done` file). This has the benefit of ensuring that the GHA build flow only succeeds when the deploy actually succeeds.
2. CloudFront is invalidated (not by the GHA flow) once `/{name}/.done` is created. But this way the GHA build flow won't know that deployment has failed, because the GHA flow would already succeed when the initial zip archive upload is complete. Unless the user also monitors the deployment until the end, by which time the user can ask CloudFront to invalidate anyway.

Thoughts @phuonghuynh @strogonoff ?
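For concreteness, the extraction steps could be sketched roughly as below. This is hypothetical — the function and parameter names are my own, and `s3` is assumed to expose boto3-style `download_file` / `upload_file` / `put_object` calls:

```python
import mimetypes
import os
import zipfile

def extract_and_upload(s3, bucket, zip_key, work_dir="/tmp"):
    """Hypothetical core of the extraction Lambda (names are illustrative).

    Downloads /uploads/{name}.zip, extracts it to local disk, uploads each
    file to /{name}/... with a guessed MIME type, then writes /{name}/.done
    as the completion marker for the #30 Lambda function.
    """
    name = os.path.splitext(os.path.basename(zip_key))[0]  # e.g. "20210401000000"
    local_zip = os.path.join(work_dir, f"{name}.zip")
    out_dir = os.path.join(work_dir, name)

    s3.download_file(bucket, zip_key, local_zip)
    with zipfile.ZipFile(local_zip) as z:
        z.extractall(out_dir)

    for root, _dirs, files in os.walk(out_dir):
        for fname in files:
            path = os.path.join(root, fname)
            rel = os.path.relpath(path, out_dir).replace(os.sep, "/")
            ctype = mimetypes.guess_type(fname)[0] or "application/octet-stream"
            s3.upload_file(path, bucket, f"{name}/{rel}",
                           ExtraArgs={"ContentType": ctype})

    # Marker telling the #30 Lambda that the new copy is complete
    s3.put_object(Bucket=bucket, Key=f"{name}/.done", Body=b"")
```

The per-file upload loop is also where the parallel-upload idea from the earlier comments would slot in.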