hleb-albau opened this issue 4 years ago
Hi @hleb-albau
This is definitely an interesting use-case. The problem is that bucket deployments run:
aws s3 sync --delete --content-type=<content-type> {sourceDir} {targetBucket}
The command does not allow specifying different content-types for different files. Splitting into different source directories won't work either because of the (necessary) `--delete` flag.
Are you having issues with anything other than `br` files? The `aws` CLI actually determines the content-type of each individual file automatically, by delegating to Python's standard library. However, support for `br` file extensions was only just added to CPython, and is not yet released outside of an alpha version.
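For context, here is a minimal sketch of that detection path using Python's `mimetypes` module (the standard-library piece the CLI delegates to); the exact values depend on the CPython version and the local MIME database:

```python
import mimetypes

# A bare MimeTypes instance uses only the built-in table, no system mime files.
db = mimetypes.MimeTypes()
print(db.guess_type("util.js"))     # ('application/javascript', None) on most versions
print(db.guess_type("util.js.br"))  # (None, None) while '.br' is unknown

# Registering '.br' as an encoding (newer CPython releases ship this entry)
# lets the underlying '.js' type be detected again:
db.encodings_map[".br"] = "br"
print(db.guess_type("util.js.br"))  # ('application/javascript', 'br')
```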
One possible solution would be to have the bucket deployment lambda add a specific entry for `br` files to one of the known MIME-type files on Linux, which should make the CLI detect it properly, avoiding the need to pass Content-Type altogether.
This seems like the most pragmatic solution for now.
WDYT?
Thanks for the response!
Besides the content-encoding header, which determines the content compression (br, gz, etc.), we also have the content-type header.
Example for the file `util.js.br`: content-type `application/javascript`, content-encoding `br`.
Right now our deployment process runs as follows:
aws s3 cp ./dist s3://{BUCKET_NAME} \
--exclude="*" --include="*.js.br" \
--content-encoding br \
--content-type="application/javascript" \
--cache-control "max-age=31536000" \
--metadata-directive REPLACE --recursive
So I wonder if the CLI can detect both headers properly.
Hi @hleb-albau - Yeah, seems like there's no way around this.
We can probably support this use-case by doing what you did with `exclude`/`include`.
Stay tuned 👍
Thanks!
relates also to https://github.com/aws/aws-cdk/issues/4687
Just to add to this conversation, here is the script I am using to achieve this right now:
# Clear out / upload everything first
echo "[Phase 1] Sync everything"
aws s3 sync . "s3://${s3_bucket_name}" --acl 'public-read' --delete
# Brotli-compressed files
# - general (upload everything brotli-compressed as "binary/octet-stream" by default)
echo "[Phase 2] Brotli-compressed files"
aws s3 cp . "s3://${s3_bucket_name}" \
--exclude="*" --include="*.br" \
--acl 'public-read' \
--content-encoding br \
--content-type="binary/octet-stream" \
--metadata-directive REPLACE --recursive;
# - javascript (ensure javascript has correct content-type)
echo "[Phase 3] Brotli-compressed JavaScript"
aws s3 cp . "s3://${s3_bucket_name}" \
--exclude="*" --include="*.js.br" \
--acl 'public-read' \
--content-encoding br \
--content-type="application/javascript" \
--metadata-directive REPLACE --recursive;
It would also be good if the upload detected the file encoding and added a `charset` to the content type, defaulting to utf-8.
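A small sketch of what that could look like (a hypothetical helper, not an existing CLI or CDK option): derive the type from the file name and append a charset for text-like types, defaulting to utf-8.

```python
import mimetypes

def content_type_with_charset(filename: str, default_charset: str = "utf-8") -> str:
    # Hypothetical helper: guess the content type and add a charset for text types.
    ctype, _ = mimetypes.guess_type(filename)
    ctype = ctype or "binary/octet-stream"
    # Only text-like types carry a charset parameter.
    if ctype.startswith("text/") or ctype in ("application/javascript", "application/json"):
        return f"{ctype}; charset={default_charset}"
    return ctype

print(content_type_with_charset("index.html"))  # text/html; charset=utf-8
print(content_type_with_charset("logo.png"))    # image/png
```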
I solved it with a Custom Resource. This one changes the `CacheControl`, but the logic is the same for other metadata.
# Assumed imports for this snippet (CDK v2 style; adjust module paths for CDK v1).
from uuid import uuid4
from aws_cdk import aws_iam as iam, custom_resources as cr
from aws_cdk.aws_s3_deployment import BucketDeployment

s3_deployment = BucketDeployment(...
copy_object_changing_cache = cr.AwsSdkCall(
service="S3",
action="copyObject",
parameters={
"Bucket":bucket.bucket_name,
"CopySource": f"{bucket.bucket_name}/remote.js",
"Key": "remote.js",
"MetadataDirective": "REPLACE",
"CacheControl": "no-cache, no-store",
"Metadata": {"object-hash": uuid4().hex[:8]} # Important to trigger update in cloudformation
},
physical_resource_id=cr.PhysicalResourceId.of("ChangeObjectCacheControl"),
)
change_cache_role = iam.Role(
scope=self,
id="ChangeCacheRole",
assumed_by=iam.ServicePrincipal(service="lambda.amazonaws.com"),
inline_policies={
"AllowCopyRemoteJs": iam.PolicyDocument(
statements=[
iam.PolicyStatement(
actions=[
"s3:PutObject",
"s3:CopyObject",
"s3:GetObject",
"s3:DeleteObject"
],
resources=[bucket.arn_for_objects("remote.js")],
effect=iam.Effect.ALLOW,
),
],
)
},
)
change_cache = cr.AwsCustomResource(
scope=self,
id="ChangeObjectCacheControl",
role=change_cache_role,
on_create=copy_object_changing_cache,
on_update=copy_object_changing_cache,
)
change_cache.node.add_dependency(s3_deployment)
Just throwing in my particular use-case (although this is certainly solvable using other means, including the ones already on this ticket)
Many single page app frameworks (Angular, React, etc.) build in such a way that the various JS and CSS resources have hashed names and can therefore be cached roughly forever, but the root document (typically index.html) will be the same and therefore should have a very short or empty cache length during times of heavy development. Being able to mark index.html with a max-age of 0 and everything else with a much longer cache age could allow a downstream CloudFront distro to use the S3 cache headers, if they could be set this way.
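A hedged sketch of how this could look with two `BucketDeployment` passes, assuming CDK v2 Python and that the construct exposes the `exclude`/`include` and `prune` options discussed above (this runs inside a Stack's `__init__`, with `site_bucket` being an existing `s3.Bucket`):

```python
from aws_cdk import Duration, aws_s3_deployment as s3deploy

# Pass 1: hashed assets, cached for a long time; index.html is skipped here.
s3deploy.BucketDeployment(
    self, "DeployAssets",
    sources=[s3deploy.Source.asset("./dist")],
    destination_bucket=site_bucket,
    exclude=["index.html"],
    cache_control=[s3deploy.CacheControl.max_age(Duration.days(365))],
)

# Pass 2: index.html only, effectively uncached; prune is disabled so this
# pass does not delete the objects uploaded by pass 1.
s3deploy.BucketDeployment(
    self, "DeployIndex",
    sources=[s3deploy.Source.asset("./dist")],
    destination_bucket=site_bucket,
    exclude=["*"],
    include=["index.html"],
    prune=False,
    cache_control=[s3deploy.CacheControl.from_string("max-age=0, no-cache, no-store, must-revalidate")],
)
```

Whether the exclude filters also apply to pruning is worth verifying before relying on this; the conservative variant is to set `prune=False` on both passes and handle cleanup separately.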
Currently, there is a `contentEncoding?: string` option in the BucketDeployment construct (system-defined content-encoding metadata to be set on all objects in the deployment). It would be nice to have the possibility to specify contentEncoding according to a mapping by file extension. Example: for files with the extension `.br`, set `Content-Encoding: br`; for `.gzip` files, `Content-Encoding: gzip`; and so on.
Use Case
We use an S3 + CloudFront pair to serve a static website. To provide better performance, br files are served based on the Accept-Encoding header, so our files have two copies (e.g. index.html and index.html.br). Currently, we have to use the aws CLI to deploy the differently encoded files with the right headers. If the BucketDeployment construct supported a contentEncoding-by-file-extension option, it would be a more easy-to-use static hosting option.
This is a :rocket: Feature Request
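Until something like that lands in the construct, here is a minimal post-deployment sketch of the requested per-extension mapping using boto3 directly (the bucket name and the mapping tables are illustrative, and this is not an existing `BucketDeployment` option):

```python
import boto3

# Illustrative mapping tables for this sketch.
ENCODING_BY_SUFFIX = {".br": "br", ".gz": "gzip"}
TYPE_BY_SUFFIX = {".js": "application/javascript", ".css": "text/css", ".html": "text/html"}

s3 = boto3.client("s3")
bucket = "my-static-site-bucket"  # illustrative name

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        for enc_suffix, encoding in ENCODING_BY_SUFFIX.items():
            if not key.endswith(enc_suffix):
                continue
            inner = key[: -len(enc_suffix)]  # e.g. util.js.br -> util.js
            ctype = next(
                (t for suffix, t in TYPE_BY_SUFFIX.items() if inner.endswith(suffix)),
                "binary/octet-stream",
            )
            # A self-copy with MetadataDirective=REPLACE rewrites the metadata in
            # place; note that it also drops any other metadata (e.g. Cache-Control)
            # unless it is passed again here.
            s3.copy_object(
                Bucket=bucket,
                Key=key,
                CopySource={"Bucket": bucket, "Key": key},
                MetadataDirective="REPLACE",
                ContentEncoding=encoding,
                ContentType=ctype,
            )
```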