covidatlas / li

Next-generation serverless crawler for COVID-19 data
Apache License 2.0
57 stars 33 forks source link

Set Content-Type headers on S3 objects #567

Closed joliss closed 3 years ago

joliss commented 3 years ago

Note that s3.copyObject should preserve metadata, so this should only be needed on s3.putObject.

S3 sends everything with Content-Type: application/octet-stream by default:

$ curl -o /dev/null -D /dev/stdout -sS https://liproduction-reportsbucket-bhk8fnhv1s76.s3-us-west-1.amazonaws.com/v1/latest/locations.json | grep Content-Type
Content-Type: application/octet-stream

This patch sets it to application/json for .json files, and similarly for .csv. (Apparently you cannot do this automatically, so it needs to be set for every S3 put request.)

I couldn't test this change since I don't have access to the S3 bucket, put I triple-checked the code to make sure it's correct, so hopefully it won't cause any problems!

jzohrab commented 3 years ago

Good catch. On the one hand, I don't think a package was necessary to use (could have just switch (filenameExtension)); on the other, it's likely more future proof, and it's small. Thanks!

jzohrab commented 3 years ago

@joliss, can you double-check staging reports work when they generate later today? Then we'll launch to prod and check later. Thanks! z

joliss commented 3 years ago

Thanks for merging! Yup, it's working fine on staging!

$ curl -I -sS https://listaging-reportsbucket-1bjqfmfwopcdd.s3-us-west-1.amazonaws.com/v1/latest/locations.json | grep Content-Type
Content-Type: application/json
$ curl -I -sS https://listaging-reportsbucket-1bjqfmfwopcdd.s3-us-west-1.amazonaws.com/v1/latest/locations.csv | grep Content-Type
Content-Type: text/csv