Fix Headers from CloudFront

b-meson commented 6 years ago

Related to #172, we need to add a good CSP and fix the (abysmal) security headers rating we have right now.

b-meson commented 6 years ago

Website is back up. Deployment notes need to go in before we can move this to "done". I made some mistakes in the deployment process that we should have caught with careful deployments to staging, so this issue can't be closed until I write a full debrief. Sorry for the messy downtime.

b-meson commented 6 years ago

Here is my writeup about what issues I ran into while deploying the new CSPs and why it turned into a mess.

Background

When we switched from DigitalOcean to AWS (S3 + CloudFront), we did not notice that our "score" from securityheaders.io dropped from an A to an F. In particular, we silently dropped HSTS support and CSPs from our headers because these headers don't really "exist" in CloudFront.

AWS

The way to apply these headers is to use a custom Lambda@Edge function, which is not the same thing as a regular Lambda. The @Edge lambdas can only be created in us-east-1 (N. Virginia) even though there are technically three "Edge" sites announced by Amazon as of today. CloudFront invokes these lambdas in response to requests for our website.

Lambda@Edge

To deploy a Lambda@Edge function, you must be aware that the documentation for these functions is often wrong or misleading. Here are some things I discovered:

Deploy them only in us-east-1.
The runtime must be Node.js 6.10.
The execution role in the Lambda console must be Lambda@Edge (which is not the same as a Lambda IAM role). You cannot add a @Edge IAM role to a custom IAM function despite documentation telling you that you can. It should also be noted that the CloudFront messages will tempt you into looking at IAM roles and tell you to create a new role that includes both lambda and lambda@edge. No such thing exists AFAIK.
Your published function cannot be referenced by a variable name such as an alias or $LATEST (i.e. the ARN you invoke from CloudFront must end in a published version number).
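For reference, a minimal sketch of what such a function could look like. This is not the code we deployed. It assumes the function is attached to a CloudFront response trigger (the writeup does not say which one), and the header values, including the HSTS max-age, are placeholders. It sticks to syntax that Node.js 6.10 supports.

```javascript
'use strict';

// Placeholder values (assumptions, not our production headers).
const SECURITY_HEADERS = {
  'Strict-Transport-Security': 'max-age=31536000',
  'Content-Security-Policy': "default-src 'self'",
};

// CloudFront hands the response to the function in event.Records[0].cf.response.
// Headers are keyed by lowercase name; each value is a list of { key, value } pairs.
const handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;
  const headers = response.headers;

  Object.keys(SECURITY_HEADERS).forEach((name) => {
    headers[name.toLowerCase()] = [{ key: name, value: SECURITY_HEADERS[name] }];
  });

  // Return the modified response to CloudFront.
  callback(null, response);
};

exports.handler = handler;
```

Keeping the header values in one map means changing a policy is a one-line edit followed by publishing a new numbered version.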

CSP

After deploying our new custom CSPs (invoking the lambda from CloudFront), I tested them against securityheaders.io and saw that the staging site got an "A" grade. Convinced that was sufficient testing, I copied the lambda from staging to prod. I did not open the site in a browser to check that resources were actually loading. In addition, since we tore down our old DigitalOcean servers, we did not have the old nginx config lying around to check against. I naïvely expected that src 'self'; would be sufficient because that's how I remembered the old headers.

Deployment

After realizing my mistake (i.e. that the web browser was blocking resources from loading), I disabled traffic to the prod server and continued testing against staging. Around 1am PST, I was able to confirm that the following CSP allowed all resources from our website to load properly:

default-src 'self'; img-src 'self'; script-src 'self' https://lucyparsonslabs.com 'unsafe-inline'; style-src 'unsafe-inline' https://lucyparsonslabs.com; object-src 'self'

At that time, I updated the prod lambda and re-enabled traffic to production.
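A long one-line policy like this is easy to get wrong by hand. One option is to keep it as a map of directives and serialize it, roughly as sketched below. The buildCsp helper is hypothetical and is not something we have in the repo. The directive values are copied from the policy above.

```javascript
'use strict';

// Directives copied from the policy that worked on staging.
const directives = {
  'default-src': ["'self'"],
  'img-src': ["'self'"],
  'script-src': ["'self'", 'https://lucyparsonslabs.com', "'unsafe-inline'"],
  'style-src': ["'unsafe-inline'", 'https://lucyparsonslabs.com'],
  'object-src': ["'self'"],
};

// Hypothetical helper: put each directive name in front of its sources,
// then join the directives with "; " to form the header value.
function buildCsp(map) {
  return Object.keys(map)
    .map((name) => [name].concat(map[name]).join(' '))
    .join('; ');
}

const csp = buildCsp(directives);
console.log(csp);
```

The resulting string is what would go into the Content-Security-Policy header value in the lambda.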

tl;dr

We should have tested the security headers after moving to S3 (or maybe put that in monitoring somewhere). I didn't properly test the CSPs by checking that resources loaded in a browser, and that's why I had to take down production while I tested on staging. AWS's documentation in general, and Lambda@Edge's in particular, is awful.
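On the monitoring idea, a rough sketch of a header check, assuming we only care that HSTS and a CSP are present. The function names are made up for illustration. This only catches missing headers: it would not have caught the broken CSP, which still needs a browser test.

```javascript
'use strict';

// Headers we expect on every response (lowercase, as Node reports them).
const REQUIRED_HEADERS = ['strict-transport-security', 'content-security-policy'];

// Return the required headers that are absent from a response's header object.
function missingSecurityHeaders(headers) {
  const present = Object.keys(headers).map((name) => name.toLowerCase());
  return REQUIRED_HEADERS.filter((name) => present.indexOf(name) === -1);
}

// Hypothetical live check: fetch a URL and report which headers are missing.
function checkSite(url, done) {
  const https = require('https');
  https.get(url, (res) => {
    res.resume(); // discard the body; only the headers matter here
    done(null, missingSecurityHeaders(res.headers));
  }).on('error', done);
}
```

Calling checkSite('https://lucyparsonslabs.com', callback) from a scheduled job and alerting on a non-empty list would have flagged the drop from A to F right after the move to S3.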

lucyparsons / lucyparsonslabs.com