mozmeao / infra

Mozilla Marketing Engineering and Operations Infrastructure
https://mozilla.github.io/meao/
Mozilla Public License 2.0

Determine CloudFront CDN settings for MDN stage #689

Closed by escattone 6 years ago

escattone commented 6 years ago

This issue exists to capture any discussion on the task of determining the settings for a new CloudFront CDN instance in front of the MDN stage deployment.

escattone commented 6 years ago

@metadave @jwhitlock @jgmize

The following issue arose from my assumption that in order to place a CDN in front of MDN stage (https://stage.mdn.moz.works) or prod (https://developer.mozilla.org) as they currently exist, we'd have to be able to vary the content caching based on selected cookies (in particular, I was thinking of the Waffle cookies, dwf_*, and whether or not the user was logged in, sessionid).

After some research on CloudFront's settings and handling of cookies for caching, I was really surprised that if I specify a whitelist of cookie names I'd like to use for caching purposes (as part of the cache key, so to speak, combined with the path of the URL), CloudFront also seems to use that whitelist to restrict what cookies are forwarded to the origin. I'd like the whitelist to only influence caching, not restrict what gets forwarded to the origin. For example, the user may make a request containing the cookies _ga, dwf_section_edit, dwf_sg_task_completion, csrftoken, and sessionid. It seems that the content of the response will vary based only on the dwf_* and sessionid cookies, so I'd like to add those cookies to the whitelist for caching. However, if I do that, it seems that CloudFront will also strip the _ga and csrftoken cookies before forwarding the request to the origin, but the origin needs those cookies for other reasons.
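To make the difference concrete, here is a minimal Python sketch of the behavior I seem to be seeing versus the behavior I'd like. It is only a model for discussion, not CloudFront's actual implementation; the function names and the whitelist contents are my own assumptions, based on the cookies mentioned above.

```python
from fnmatch import fnmatchcase

# Assumed whitelist, built from the cookie names discussed above.
WHITELIST = ["dwf_*", "sessionid"]


def whitelisted(cookies, whitelist=WHITELIST):
    """Return only the cookies whose names match a whitelist pattern."""
    return {
        name: value
        for name, value in cookies.items()
        if any(fnmatchcase(name, pattern) for pattern in whitelist)
    }


def observed_behavior(path, cookies):
    """Model of what CloudFront appears to do: one whitelist drives both
    the cache key and the set of cookies forwarded to the origin."""
    kept = whitelisted(cookies)
    cache_key = (path, tuple(sorted(kept.items())))
    forwarded = kept  # _ga and csrftoken never reach the origin
    return cache_key, forwarded


def desired_behavior(path, cookies):
    """Model of what I'd like: the whitelist only shapes the cache key,
    and every cookie is still forwarded to the origin."""
    kept = whitelisted(cookies)
    cache_key = (path, tuple(sorted(kept.items())))
    forwarded = dict(cookies)
    return cache_key, forwarded
```

In both models the cache key is identical; they differ only in whether _ga and csrftoken survive the trip to the origin.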

bookshelfdave commented 6 years ago

Should we consider Google's CDN as well if AWS won't cut it?

bookshelfdave commented 6 years ago

From top to bottom, here's how DNS would work:

   ┌────────────────────────────────┐   
   │  developer.mozilla.org  CNAME  │──┐
   └────────────────────────────────┘  │
                                       │
                                       │
   ┌────────────────────────────────┐  │
┌──│ Cloudfront Distribution        │◀─┘
│  └────────────────────────────────┘   
│                                       
│ Origin                                
│  ┌────────────────────────────────┐   
└─▶│ prod.mdn.moz.works  ALIAS      │──┐
   └────────────────────────────────┘  │
                                       │
                                       │
   ┌────────────────────────────────┐  │
   │ K8s ELB                        │◀─┘
   └────────────────────────────────┘

jwhitlock commented 6 years ago

@metadave put up a practice CDN at https://test-stage-cdn.mdn.moz.works/en-US/. Feel free to adjust settings and see what works.

When I curl -I https://test-stage-cdn.mdn.moz.works/en-US/ I see two cookies:

set-cookie: dwf_section_edit=False; expires=Sat, 10-Feb-2018 21:42:44 GMT; Max-Age=2592000; Path=/; secure
set-cookie: dwf_sg_task_completion=False; expires=Sat, 10-Feb-2018 21:42:44 GMT; Max-Age=2592000; Path=/; secure

dwf_section_edit may need a settings adjustment. I think we have it as a 50% experiment in stage, but there's no coin flip in production.

dwf_sg_task_completion is a 5% waffle flag. We'll need to change the implementation for a CDN (5% waffle sample, coin flip on the client side, etc.)
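As a rough illustration of the "coin flip on the client side" idea, here is a minimal sketch. In practice this would run as JavaScript in the browser; Python is used here only to show the logic. The function name, the cookie dictionary, and the injectable rng parameter are all hypothetical, and this is not how waffle itself is implemented.

```python
import random


def flag_active(cookies, name="sg_task_completion", percent=5.0, rng=random.random):
    """Decide a percentage flag on the client and remember the result.

    `cookies` is a mutable dict standing in for the browser's cookie jar.
    The server (and therefore the CDN) never needs to roll the dice.
    """
    cookie = f"dwf_{name}"
    if cookie in cookies:
        # A previous roll exists, so keep the user in the same bucket.
        return cookies[cookie] == "True"
    active = rng() * 100 < percent
    cookies[cookie] = "True" if active else "False"
    return active
```

Because the roll happens after the page is delivered, the cached response can be the same for every anonymous user.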

We probably don't have Google Analytics active on staging, but we could potentially enable it. I think the _ga cookie is generated client-side, and we don't use it server-side.

I think we set Cache-Control: no-cache or similar for logged-in users, so that anything with a session cookie isn't saved by the CDN, at least for round 1. In a similar way, pages that use CSRF tokens should probably not be cached.
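The rule above could be sketched roughly as follows. This is a sketch under assumptions, not the current Kuma code: the function name, the uses_csrf parameter, and the 300-second max-age for anonymous pages are placeholders I made up for illustration.

```python
def cache_control_for(request_cookies, uses_csrf=False):
    """Pick a Cache-Control value for a response.

    request_cookies: dict of cookie names to values sent by the client.
    uses_csrf: True if the rendered page embeds a CSRF token.
    """
    if "sessionid" in request_cookies or uses_csrf:
        # Logged-in or CSRF-bearing pages must not be stored by the CDN.
        return "no-cache"
    # Assumed placeholder lifetime for anonymous, CSRF-free pages.
    return "public, max-age=300"
```

For example, a request carrying only dwf_* cookies would get the cacheable value, while any request with sessionid would get no-cache.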

jwhitlock commented 6 years ago

Closing incomplete. Tracking is moving to Taiga User Story 3938.