google / ioweb2015

I/O Web App 2015
https://events.google.com/io2015/
Apache License 2.0

Leverage browser caching #640

Open x1ddos opened 9 years ago

x1ddos commented 9 years ago

@paullewis was suggesting that we should leverage browser caching and set cache expiration to something like 1 year instead of 10 min.

To do that we need cache busting. A couple of gulp plugins:

But then there might be some issues with Service Worker. @jeffposnick WDYT?

jeffposnick commented 9 years ago

I spent some time thinking about this prior to launch w.r.t. SW involvement. With the current sw-precache implementation, there's nothing stopping us from implementing cache-busting right now, but there are some things to consider:

We'd want to minimize parameter churn, which would mean using gulp-buster (which supports hash-based busting) rather than something that uses a changing timestamp each build.

Any time the parameter changes, any files which include a resource with a changed parameter will also get expired from the SW cache. In practice, this means our cached full-page HTML resources will all be expired if the concatenated JS file they depend on changes. Those HTML files are relatively small, so there's not much overhead, but it would be something to consider if our dependencies were slightly different.

I was thinking of adding logic to the SW that would ignore the cache-busting parameter when determining which resources to serve from the SW cache (relying instead on the file-based hash logic that the SW already uses), but that would only be necessary if we were using a timestamp that incremented each build. As long as we go with a hash-based parameter, I am pretty sure we'll be :+1: with the current version of sw-precache.
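
To make the difference between the two kinds of parameter concrete, here is a minimal sketch. It is written in Python rather than as part of the gulp build, the helper names and the 8-character digest length are assumptions, and it does not reproduce gulp-buster's actual output format.

```python
import hashlib


def hash_param(content: bytes, length: int = 8) -> str:
    """Return a short hex digest derived only from the file's bytes."""
    return hashlib.md5(content).hexdigest()[:length]


def bust_with_hash(url: str, content: bytes) -> str:
    """Append a content-derived parameter; unchanged files keep their URL."""
    return f"{url}?{hash_param(content)}"


def bust_with_timestamp(url: str, build_time: int) -> str:
    """Append the build time; every build yields a new URL for every file."""
    return f"{url}?{build_time}"


css = b"body { margin: 0; }"
# Two builds with identical CSS: the hash-based URL is stable...
build_1 = bust_with_hash("/styles/main.min.css", css)
build_2 = bust_with_hash("/styles/main.min.css", css)
# ...while the timestamp-based URL changes even though the bytes did not.
stamp_1 = bust_with_timestamp("/styles/main.min.css", 1426100000)
stamp_2 = bust_with_timestamp("/styles/main.min.css", 1426103600)
```

With a content-derived parameter, a URL only changes when the file behind it changes, which is what keeps parameter churn low.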

Let's test things a lot before deploying, obviously.

jakearchibald commented 9 years ago

After a quick glance, I agree with @jeffposnick that gulp-buster is the better of the two.

SW leans on the HTTP cache, so better HTTP caching usually results in a simpler ServiceWorker.

jeffposnick commented 9 years ago

Under the hood, sw-precache is appending its own cache-busting URL parameter, so I don't think we'll see as much benefit for users with SW-enabled browsers.

It's definitely something we should do to benefit browsers that don't have SW support.

I'm going to be chatting with @jakearchibald @slightlyoff (and @wibblymat, if he'd like) about various cache scenarios tomorrow, and while we'll probably go ahead with what's outlined in this Issue, I'd rather hold off on implementation until after that chat.

paullewis commented 9 years ago

Of course, I'd rather we did things right rather than hasty :+1:

jeffposnick commented 9 years ago

We should be good to go ahead with adding in cache busting into our base URLs and bumping up the lifetime of most of our caches.

The one exception that comes to mind is the resources under /experiment/, which are currently being built outside our main gulp process. We can either leave those as-is and keep the /experiment/ cache lifetime relatively short, ask Instrument to modify their build process, or I can look into modifying their build process after we wrap up the changes for the rest of the site.

I'm going to assign this to myself to track the changes to the gulp build, but @crhym3, it would be great if you could actually fiddle with the App Engine settings.

ebidel commented 9 years ago

We could also bump up the cache headers on everything under /experiment/. That won't be changing at all AFAIK. cc @tdreyno

tdreyno commented 9 years ago

Yup, blow out those experiment headers.

As an aside, the shorter your caches (or ease in busting all your caches), the happier you'll be during the run of the conference. Everybody everywhere is asking for changes all over the place 5 minutes ago. Even behind the 10 minute cache we had last year, it was a complete pain to fix bugs or swap assets.

Maybe this year will be simpler, but the live streams and timing them are some pretty complex code and tend to have a handful of bugs that avoid QA. Or Youtube streams fall apart and have to be replaced. Last year, we had a null reference exception 15 minutes before the keynote started which SUCKED to find/debug/fix behind the cache.

So, plan ahead and balance live site perf and the ability to actually fix bugs during the conference :)

ebidel commented 9 years ago

Good feedback @tdreyno! Closer to the conference, things will churn again. URL cache busting will definitely help if we have to do emergency pushes.

jeffposnick commented 9 years ago

I've picked this up again. Trying things out locally with gulp-cachebust, I've come across the following wrinkles:

Those considerations aside, I think we could go ahead with a scaled-back approach in which resources under styles/, images/, and elements/webgl-globe/ were renamed based on their MD5 hashes, and we extended the HTTP cache expiration for those paths. We'd stick with a shorter cache lifetime for the HTML and JS.

How does that sound to folks?
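
As a rough illustration of the scaled-back split, the sketch below picks a cache lifetime by path prefix. It is Python, not the App Engine configuration; the function name is hypothetical, and the one-year and ten-minute values are assumptions borrowed from the figures mentioned at the top of this issue.

```python
ONE_YEAR = 365 * 24 * 60 * 60  # 31536000 seconds
TEN_MINUTES = 10 * 60          # 600 seconds

# Paths whose files would be renamed by content hash, so they can be
# cached for a long time without risking stale content.
LONG_LIVED_PREFIXES = ("styles/", "images/", "elements/webgl-globe/")


def cache_control(path: str) -> str:
    """Return a Cache-Control header value for a site-relative path."""
    max_age = ONE_YEAR if path.startswith(LONG_LIVED_PREFIXES) else TEN_MINUTES
    return f"public, max-age={max_age}"
```

Under these assumptions, `cache_control("images/io15-color.png")` yields the long lifetime, while HTML and JS paths such as `scripts/site-scripts.js` keep the short one.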

ebidel commented 9 years ago

Have we thought about just appending the push timestamp onto urls that need to be busted?

/styles/main.min.css?234567890

That could be a gulp task that runs over the dist/ output and adds the param to URLs in the relevant files
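
A minimal sketch of what such a pass over the output might do, written in Python for illustration rather than as an actual gulp task. The regular expression, the list of extensions, and the function name are assumptions.

```python
import re

# Matches src="..." or href="..." values that end in a static-asset
# extension and do not already carry a query string or fragment.
ASSET_ATTR = re.compile(r'((?:src|href)=")([^"?#]+\.(?:css|js|png|jpg|svg))(")')


def add_bust_param(html: str, stamp: str) -> str:
    """Append ?<stamp> to every matched asset URL in the given markup."""
    return ASSET_ATTR.sub(
        lambda m: f"{m.group(1)}{m.group(2)}?{stamp}{m.group(3)}", html
    )


page = (
    '<link rel="stylesheet" href="/styles/main.min.css">'
    '<script src="/scripts/site-scripts.js"></script>'
)
busted = add_bust_param(page, "234567890")
```

Because the same stamp is applied everywhere, every matched URL changes on every push, whether or not the file behind it changed.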

jeffposnick commented 9 years ago

Two problems that come to mind with timestamp-based cache busting params:

ebidel commented 9 years ago

The URL busting on timestamp would be an unconditional update for everyone on each release. But I was forgetting about how SW caching plays into it all. Doesn't the URL used in the page not have to match the one stored by the SW cache? I think I'm missing something.

jeffposnick commented 9 years ago

A few more issues that I'm seeing when I try this out in a gulp serve:dist environment:

jeffposnick commented 9 years ago

I could have the SW ignore our cache-busting URL parameter if we decide to go that route.

But the main reason we're talking about long-lived browser caches + cache busting is to allow browsers without SW support not to re-download more than they need to. If we're unconditionally updating our cache busting URL parameter for every resource on each deployment, doesn't that defeat the purpose?
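
For the first option, having the SW ignore the parameter, the matching logic amounts to normalizing the URL before looking it up in the cache. The sketch below shows that normalization in Python rather than in Service Worker code; the parameter name `v` and the function name are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BUST_PARAM = "v"  # hypothetical name for the cache-busting parameter


def cache_key(url: str) -> str:
    """Drop the cache-busting parameter so lookups ignore it."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != BUST_PARAM
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With this, `/styles/main.min.css?v=234567890` and `/styles/main.min.css?v=234567999` map to the same key, while other query parameters are preserved.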

x1ddos commented 9 years ago

Using MD5-based file names for our JavaScript or Polymer HTML files is not going to work. Too many of our JS/element HTML files end up referring to other files, and those files' names are going to change based on their contents, with effects rippling out

I don't understand why we wouldn't just bust the cache of the final output, e.g. scripts/site-scripts.js? It wouldn't matter who references what.

x1ddos commented 9 years ago

To be precise, I'm talking about:

as in https://events.google.com/io2015/

jeffposnick commented 9 years ago

It might be okay for what gets currently generated in scripts/site-scripts.js, assuming none of the scripts reference another file, but that's just a happy accident. If, hypothetically, picasa.js was updated tomorrow to include a path to a fallback local image, then we'd see the issue.

The problem manifests itself in our current Vulcanized elements.html; it includes a reference (originating from https://github.com/GoogleChrome/ioweb2015/blob/master/app/elements/io-gallery.html#L54) to images/io15-color.png. We would have to make sure that the hash-based renaming of images/io15-color.png takes place first, then the replacement of that string within elements.html, and then the hash-based renaming of elements.html.

None of this is impossible, but it requires keeping track of which files include references to which resources and properly ordering the transformations. Maybe the transformations could happen in waves, starting with the images:

  1. All of the images get renamed based on their hashes
  2. The contents of the CSS files + all of the JS files get updated to include the new image names
  3. All of the CSS files + JS files get renamed based on their hashes
  4. The contents of the HTML files get updated to include the new image + JS + CSS names

I can try implementing that locally and see if/how badly things break.
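
The four waves above can be sketched roughly as follows. This is an in-memory Python illustration, not the gulp build: file contents are plain strings (including a stand-in for image bytes), the helper names are hypothetical, references are matched by naive substring replacement, and only the listed extensions are handled.

```python
import hashlib
import posixpath


def hashed_name(path: str, content: str) -> str:
    """Insert an 8-character MD5 prefix of the content before the extension."""
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()[:8]
    root, ext = posixpath.splitext(path)
    return f"{root}.{digest}{ext}"


def rewrite(content: str, renames: dict) -> str:
    """Replace every original path with its hashed counterpart."""
    for old, new in renames.items():
        content = content.replace(old, new)
    return content


def build_in_waves(files: dict) -> dict:
    renames, output = {}, {}
    # Waves 1-3: images first, then CSS + JS (rewritten before hashing).
    for extensions in ((".png", ".jpg"), (".css", ".js")):
        known = dict(renames)  # only names settled in earlier waves
        for path, content in files.items():
            if path.endswith(extensions):
                content = rewrite(content, known)
                renames[path] = hashed_name(path, content)
                output[renames[path]] = content
    # Wave 4: HTML is rewritten with all new names but keeps its own name.
    for path, content in files.items():
        if path.endswith(".html"):
            output[path] = rewrite(content, renames)
    return output


site = {
    "images/io15-color.png": "fake image bytes",
    "styles/main.min.css": "a { background: url(../images/io15-color.png) }",
    "index.html": '<link href="styles/main.min.css">',
}
built = build_in_waves(site)
```

Because each CSS or JS file is hashed only after its image references have been rewritten, a changed image produces a new image name, which changes the CSS contents and therefore the CSS name as well.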

x1ddos commented 9 years ago

Right. Good point, Jeff. I see where the concerns are. Another thing I remember the Angular guys did is just have a long list of all things to build, where order mattered.

jakearchibald commented 9 years ago

Isn't the answer to do all the file renaming first, build a map of the before/after, then update references in all files?

x1ddos commented 9 years ago

I guess both are the same thing. Building a map is essentially an implementation of Jeff's transformation in ordered steps. No?

jakearchibald commented 9 years ago

Oh, I thought the suggestion was to interleave renaming and replacing: "We would have to make sure that the hash-based renaming of images/io15-color.png takes place first, then the replacement of that string within elements.html, and then the hash-based renaming of elements.html."

jeffposnick commented 9 years ago

The complication is that the new file name includes a hash which depends on the file's contents. So you can't rename a file and then update the file's contents afterwards; the hash wouldn't reflect the contents, and we wouldn't be able to guarantee that modified files would result in a fresh network fetch.
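
A small Python sketch of the failure mode, under the assumption that the file name carries a truncated MD5 of the contents (the helper names and the 8-character length are hypothetical): if a file is named before its references are rewritten, its name stays the same even when the bytes that end up being served change.

```python
import hashlib


def short_md5(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()[:8]


def rename_then_rewrite(html: str, image: bytes):
    # Step 1: name the HTML file from its contents *before* any rewriting.
    html_name = f"elements.{short_md5(html.encode('utf-8'))}.html"
    # Step 2: patch in the hashed image name afterwards.
    image_name = f"images/io15-color.{short_md5(image)}.png"
    return html_name, html.replace("images/io15-color.png", image_name)


source = '<img src="images/io15-color.png">'
name_a, body_a = rename_then_rewrite(source, b"old pixels")
name_b, body_b = rename_then_rewrite(source, b"new pixels")
```

Here `name_a` equals `name_b` while `body_a` differs from `body_b`, so a browser holding a long-lived copy of the first build would have no reason to fetch the second.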

jakearchibald commented 9 years ago

Ohh I get it now. I guess the hash in the filename should be a hash of the file content plus all dependency content in some sort of deterministic order.
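
One way to read that suggestion, sketched in Python with hypothetical names: the hash for a file covers its own bytes plus the hashes of its dependencies, visited in sorted order. The dependency map is assumed to be known up front and free of cycles.

```python
import hashlib


def combined_hash(path: str, contents: dict, deps: dict) -> str:
    """Hash a file's bytes together with the hashes of its dependencies.

    Dependencies are visited in sorted order so the result does not depend
    on the order in which they were discovered. Assumes no cycles.
    """
    digest = hashlib.md5(contents[path])
    for dep in sorted(deps.get(path, ())):
        digest.update(combined_hash(dep, contents, deps).encode("ascii"))
    return digest.hexdigest()


contents = {
    "elements.html": b'<img src="images/io15-color.png">',
    "images/io15-color.png": b"old pixels",
}
deps = {"elements.html": ["images/io15-color.png"]}

before = combined_hash("elements.html", contents, deps)
contents["images/io15-color.png"] = b"new pixels"
after = combined_hash("elements.html", contents, deps)
```

Since these hashes are computed from the original, unrewritten contents, all new names can be worked out first and the references updated afterwards, and a change to any dependency still changes the name of everything that depends on it.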

stramel commented 8 years ago

Bumping for status update.

A concern that I had was:

  1. How using importHref with resolveUrl would work without knowing the exact filename.

Another library that I have used previously is gulp-rev.