Automattic / jetpack

Security, performance, marketing, and design tools — Jetpack is made by WordPress experts to make WP sites safer and faster, and help you grow your traffic.
https://jetpack.com/
Other
1.58k stars 797 forks source link

WP.com block editor: Revisit the date-based cache-busters [5] #12254

Open mmtr opened 5 years ago

mmtr commented 5 years ago

https://github.com/Automattic/jetpack/pull/12070#discussion_r277117371

In order to ensure that Jetpack users download the latest version of the WP.com block editor scripts, we use a cache-buster that changes every day (https://github.com/Automattic/jetpack/pull/12245).

This will make sure that any change we make to the scripts, will reach to the Jetpack users within 24 hours. This is fine for the initial launch period, if we discover more edge issues.

But once the code reaches a stable point people would be needlessly re-downloading the script.

Let's revisit this approach once we have some stability and switch to a hash-based approach that will avoid downloading the same script when the requested version bumps even though the source scripts have not.

noahtallen commented 3 years ago

circling back to this (lol)

This week a version of the script was deployed which contained a bug. It was only noticed after the cache rolled over to the next day. So we reverted the faulty code and deployed it. But then people were confused why it wasn't fixed, because it took another 24 hours for the fix to be available to all users because of the cache.

So yeah, we definitely need to change this to something which invalidates automatically when the file is updated. 🤔

We really need to be able to deploy a fix quickly, and to notice any errors happening quickly, so I think something like the hash-based approached would be way better.

I'm curious though, how does this query param help the browser (or CDN???) cache the file in any way?

noahtallen commented 3 years ago

cc @dmsnell @kwight how exactly would the hash-based approach work? Would the front end create a hash of the file and send that to the backend, which would also know the hash of the file?

dmsnell commented 3 years ago

@noahtallen it's been a while since I've been close to that but I don't believe I was speaking to a method that would have helped in this case. My concerns as I remember them had more to do with avoiding invalidating the cache every day when the code isn't changing - not with invalidating it more quickly when the code has.

In general we can think about strategies in cases like this: how major was this breakdown? how frequently do we expect to get into similar situations? how much of a detriment to the normal every-day deployment are we willing to suffer in order to prevent another breakdown of this size?

For example, one kind of brute-force approach is to post the hash of the current code during the build and deployment then have JavaScript request that hash/URL when the Jetpack app loads. That would ensure the most immediate code on every view at the obvious latency cost at load.

You might be able to get away with this too using a ServiceWorker to load the last-known hash whenever the page loads but then side-load the updates based on that newly-requested hash - it would be there the next time the page loads.

noahtallen commented 3 years ago

Thanks for the feedback as always!

how major was this breakdown

The breakdown was unfortunately major. It caused the block editor to stop working correctly for all Atomic users editing in Gutenframe. Many actions resulted in crashes or freezes in the editor. Publishing/updating posts was also not possible. :/

Though we were able to quickly find the culprit and revert the problematic code, we continued getting reports of the error for the next 24 hours because of cache issues.

I think the ultimate solution was that devops busted some caches in the CDN. So I'm wondering if we perhaps need to codify a way for us to quickly bust those caches when the file updates.

In my opinion, it makes sense for the cache to invalidate only when the file has been updated on the backend. The file is rarely updated these days (maybe once or twice a month) anyways. By the way, is the cache strategy in the backend, or on the Jetpack side of things?

If Jetpack itself is caching a copy of the file and not trying to get a new one for 24 hours, then I'm not sure what a solution would be. But if it's just the CDN caching the file (which appears to be the case), I wonder why we need to rely on any date-based stuff in Jetpack anyways. Couldn't Jetpack just ask the CDN for "xyz.js", and then the CDN handles any caching based on its own heuristics?

dmsnell commented 3 years ago

I can't fully remember where the caching takes place but I do seem to remember that some of it is in the PHP code, which is why this was a thing. I'd suggest trying a few proposals and compare their tradeoffs. The caching was setup at a time where code churn was high and not getting the code updates had a high risk of breaking the experience. Things are obviously different now.

kwight commented 3 years ago

Couldn't Jetpack just ask the CDN for "xyz.js", and then the CDN handles any caching based on its own heuristics?

I think the point of the caching is more to avoid the request in the first place: if a script is already cached in the browser, that's better than making a request, and getting back the same thing already stored locally (whether from a CDN or not). But you're right, without better invalidation in this now low-code-churn state, we have the risks of what you just described. 🙃

Please do ping @Automattic/serenity for review if you get something up, and we're happy to help work through it 👍

simison commented 3 years ago

@noahtallen would this fit in your work's focus on tooling?

noahtallen commented 3 years ago

Possibly, but not in the short term. I'm planning on looking at wpcom-block-editor a bit more in the coming weeks, but unsure how this will fit.