symfony / webpack-encore

A simple but powerful API for processing & compiling assets built around Webpack
https://symfony.com/doc/current/frontend.html
MIT License
2.23k stars 197 forks source link

Enhance assets versionning #1266

Open lyrixx opened 5 months ago

lyrixx commented 5 months ago

Hello, I just realized something about assets versioning.

Abstract

ATM, when we enable asset versioning, we have URL like this:

https://examples.com/assets/app.d233edc1.js

https://github.com/symfony/webpack-encore/blob/6b75c4dd963c613ddc19fa2748e0b6bcc4512c16/lib/config-generator.js#L206

And this is not optimal.

Instead we should have built URL like this

https://examples.com/assets/app.js?v=d233edc1

Why should we use query string instead of filename

Let's consider the following scenario

If we have used the query string, the file would have been in the release => no problem! 😎

You could say we could store assets in varnish, but caching assets is usually pointless. It's not faster than with nginx, and just consume more RAM for nothing.

How to workaround

ATM, we could workaround with the following configuration:

# webpack.config.js
Encore
  // we disable the default versioning to use query params, as when we deploy a new version, the old one is not available for sure
  // @see https://symfony.com/doc/current/frontend/encore/versioning.html#asset-versioning-and-deployment
  // .enableVersioning(Encore.isProduction())
  .configureFilenames({
    js: '[name].js?[chunkhash]',
    css: '[name].css?[contenthash:8]',
    assets: 'assets/[name][ext]?[hash:8]',
  })

How to fix that.

I think we could fix that easily:

Instead of custom code for webpackConfig.useVersioning, we could remove everything, and call configureFilenames() with a sane default config in WebpackConfig.js

WDYT?

stof commented 5 months ago

Using the query string is not bullet-proof either: if you have your origin servers behind load balancers and you restart the origin servers in a rolling fashion during deployment, you might have a page served by the first server being restarted asking to reference app.js?v=2 (which will be a cache miss in the CDN cache) but this request being processed by another origin server which still has the old app.js. In such case, you end up having a CDN cache for app.js?v=2 containing the code of the version 1 (and it won't ever be cache-busted to fix it as it corrupted the cache busting).

This is not a theoretical case. This exact case caused a production incident at Incenteev a few weeks ago, the first time our Heroku load balancer had 3 origin servers instead of 2 at a time we deployed a CSS change (with only 2 servers, the way Heroku works prevents this issue because the second server gets stopped as soon as the first server is restarted so we never have a chance to start processing a request with an old server after serving a page with the new one. But for 3+ servers, this has a chance to happen).

stof commented 5 months ago

btw, the cache version of Page A in your example might be broken if used with v2 of the JS code instead of v1 (as this JS code is different by definition of how the versioning is applied)

lyrixx commented 5 months ago

btw, the cache version of Page A in your example might be broken if used with v2 of the JS code instead of v1 (as this JS code is different by definition of how the versioning is applied)

Indeed! But I think it's better to have "a potential CSS/JS issue" than "no CSS/JS at all".

More over, @damienalexandre told my that it occurs also without varnish during deploy. (I let him share a screenshot next time the problem occurs)

stof commented 5 months ago

@lyrixx as I had a CDN, this is not a potential CSS/JS issue during deployment though. It is such an issue until the asset cache expires (and this can take 1 year if you use long-caching). And even without a CDN, it could still happen in the browser cache (which we cannot purge, unlike the Cloudflare cache).

Kocal commented 5 months ago

This is something we use at work for the purpose you describe, expect that we use ?v=[contenthash:8] (but we should give a try to [chunkhash]), and it works "well". But you must ensure that your CDN is nicely configure to not ignore the query string (e.g. with Cloudflare).

To me the best option here is to keep hash in file (app.d233edc1.js) and cache it for the maximum period.

stof commented 5 months ago

@Kocal the whole point is that we don't want to make Cloudflare ignore the query string. We want the cache to treat app.js?v=1 and app.js?v=2 as different entries (otherwise, cache busting would not work)

Kocal commented 5 months ago

Sorry I've forgot a word

jderusse commented 5 months ago

I deploy the last N versions of assets to avoid this issue.

The APP is able to serve assets for both users running the old DOM or new DOM.

GromNaN commented 5 months ago

The current mechanisms is better because it ensures you have the correct file matching your url.

I used to deploy all the assets on S3, keeping the files (and eventually removing old versions Afters months).

smnandre commented 5 months ago

The problem with the query string is that you cannot keep two versions of the same assets in parralel .. but this could be required (example on which i learned this the hard way: images linked in email)

But i think there may be some improvment in documentation / best practices regarding the asset conservation

lyrixx commented 4 months ago

Okay, there is no consensus here. So maybe it's only a documentation problem.

smnandre commented 4 months ago

I think there is, on the issue you raise : users with default symfony app / no CDN are probably deleting their previous assets on every deploy.

Maybe asset:deploy should have some "keep N previous versions" options (for importmap and for encore) instead of erasing all