embroider-build / ember-auto-import

Zero config import from npm packages

Chunks changing content when updating Webpack? #519

Open njoyard opened 2 years ago

njoyard commented 2 years ago

Hi,

This is not really a bug report, but rather a general question about how things work.

We have an app using ember-auto-import 2.4.1, and we recently had a production outage when deploying a webpack bump from 5.68.0 to 5.72.0. As far as we understand it at the time of writing, the root cause was probably some cache layer not properly serving updated assets after deployment.

What happened is, after the webpack update, one of the ember-auto-import chunks changed contents but kept the same name. The only changes in the chunk are reorderings of object entries, e.g.:

1298c1298
< n.d(t,{_I:function(){return l},yW:function(){return c}})
---
> n.d(t,{yW:function(){return c},_I:function(){return l}})

The lack of a filename change, combined with the very likely cache issue mentioned above and with our use of ember-cli-sri, meant that some users were served the updated index.html but the old chunk content (thus failing SRI).

We originally expected the filename would change, mistaking what is actually a chunkhash for a fingerprint. Tell me if I understood that part correctly, but it looks to me like the chunkhash is just a hash of what the chunk contains in terms of modules and versions, while the fingerprint would be a hash of the actual final content of the file.
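
To make the distinction concrete, here is a minimal Node.js sketch. It assumes, as the paragraph above supposes (this is not a confirmed description of webpack internals), that a chunkhash-like value depends only on which modules a chunk contains, while a fingerprint hashes the emitted bytes. The names `fingerprint` and `moduleSetHash` and the module ids are hypothetical.

```javascript
const crypto = require('crypto');

// Fingerprint: hash of the exact text that would be written to disk.
function fingerprint(content) {
  return crypto.createHash('sha256').update(content).digest('hex').slice(0, 20);
}

// Hypothetical stand-in for a chunkhash-like value: it only looks at the
// set of module identifiers, not at how the final code is printed.
function moduleSetHash(moduleIds) {
  const canonical = [...moduleIds].sort().join('\n');
  return crypto.createHash('sha256').update(canonical).digest('hex').slice(0, 20);
}

// The two variants of the line from the diff above.
const before = 'n.d(t,{_I:function(){return l},yW:function(){return c}})';
const after = 'n.d(t,{yW:function(){return c},_I:function(){return l}})';

// Same modules listed in a different order: the module-based hash is unchanged.
console.log(moduleSetHash(['./a.js', './b.js']) === moduleSetHash(['./b.js', './a.js']));

// Same exports printed in a different order: the fingerprint changes.
console.log(fingerprint(before) === fingerprint(after));
```

Under these assumptions, a filename built from the first kind of hash stays stable across the reordering, while only a fingerprint of the final content would have produced a new filename.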

We're currently assessing steps to take in order to avoid future occurrences of this issue (one of them being, of course, to understand all our cache layers and fix them...), and we would like to enable fingerprinting on those chunks if possible. Is there some way of achieving that by configuring ember-auto-import? We're considering using a different chunk filename template, switching to contenthash or fullhash for example, but are those properly computed from the final content of the minified file?

Thanks for your insight on this :)

ef4 commented 2 years ago

Yikes, that is certainly annoying behavior from webpack.

Our defaults already use chunkhash. As far as I know, fullhash is a single hash over the whole application rather than per chunk, so it's probably not what you want.

Maybe what we need is optimization.realContentHash. If you can reproduce your problem by redoing the webpack version change and then try again with that setting enabled, that might give us confidence that it would have prevented your production issue.

We could make it the default for production builds. Presumably webpack doesn't turn it on by default because it makes builds slower, but for production builds we should be emphasizing correctness over speed.

njoyard commented 2 years ago

Thanks, I'll try that and report back.

njoyard commented 2 years ago

I can reproduce the problem without issues (this is with the default ember-auto-import config).

$ git checkout before-webpack-update
$ rm -R dist && yarn && yarn build -prod
$ sha256sum dist/assets/chunk.441.*.js
b607f349edb20a30cceb1e0842755912c177ab2363b80ab9f24f638c752595b8  dist/assets/chunk.441.8eeb9d7576bd5b038345.js

$ git checkout webpack-update
$ rm -R dist && yarn && yarn build -prod
$ sha256sum dist/assets/chunk.441.*.js
457f2e5fd7c94a372391b545ea023aa83e8e1ba07da8754ac7229433cf7177d1  dist/assets/chunk.441.8eeb9d7576bd5b038345.js

Same chunkhash, different content. Running a diff shows a few changes in object entries ordering like the one in my original post.
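
The manual sha256sum comparison above could be automated along these lines. This is only a sketch: the function names and the idea of keeping the two builds in separate directories are assumptions for illustration, not part of the workflow described here.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// SHA-256 of a file's bytes, hex-encoded (same digest sha256sum prints).
function sha256File(file) {
  return crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');
}

// True only if the path exists and is a regular file.
function isFile(file) {
  return fs.existsSync(file) && fs.statSync(file).isFile();
}

// Returns the names of files that exist in both directories under the
// same name but with different contents, i.e. the problematic case.
function sameNameDifferentContent(oldDir, newDir) {
  return fs
    .readdirSync(newDir)
    .filter((name) => isFile(path.join(oldDir, name)) && isFile(path.join(newDir, name)))
    .filter((name) => sha256File(path.join(oldDir, name)) !== sha256File(path.join(newDir, name)))
    .sort();
}

// Example usage (hypothetical directory names, one copy of dist/assets per build):
//   console.log(sameNameDifferentContent('before/assets', 'after/assets'));
```

Run against the two builds above, such a check would be expected to list chunk.441.8eeb9d7576bd5b038345.js, since that name appears in both builds with different digests.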

And the problem indeed disappears when using the following config:

  autoImport: {
    webpack: {
      output: {
        filename: 'chunk.[id].[contenthash].js'
      },
      optimization: {
        realContentHash: true
      }
    }
  }
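
If the extra hashing cost is a concern outside of production, the same options could be gated on the build environment. The following is a sketch only: `autoImportOptions` is a hypothetical helper, and reading the environment from `process.env.EMBER_ENV` is an assumption about how the build is invoked.

```javascript
// Hypothetical helper for ember-cli-build.js: builds the `autoImport` options.
function autoImportOptions(environment) {
  const isProduction = environment === 'production';
  return {
    webpack: {
      output: {
        // Same template as above: name chunks after their content hash.
        filename: 'chunk.[id].[contenthash].js',
      },
      optimization: {
        // Recompute hashes from the final emitted assets in production only.
        realContentHash: isProduction,
      },
    },
  };
}

// In ember-cli-build.js this might be wired in as (assumption):
//   let app = new EmberApp(defaults, {
//     autoImport: autoImportOptions(process.env.EMBER_ENV),
//   });
```

The filename template is left identical in every environment so that only the hashing behavior differs between development and production builds.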

davidtaylorhq commented 2 years ago

  output: {
    filename: 'chunk.[id].[contenthash].js'
  },

I tried this out in #548 to resolve the opposite problem - I was seeing identical chunk content, but the chunkhash was changing each time. Using contenthash did indeed make the .js filenames deterministic, but it revealed a separate problem.

The same filename template and contenthash are used for sourcemaps. So if you do something which changes the .map without changing the .js file, you'll end up with identically named .map files that have different content.

I think the module-naming improvements being discussed in #478 and #479 might help to improve the consistency of the [chunkhash], which appears to change based on both the sourcemap and the output.