Closed timvandermeij closed 2 weeks ago
Before this patch, on the current master
branch, we have the following situation:
$ npx gulp publish
<snip>
$ mv build/ build1/
$ npx gulp publish
<snip>
$ mv build/ build2/
$ diff -r build1/ build2/
Binary files build1/pdfjs-4.4.56-dist.zip and build2/pdfjs-4.4.56-dist.zip differ
Binary files build1/pdfjs-4.4.56-legacy-dist.zip and build2/pdfjs-4.4.56-legacy-dist.zip differ
$ echo $?
1
$ sha256sum build1/pdfjs-4.4.56-dist.zip build2/pdfjs-4.4.56-dist.zip
f95c27b43c4c4c804b946f025e727ee4c5ac6b627b940817b052947f046d556b build1/pdfjs-4.4.56-dist.zip
2cceb023db8a0cc61c74e9c7ef115afcaf858330e7c1a58ecca6c1367914678b build2/pdfjs-4.4.56-dist.zip
$ sha256sum build1/pdfjs-4.4.56-legacy-dist.zip build2/pdfjs-4.4.56-legacy-dist.zip
31831a0e2dd2d9dec477a927976fc0f3c6b5eaa6a628bffc4cb6e88f0fb10f2c build1/pdfjs-4.4.56-legacy-dist.zip
90b62ed45e29c3f2f73e2e8dd89b676a8d742e239306401feb531cc6de7e49bd build2/pdfjs-4.4.56-legacy-dist.zip
I have triggered two builds from the same source code, moved the output into separate folders, computed the SHA256 hash of the ZIP files and generated the diff. Note that the SHA256 hashes are different, showing that the ZIP files are not reproducible.
Below, I have repeated this process with this patch applied. Note that the SHA256 hashes are now equal and the diff is empty:
$ npx gulp publish
<snip>
$ mv build/ build1/
$ npx gulp publish
<snip>
$ mv build/ build2/
$ diff -r build1/ build2/
$ echo $?
0
$ sha256sum build1/pdfjs-4.4.57-dist.zip build2/pdfjs-4.4.57-dist.zip
5eedfd3b522b6e7b0e10d1a0e7b04bf7e2faf93b48dbf50e0b8ab24b20fe66a1 build1/pdfjs-4.4.57-dist.zip
5eedfd3b522b6e7b0e10d1a0e7b04bf7e2faf93b48dbf50e0b8ab24b20fe66a1 build2/pdfjs-4.4.57-dist.zip
$ sha256sum build1/pdfjs-4.4.57-legacy-dist.zip build2/pdfjs-4.4.57-legacy-dist.zip
ee3496e7d63bfca5dc045341e74e4220da92b30b8e03fdf97256d305db20cd14 build1/pdfjs-4.4.57-legacy-dist.zip
ee3496e7d63bfca5dc045341e74e4220da92b30b8e03fdf97256d305db20cd14 build2/pdfjs-4.4.57-legacy-dist.zip
The release builds are currently not reproducible because ZIP files record the modification date of files generated during the build process, meaning that two builds from identical source code, made at different times, result in different output.
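To illustrate the effect outside of the PDF.js build, here is a minimal sketch using Python's standard `zipfile` module (not the Gulp tooling used in this repository). The entry name `build/pdf.mjs`, the content and the two timestamps are made up for the example; the only point is that identical content with a different recorded modification time yields a different archive hash.

```python
import hashlib
import io
import zipfile


def zip_bytes(date_time):
    """Build a one-file ZIP in memory whose entry carries the given timestamp."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical entry name; the content is the same for every call.
        info = zipfile.ZipInfo("build/pdf.mjs", date_time=date_time)
        info.compress_type = zipfile.ZIP_DEFLATED
        archive.writestr(info, b"identical content")
    return buffer.getvalue()


# Same content, "built" five minutes apart.
first = zip_bytes((2024, 6, 1, 12, 0, 0))
second = zip_bytes((2024, 6, 1, 12, 5, 0))

digest_first = hashlib.sha256(first).hexdigest()
digest_second = hashlib.sha256(second).hexdigest()
print(digest_first != digest_second)
```

The archives differ only in the date/time fields of the ZIP headers, which is enough to change the hash of the whole file.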
This is undesirable because it makes detecting differences in the output harder, as happened recently during the Gulp 5 efforts: the modification date differences are irrelevant and can obscure important differences in the output caused by e.g. code changes. Moreover, reproducibility of build artifacts has become increasingly important; please refer to the Reproducible Builds initiative at https://reproducible-builds.org (note the "Why does it matter?" section specifically) and https://reproducible-builds.org/docs/timestamps, which further explains the problem of timestamps in build artifacts.
This commit fixes the issue by configuring the ZIP file creation to use the (fixed) date of the last Git commit for which the release is being made. With this change the build is fully reproducible: builds from identical source code result in bit-by-bit identical output artifacts.
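The idea can be sketched in a few lines of Python with the standard `zipfile` module. This is an illustration under stated assumptions, not the code in this patch (which lives in the Gulp build): the helper names `last_commit_epoch` and `reproducible_zip`, the file names and the hard-coded epoch value are all hypothetical. The sketch stamps every entry with one fixed date and also sorts the entries, so that two runs over the same input give the same bytes.

```python
import hashlib
import io
import subprocess
import time
import zipfile


def last_commit_epoch(repo_dir="."):
    """Return the committer date of HEAD as Unix seconds (needs a Git checkout)."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%ct"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def reproducible_zip(files, epoch):
    """Zip a {name: bytes} mapping with every entry stamped with the same date."""
    stamp = time.gmtime(epoch)[:6]  # (year, month, day, hour, minute, second), UTC
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # fixed entry order
            info = zipfile.ZipInfo(name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            archive.writestr(info, files[name])
    return buffer.getvalue()


# Hypothetical input; in a real checkout, epoch = last_commit_epoch() instead.
files = {"web/viewer.html": b"<html></html>", "build/pdf.mjs": b"export {};"}
epoch = 1717200000

build1 = reproducible_zip(files, epoch)
build2 = reproducible_zip(dict(reversed(list(files.items()))), epoch)
print(hashlib.sha256(build1).hexdigest() == hashlib.sha256(build2).hexdigest())
```

Because the timestamp comes from the commit rather than from the wall clock, rebuilding the same commit later gives the same archive, while a new commit naturally gets a new date.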
To improve readability we convert the compression method to take a parameter object and use template strings where useful.