mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Implement reproducibility for the JSDoc builds #18256

Closed timvandermeij closed 2 weeks ago

timvandermeij commented 2 weeks ago

The JSDoc builds are currently not reproducible because a timestamp is included in the output, meaning that two builds from identical source code, made at different times, result in different output.
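The effect can be illustrated with a toy example. This is only a sketch: `renderFooter` is a hypothetical stand-in for a documentation footer template, not JSDoc's actual template code, and the two dates are chosen to mimic two builds made eleven seconds apart.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical footer renderer (NOT JSDoc's real template code).
function renderFooter(version, buildDate, includeDate) {
  const base = `Documentation generated by JSDoc ${version}`;
  return includeDate ? `${base} on ${buildDate.toString()}` : base;
}

// SHA-256 digest of a string, as a hex string.
function sha256(text) {
  return createHash("sha256").update(text).digest("hex");
}

// Two "builds" of identical source code, made at different times.
const first = new Date("2024-06-16T12:51:15Z");
const second = new Date("2024-06-16T12:51:26Z");

const withDate = [first, second].map(d =>
  sha256(renderFooter("4.0.3", d, true))
);
const withoutDate = [first, second].map(d =>
  sha256(renderFooter("4.0.3", d, false))
);

console.log(withDate[0] === withDate[1]); // false: the timestamp changes the hash
console.log(withoutDate[0] === withoutDate[1]); // true: output is identical
```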

This is undesirable because it makes diffing the output difficult, as happened recently during the Gulp 5 efforts: the timestamp differences are irrelevant and can obscure important differences in the output caused by e.g. code changes. Moreover, reproducibility of build artifacts has become increasingly important; please refer to the Reproducible Builds initiative at https://reproducible-builds.org (note the "Why does it matter?" section specifically) and https://reproducible-builds.org/docs/timestamps, which further explains the problem of timestamps in build artifacts.

This commit fixes the issue by configuring JSDoc to not include the timestamp in the output. The timestamp is not relevant for end users, and without it the build is fully reproducible: builds from identical source code result in bit-by-bit identical output artifacts.

Note that this option sadly can only be set via a configuration file, and not via command-line parameters like the ones we used until now, so for consistency we also move the other options into the configuration file. That way they are all in one place and the Gulpfile becomes a bit simpler.
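As a rough sketch of what such a configuration could look like: JSDoc's default template reads a `templates.default.includeDate` setting, and setting it to `false` omits the date from the generated pages. The `source.include` and `opts.destination` values below are assumptions inferred from the `build/jsdoc/api.js.html` paths in the output, not copied from the actual patch. The object is written in JavaScript and serialized to JSON, which could then be saved as a configuration file and passed to JSDoc with its `-c` option.

```javascript
// Sketch of a JSDoc configuration; paths are assumptions, not the real patch.
const jsdocConfig = {
  source: {
    // Assumed input file for the API documentation.
    include: ["src/display/api.js"],
  },
  opts: {
    // Assumed output directory, formerly passed on the command line.
    destination: "build/jsdoc",
  },
  templates: {
    default: {
      // Omit the generation date so repeated builds are identical.
      includeDate: false,
    },
  },
};

// Serialize to the JSON form that a JSDoc configuration file uses.
const jsdocConfigJson = JSON.stringify(jsdocConfig, null, 2);
console.log(jsdocConfigJson);
```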

timvandermeij commented 2 weeks ago

Before this patch, on the current master branch, we have the following situation:

$ npx gulp jsdoc
[14:51:14] Using gulpfile ~/Documenten/Ontwikkeling/pdf.js/Code/gulpfile.mjs
[14:51:14] Starting 'jsdoc'...

### Generating documentation (JSDoc)
[14:51:15] Finished 'jsdoc' after 751 ms
$ mv build/ build1/
$ npx gulp jsdoc
[14:51:25] Using gulpfile ~/Documenten/Ontwikkeling/pdf.js/Code/gulpfile.mjs
[14:51:25] Starting 'jsdoc'...

### Generating documentation (JSDoc)
[14:51:26] Finished 'jsdoc' after 811 ms
$ mv build/ build2/
$ sha256sum <(find build1 -type f -exec sha256sum {} \; | sort | cut -d' ' -f1)
077e7540143194269fc78e2a079fce54a6e6fe8e815abeceadba87080842c971  /dev/fd/63
$ sha256sum <(find build2 -type f -exec sha256sum {} \; | sort | cut -d' ' -f1)
e6d3e0f46330e8894ee17a05e7c6a9ea059c1d952baf87f03df1cf7267cab9cd  /dev/fd/63
$ diff -r build1/ build2/
diff '--color=auto' -r build1/jsdoc/api.js.html build2/jsdoc/api.js.html
3492c3492
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/index.html build2/jsdoc/index.html
59c59
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib.html build2/jsdoc/module-pdfjsLib.html
5366c5366
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFDataRangeTransport.html build2/jsdoc/module-pdfjsLib-PDFDataRangeTransport.html
1496c1496
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFDocumentLoadingTask.html build2/jsdoc/module-pdfjsLib-PDFDocumentLoadingTask.html
650c650
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFDocumentProxy.html build2/jsdoc/module-pdfjsLib-PDFDocumentProxy.html
3764c3764
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFObjects.html build2/jsdoc/module-pdfjsLib-PDFObjects.html
884c884
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFPageProxy.html build2/jsdoc/module-pdfjsLib-PDFPageProxy.html
2470c2470
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-PDFWorker.html build2/jsdoc/module-pdfjsLib-PDFWorker.html
734c734
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
diff '--color=auto' -r build1/jsdoc/module-pdfjsLib-RenderTask.html build2/jsdoc/module-pdfjsLib-RenderTask.html
550c550
<     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:15 GMT+0200 (Midden-Europese zomertijd)
---
>     Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.3</a> on Sun Jun 16 2024 14:51:26 GMT+0200 (Midden-Europese zomertijd)
$ echo $?
1

I have triggered two builds from the same source code, moved the output into separate folders, computed the combined SHA256 hash of all files and generated the recursive diff. Note that the SHA256 hashes are different and the diff includes only timestamp changes.
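The same kind of check can be sketched in Node.js. This is a variant of the shell commands above, not an exact equivalent: it keys each SHA-256 digest on the file's relative path, so it reports which files differ instead of producing a single combined hash. The function names are made up for illustration.

```javascript
const { createHash } = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

// Recursively collect all regular files under `dir`, relative to `dir`.
function listFiles(dir, prefix = "") {
  const files = [];
  const entries = fs.readdirSync(path.join(dir, prefix), {
    withFileTypes: true,
  });
  for (const entry of entries) {
    const rel = path.join(prefix, entry.name);
    if (entry.isDirectory()) {
      files.push(...listFiles(dir, rel));
    } else if (entry.isFile()) {
      files.push(rel);
    }
  }
  return files.sort();
}

// Map each relative path to the SHA-256 digest of the file's contents.
function hashTree(dir) {
  const digests = new Map();
  for (const rel of listFiles(dir)) {
    const data = fs.readFileSync(path.join(dir, rel));
    digests.set(rel, createHash("sha256").update(data).digest("hex"));
  }
  return digests;
}

// Return the relative paths that are missing on one side or differ in content.
function differingFiles(dirA, dirB) {
  const a = hashTree(dirA);
  const b = hashTree(dirB);
  const all = new Set([...a.keys(), ...b.keys()]);
  return [...all].filter(rel => a.get(rel) !== b.get(rel)).sort();
}
```

Calling `differingFiles("build1", "build2")` after two builds would return an empty array when the output is reproducible, and the list of differing files otherwise.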

Below I have repeated this process with this patch applied. Note that the SHA256 hashes are now equal and the diff is empty:

$ npx gulp jsdoc
[14:52:33] Using gulpfile ~/Documenten/Ontwikkeling/pdf.js/Code/gulpfile.mjs
[14:52:33] Starting 'jsdoc'...

### Generating documentation (JSDoc)
[14:52:33] Finished 'jsdoc' after 740 ms
$ mv build/ build1/
$ npx gulp jsdoc
[14:52:40] Using gulpfile ~/Documenten/Ontwikkeling/pdf.js/Code/gulpfile.mjs
[14:52:40] Starting 'jsdoc'...

### Generating documentation (JSDoc)
[14:52:41] Finished 'jsdoc' after 779 ms
$ mv build/ build2/
$ sha256sum <(find build1 -type f -exec sha256sum {} \; | sort | cut -d' ' -f1)
aa8ab79a93b00e19dcd1ad17b6a00e562aa57fd285d845a8aa3fef0b625cf10a  /dev/fd/63
$ sha256sum <(find build2 -type f -exec sha256sum {} \; | sort | cut -d' ' -f1)
aa8ab79a93b00e19dcd1ad17b6a00e562aa57fd285d845a8aa3fef0b625cf10a  /dev/fd/63
$ diff -r build1/ build2/
$ echo $?
0

timvandermeij commented 2 weeks ago

/botio-linux preview

moz-tools-bot commented 2 weeks ago

From: Bot.io (Linux m4)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://54.241.84.105:8877/7c1af6f3258bc9f/output.txt

moz-tools-bot commented 2 weeks ago

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/7c1af6f3258bc9f/output.txt

Total script time: 0.98 mins

Published

timvandermeij commented 2 weeks ago

Note that the only observable difference in the API documentation is that http://54.241.84.105:8877/7c1af6f3258bc9f/api now shows only "Documentation generated by JSDoc 4.0.3", whereas the current version at https://mozilla.github.io/pdf.js/api shows "Documentation generated by JSDoc 4.0.3 on Sun Jun 16 2024 10:18:35 GMT+0000 (Coordinated Universal Time)".