fabacab / git-archive-all.sh

A bash shell script wrapper for git-archive that archives a git superproject and its submodules, if it has any.

Archives created with --format tar.gz and --format zip aren't reproducible #31

Open tjanez opened 8 years ago

tjanez commented 8 years ago

I've created a simple Bash script reproducibility-test.sh to test the reproducibility of various --format options. I've also compared it to git's built-in archive command.

Here are sample results for a git repo with submodules:

[tadej@tlinux64 genesis]$ ./reproducibility-test.sh 
  adding: genjs.zip (stored 0%)
  adding: genjs.zip (stored 0%)
  adding: genjs.zip (stored 0%)
/tmp/reproducibility_test.WagXHM ~/Genialis/genesis
88baafe24913ccd6ec0945b1b3f4566b  git-archive-all-tar_only-1.tar
88baafe24913ccd6ec0945b1b3f4566b  git-archive-all-tar_only-2.tar
88baafe24913ccd6ec0945b1b3f4566b  git-archive-all-tar_only-3.tar
09f08547cfc2e70bfbae9d570583246e  git-archive-all-tar_with_gzip-1.tar.gz
71bdb10ddf853300f5a98bfe8eaa3b96  git-archive-all-tar_with_gzip-2.tar.gz
3e0d95d0f2dcc4385d5d6deec58b7ef4  git-archive-all-tar_with_gzip-3.tar.gz
06181cf449d7137817d84c3826e64b0b  git-archive-all-zip-1.zip
8681020a6fde7e750419b6e821ad2d9b  git-archive-all-zip-2.zip
c4b9933c33c0e9c4e7bac35cf23f7294  git-archive-all-zip-3.zip
97bafb747b169297f1ebf488dcc9ae5c  git-archive-tar_only-1.tar
97bafb747b169297f1ebf488dcc9ae5c  git-archive-tar_only-2.tar
97bafb747b169297f1ebf488dcc9ae5c  git-archive-tar_only-3.tar
d2c5692100208019da62f26ddb1719a0  git-archive-tar_with_gzip-1.tar.gz
d2c5692100208019da62f26ddb1719a0  git-archive-tar_with_gzip-2.tar.gz
d2c5692100208019da62f26ddb1719a0  git-archive-tar_with_gzip-3.tar.gz
7a1cb6286609909a50b162105ef13eee  git-archive-zip-1.zip
7a1cb6286609909a50b162105ef13eee  git-archive-zip-2.zip
7a1cb6286609909a50b162105ef13eee  git-archive-zip-3.zip
~/Genialis/genesis
[tadej@tlinux64 genesis]$

As can be seen from the results, git's built-in archive command always creates reproducible archives, regardless of the --format option.

In contrast, git-archive-all.sh creates reproducible archives only with the --format tar option.
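For readers who want to run a similar check, the comparison can be sketched roughly as below. This is not the reproducibility-test.sh mentioned above (that script is not shown in this thread); the function name check_reproducible is hypothetical, and cksum stands in for md5sum only because it is available on more systems.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: run a command several times and report whether
# every run wrote byte-identical data to stdout.
# Usage: check_reproducible RUNS COMMAND [ARGS...]
check_reproducible() {
    local runs=$1
    shift
    local first="" current i
    for ((i = 1; i <= runs; i++)); do
        # Checksum the command's stdout for this run.
        current=$("$@" | cksum)
        if [ -z "$first" ]; then
            first=$current
        elif [ "$current" != "$first" ]; then
            echo "not reproducible"
            return 1
        fi
    done
    echo "reproducible"
}

# Example invocation inside a git repository (not executed here):
#   check_reproducible 3 git archive --format=tar.gz HEAD
```

The helper only compares stdout, so a tool that writes its archive to a file would need to be wrapped so that the file's contents end up on stdout.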

fabacab commented 8 years ago

Huh. What happens if you add the -n (--no-name) switch to the various calls to gzip? Per the gzip manual page:

-n, --no-name This option stops the filename and timestamp from being stored in the output file.

Could the timestamp added to the gzip file header be causing the discrepancy?
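A small experiment along these lines, as a sketch only: it compresses the same bytes twice with gzip -n while changing the input file's modification time in between. The file name payload.tar and the variable names are made up for illustration, and the file is a stand-in for a tarball, not a real one.

```shell
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
payload="$workdir/payload.tar"   # hypothetical stand-in for a tarball
printf 'same bytes every time\n' > "$payload"

# Compress once while the file carries an old modification time ...
touch -t 200001010000 "$payload"
sum_old=$(gzip -n -c "$payload" | cksum)

# ... and again after moving the modification time forward.
touch -t 201001010000 "$payload"
sum_new=$(gzip -n -c "$payload" | cksum)

# With -n, neither the name nor the timestamp goes into the header,
# so the two checksums are expected to match.
echo "with -n, old mtime: $sum_old"
echo "with -n, new mtime: $sum_new"

rm -rf "$workdir"
```

Dropping -n from both gzip calls is the natural counter-experiment: the header would then carry the file's timestamp, so the checksums would be expected to differ.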

Mic92 commented 4 years ago

You can export GZIP=-n to avoid this behavior in gzip. See https://github.com/NixOS/nixpkgs/pull/86493/files#diff-f98fdb70c813cc44cd5600f4a059ee25R36 for an example.

tjanez commented 4 years ago

@Mic92, thanks for the tip. I can confirm this works:

[tadej@toronto temp-repo ]$ bash reproducibility-test.sh 
  adding: deps.nanos-secure-sdk.zip (stored 0%)
  adding: deps.nanos-secure-sdk.zip (stored 0%)
  adding: deps.nanos-secure-sdk.zip (stored 0%)
/tmp/reproducibility_test.KCTWSH ~/Oasis/Code/temp-repo
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-1.tar
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-2.tar
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-3.tar
9e8ad93c340e461bac1221b56ab2f5fc  git-archive-all-tar_with_gzip-1.tar.gz
3871d7b6d7892c11cb4457ea3c6e92f6  git-archive-all-tar_with_gzip-2.tar.gz
f6f64056785b0992b4915280759cabc3  git-archive-all-tar_with_gzip-3.tar.gz
45ff56c8dc74bf87e4f40d33259df99d  git-archive-all-zip-1.zip
23e5dbf69ae9c82f781ec99057c24e12  git-archive-all-zip-2.zip
b3c4b14a9fc6c20ec2791fae61101acf  git-archive-all-zip-3.zip
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-1.tar
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-2.tar
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-3.tar
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-1.tar.gz
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-2.tar.gz
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-3.tar.gz
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-1.zip
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-2.zip
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-3.zip
~/Oasis/Code/temp-repo
[tadej@toronto temp-repo ]$ export GZIP=-n
[tadej@toronto temp-repo ]$ bash reproducibility-test.sh 
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
  adding: deps.nanos-secure-sdk.zip (stored 0%)
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
  adding: deps.nanos-secure-sdk.zip (stored 0%)
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
gzip: warning: GZIP environment variable is deprecated; use an alias or script
  adding: deps.nanos-secure-sdk.zip (stored 0%)
gzip: warning: GZIP environment variable is deprecated; use an alias or script
/tmp/reproducibility_test.Ag3Tyw ~/Oasis/Code/temp-repo
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-1.tar
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-2.tar
34427bca560a05c29d2e995f61ca6135  git-archive-all-tar_only-3.tar
6715c58bdb8155b0b9e4290e4d47a56b  git-archive-all-tar_with_gzip-1.tar.gz
6715c58bdb8155b0b9e4290e4d47a56b  git-archive-all-tar_with_gzip-2.tar.gz
6715c58bdb8155b0b9e4290e4d47a56b  git-archive-all-tar_with_gzip-3.tar.gz
c21a2a6e3caa4fcbc41e5fc33e3a3a4e  git-archive-all-zip-1.zip
8f6f6b4b78f5b84344d68c20d17d5c91  git-archive-all-zip-2.zip
69972e9349acf85479becd4e805e4549  git-archive-all-zip-3.zip
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-1.tar
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-2.tar
0ef4923f0362ce95a8488db78ccc9b39  git-archive-tar_only-3.tar
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-1.tar.gz
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-2.tar.gz
e6b1f0ccb3b5748b092be95f6243dd2d  git-archive-tar_with_gzip-3.tar.gz
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-1.zip
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-2.zip
0f71dceb03f97eab4cf1d347f5243030  git-archive-zip-3.zip
~/Oasis/Code/temp-repo

So adding the -n switch to the gzip calls would make the gzipped tarballs reproducible. The zip archives produced by git-archive-all.sh still differ between runs.
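Since gzip warns that the GZIP environment variable is deprecated and suggests "an alias or script", one possible workaround until the script itself passes -n is a small wrapper placed ahead of the real gzip on PATH. The following is a sketch under that assumption; the shim directory and variable names are hypothetical, and it has not been tried with git-archive-all.sh itself.

```shell
#!/usr/bin/env bash
set -eu

# Resolve the real gzip before shadowing it.
real_gzip=$(command -v gzip)
shim_dir=$(mktemp -d)            # hypothetical location for the wrapper

# Wrapper script: forward every call to the real gzip with -n prepended.
cat > "$shim_dir/gzip" <<EOF
#!/bin/sh
exec "$real_gzip" -n "\$@"
EOF
chmod +x "$shim_dir/gzip"

old_path=$PATH
PATH="$shim_dir:$PATH"
hash -r                          # forget any cached location of gzip

# Same content, two different modification times, no explicit -n here.
sample="$shim_dir/sample.txt"
printf 'identical content\n' > "$sample"
touch -t 200001010000 "$sample"
shim_sum_a=$(gzip -c "$sample" | cksum)
touch -t 201001010000 "$sample"
shim_sum_b=$(gzip -c "$sample" | cksum)

echo "first run:  $shim_sum_a"
echo "second run: $shim_sum_b"

# Restore the original PATH and clean up.
PATH=$old_path
hash -r
rm -rf "$shim_dir"
```

Any program started with the modified PATH that calls plain gzip would pick up the wrapper, which avoids both the environment variable and its deprecation warning.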

LecrisUT commented 1 year ago

I think this should just be added in by default. In the script