NixOS / nixpkgs


python3: `ftplib` not reproducible #345329

Open raboof opened 3 days ago

raboof commented 3 days ago

Building this package multiple times does not yield bit-by-bit identical results, complicating the detection of Continuous Integration (CI) breaches. For more information on this issue, visit reproducible-builds.org.

Fixing bit-by-bit reproducibility also has additional advantages, such as avoiding hard-to-reproduce bugs, making content-addressed storage more effective and reducing rebuilds in such systems.

Steps To Reproduce

1. Build the package

This step will build the package. Specific arguments are passed to the command to keep the build artifacts so we can compare them in case of differences.

Execute the following command:

nix-build '<nixpkgs>' -A python3 && nix-build '<nixpkgs>' -A python3 --check --keep-failed

Or using the new command line style:

nix build nixpkgs#python3 && nix build nixpkgs#python3 --rebuild --keep-failed

2. Compare the build artifacts

If the previous command completes successfully, no differences were found and the build is reproducible; there is nothing more to do. If it terminates with the error message error: derivation '<X>' may not be deterministic: output '<Y>' differs from '<Z>', use diffoscope to investigate the discrepancies between the two build outputs. You may need to add the --exclude-directory-metadata recursive option to ignore differences in file and directory metadata (e.g. timestamps).

nix run nixpkgs#diffoscopeMinimal -- --exclude-directory-metadata recursive <Y> <Z>
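
As a quick first pass before reading a full diffoscope report, it can help to list only the files whose contents differ between the two outputs. The following is a minimal sketch, not a replacement for diffoscope: the function names are hypothetical, it hashes regular files only, and it deliberately ignores metadata such as timestamps and directory sizes.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file path (relative to root) to a content fingerprint."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full):
                # Compare symlinks by target instead of following them.
                digests[rel] = "link:" + os.readlink(full)
            else:
                digests[rel] = file_digest(full)
    # Assumption: symlinks to directories are not examined in this sketch.
    return digests


def differing_files(root_a, root_b):
    """Return sorted relative paths that are missing or differ in content."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))
```

Calling `differing_files("<Y>", "<Z>")` with the two store paths from the error message should then return only the paths worth inspecting further with diffoscope.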

3. Examine the build log

To examine the build log, use:

nix-store --read-log $(nix-instantiate '<nixpkgs>' -A python3)

Or with the new command line style:

nix log $(nix path-info --derivation nixpkgs#python3)

Additional context

/nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check
--- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5
+++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check
│   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib
├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib
│ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12
│ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12
│ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/__pycache__
│ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/__pycache__
│ │ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/__pycache__/ftplib.cpython-312.opt-1.pyc
│ │ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/__pycache__/ftplib.cpython-312.opt-1.pyc
│ │ │ │ ├── Python bytecode
│ │ │ │ │ @@ -1,4 +1,4 @@
│ │ │ │ │  magic:    0xcb0d0d0a
│ │ │ │ │  moddate:  0xbfe7f8d8 (Tue May  8 20:08:31 2085 UTC)
│ │ │ │ │  files sz: 218327008
│ │ │ │ │ +code:     starts at offset 16 (size: 42664 bytes)
│ │ │ │ │ -code:     starts at offset 16 (size: 42649 bytes)
│ │ │ │ ├── stat {}
│ │ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │ │  
│ │ │ │ │ +  Size: 42680        Blocks: 88         IO Block: 4096   regular file
│ │ │ │ │ -  Size: 42665        Blocks: 88         IO Block: 4096   regular file
│ │ │ │ │  Device: 254,0    Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │ │  
│ │ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/__pycache__/ftplib.cpython-312.opt-2.pyc
│ │ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/__pycache__/ftplib.cpython-312.opt-2.pyc
│ │ │ │ ├── Python bytecode
│ │ │ │ │ @@ -1,4 +1,4 @@
│ │ │ │ │  magic:    0xcb0d0d0a
│ │ │ │ │  moddate:  0xbfe7f8d8 (Tue May  8 20:08:31 2085 UTC)
│ │ │ │ │  files sz: 218327008
│ │ │ │ │ +code:     starts at offset 16 (size: 32530 bytes)
│ │ │ │ │ -code:     starts at offset 16 (size: 32515 bytes)
│ │ │ │ ├── stat {}
│ │ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │ │  
│ │ │ │ │ +  Size: 32546        Blocks: 64         IO Block: 4096   regular file
│ │ │ │ │ -  Size: 32531        Blocks: 64         IO Block: 4096   regular file
│ │ │ │ │  Device: 254,0    Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │ │  
│ │ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/__pycache__/ftplib.cpython-312.pyc
│ │ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/__pycache__/ftplib.cpython-312.pyc
│ │ │ │ ├── Python bytecode
│ │ │ │ │ @@ -1,4 +1,4 @@
│ │ │ │ │  magic:    0xcb0d0d0a
│ │ │ │ │  moddate:  0xbfe7f8d8 (Tue May  8 20:08:31 2085 UTC)
│ │ │ │ │  files sz: 218327008
│ │ │ │ │ +code:     starts at offset 16 (size: 42664 bytes)
│ │ │ │ │ -code:     starts at offset 16 (size: 42649 bytes)
│ │ │ │ ├── stat {}
│ │ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │ │  
│ │ │ │ │ +  Size: 42680        Blocks: 88         IO Block: 4096   regular file
│ │ │ │ │ -  Size: 42665        Blocks: 88         IO Block: 4096   regular file
│ │ │ │ │  Device: 254,0    Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │ │  
│ │ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/idlelib
│ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/idlelib
│ │ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/idlelib/idle_test
│ │ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/idlelib/idle_test
│ │ │ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/idlelib/idle_test/__pycache__
│ │ │ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/idlelib/idle_test/__pycache__
│ │ │ │ │ ├── stat {}
│ │ │ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │ │ │  
│ │ │ │ │ │ +  Size: 20480      Blocks: 40         IO Block: 4096   directory
│ │ │ │ │ │ -  Size: 16384      Blocks: 32         IO Block: 4096   directory
│ │ │ │ │ │  Device: 254,0  Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │ │ │  
│ │ │ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │ │ │ ├── stat {}
│ │ │ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │ │ │  
│ │ │ │ │ │ -  Size: 16384      Blocks: 32         IO Block: 4096   directory
│ │ │ │ │ │ +  Size: 20480      Blocks: 40         IO Block: 4096   directory
│ │ │ │ │ │  Device: 254,0  Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │ │ │  
│ │ │ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │   --- /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5/lib/python3.12/test
│ │ ├── +++ /nix/store/h3i0acpmr8mrjx07519xxmidv8mpax4y-python3-3.12.5.check/lib/python3.12/test
│ │ │ ├── stat {}
│ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │  
│ │ │ │ +  Size: 20480      Blocks: 40         IO Block: 4096   directory
│ │ │ │ -  Size: 4096       Blocks: 8          IO Block: 4096   directory
│ │ │ │  Device: 254,0  Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │  
│ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000
│ │ │ ├── stat {}
│ │ │ │ @@ -1,7 +1,7 @@
│ │ │ │  
│ │ │ │ -  Size: 4096       Blocks: 8          IO Block: 4096   directory
│ │ │ │ +  Size: 20480      Blocks: 40         IO Block: 4096   directory
│ │ │ │  Device: 254,0  Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
│ │ │ │  
│ │ │ │  Modify: 1970-01-01 00:00:01.000000000 +0000


raboof commented 2 days ago

I suspect the directories that only differ in their stats are a red herring, and it's really the differences in ftplib.cpython-312.pyc, ftplib.cpython-312.opt-1.pyc and ftplib.cpython-312.opt-2.pyc that are the issue. There are plenty of other .pyc files that reproduce successfully.
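
To narrow down where two .pyc files diverge, one option is to split each file into its 16-byte header and its marshalled code object and compare the two parts separately. This is a rough sketch with hypothetical function names; it assumes the script runs under the same Python version that produced the files (3.12 here), because `marshal` data is not portable across versions.

```python
import importlib.util
import marshal

HEADER_SIZE = 16  # .pyc header length since Python 3.7 (PEP 552)


def read_pyc(path):
    """Return (header_bytes, code_object) for a .pyc file."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] != importlib.util.MAGIC_NUMBER:
        # marshal data can only be loaded by a matching interpreter version.
        raise ValueError(f"{path}: magic number does not match this interpreter")
    return data[:HEADER_SIZE], marshal.loads(data[HEADER_SIZE:])


def compare_pyc(path_a, path_b):
    """Report whether the headers and the loaded code objects are equal."""
    header_a, code_a = read_pyc(path_a)
    header_b, code_b = read_pyc(path_b)
    return {
        "headers_equal": header_a == header_b,
        "code_equal": code_a == code_b,
    }
```

If both values come back `True` while the files still differ byte for byte, the difference is probably in something this comparison does not cover, such as how the code object was serialized, rather than in the bytecode itself.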

It's somewhat confusing that cpython3/default.nix has a reproducibleBuild flag that is set to false when the package has in fact been reproducible: for example, python3-3.11.9 in 24.05 (at 9603a116b8d554f) seems to reproduce just fine. Looking at the history, it appears that generating default, unoptimized bytecode used to be nondeterministic (https://github.com/python/cpython/issues/73894), but it is not clear whether that is still a problem in Python 3.11 and later.
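
One way to probe whether bytecode generation is still nondeterministic for a given module is to compile the same source twice at each optimization level and compare the resulting bytes. The sketch below uses the standard `py_compile` module; the function names are hypothetical. It compiles within a single process, so a match here does not prove that two separate Nix builds would agree; it only helps rule out the simplest cases.

```python
import os
import py_compile
import tempfile


def compile_twice(source_path, optimize=0):
    """Compile source_path twice and return the two .pyc byte strings."""
    outputs = []
    with tempfile.TemporaryDirectory() as tmp:
        for run in ("first", "second"):
            target = os.path.join(tmp, run + ".pyc")
            # doraise=True surfaces compile errors instead of printing them.
            py_compile.compile(source_path, cfile=target, doraise=True,
                               optimize=optimize)
            with open(target, "rb") as f:
                outputs.append(f.read())
    return outputs[0], outputs[1]


def determinism_by_level(source_path):
    """Map each optimization level (0, 1, 2) to whether both runs matched."""
    results = {}
    for level in (0, 1, 2):
        first, second = compile_twice(source_path, optimize=level)
        results[level] = first == second
    return results
```

For the module in question, `determinism_by_level(ftplib.__file__)` (after `import ftplib`) covers the three variants from the diffoscope output above: the plain .pyc, .opt-1.pyc and .opt-2.pyc.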

The problem with ftplib on python312 does not seem to be new: it already existed at 9603a116b8d554f.