docker-library / busybox

Docker Official Image packaging for Busybox
http://busybox.net

Adjust tarball creation to be reproducible #188

Closed tianon closed 5 months ago

tianon commented 7 months ago

Oh https://manpages.debian.org/buster/tar/tar.1.en.html#:~:text=on%20each%20checkpoint.-,%2D%2Dclamp%2Dmtime,-Only%20set%20time

That's probably better / more maintainable than my hacky script.

tianon commented 7 months ago

https://github.com/docker-library/busybox/actions/runs/7576748973/job/20636236993?pr=188#step:5:4479 :eyes:

a2a69ef5efbb70454c06535193e330ec9b50f7f6e3b67fe1a69d78e50a40db06 is the same latest/musl/busybox.tar.xz checksum I got locally :smile:

tianon commented 7 months ago

Nice, the simpler approach using tar --mtime ... --clamp-mtime also works and generates the same checksum: https://github.com/docker-library/busybox/actions/runs/7576892648/job/20636653568?pr=188#step:5:3622
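
For anyone following along, the clamping idea can be sketched outside of tar as well. The snippet below is only an illustration using Python's standard `tarfile` module, not the script from this PR and not GNU tar itself. The function name `build_tar` and the choice to zero out ownership are assumptions made for the example. It mimics what `--mtime ... --clamp-mtime` does: timestamps newer than the chosen epoch are lowered to it, older ones are left alone, and entries are added in sorted order so the byte stream does not depend on directory traversal order.

```python
import hashlib
import io
import os
import tarfile
import tempfile


def build_tar(root: str, epoch: int) -> bytes:
    """Pack `root` into an uncompressed tar with normalized metadata (hypothetical helper)."""

    def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Clamp rather than overwrite: only timestamps newer than `epoch` change.
        info.mtime = min(int(info.mtime), epoch)
        # Assumption for this sketch: ownership is irrelevant, so zero it out.
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    # Collect every path first so entries can be added in a stable, sorted order.
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in sorted(paths):
            tar.add(
                path,
                arcname=os.path.relpath(path, root),
                recursive=False,  # each entry is added explicitly above
                filter=normalize,
            )
    return buf.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "hello.txt"), "w") as f:
            f.write("hello\n")
        data = build_tar(root, epoch=1_700_000_000)
        print(hashlib.sha256(data).hexdigest())
```

Building the same tree twice with the same `epoch` should give identical bytes even if file timestamps newer than the epoch changed in between, which is the property the matching checksums above demonstrate for the real tarballs.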

tianon commented 7 months ago

I guess this is good by itself, but my end goal is actually to go all the way to oci-import (because a reproducible tar is otherwise not super interesting here given how much Docker/BuildKit munge it on the round-trip through ADD tar /).

tianon commented 6 months ago

Here's an example library/busybox file generated from this change (hacked to use our fork):

# this file is generated via https://github.com/infosiftr/busybox/blob/5aade3a1527f3dddc69ea149d040768941b34664/generate-stackbrew-library.sh

Maintainers: Tianon Gravi <admwiggin@gmail.com> (@tianon),
             Joseph Ferguson <yosifkit@gmail.com> (@yosifkit)
GitRepo: https://github.com/infosiftr/busybox.git
GitCommit: 5aade3a1527f3dddc69ea149d040768941b34664
Builder: oci-import
File: index.json
# https://github.com/infosiftr/busybox/tree/dist-amd64
amd64-GitFetch: refs/heads/dist-amd64
amd64-GitCommit: 668d52e6f0596e0fd0b1be1d8267c4b9240dc2b3
# https://github.com/infosiftr/busybox/tree/dist-arm32v6
arm32v6-GitFetch: refs/heads/dist-arm32v6
arm32v6-GitCommit: c479d660005ac7073e97509575668b794cdbc5f5

Tags: 1.36.1-glibc, 1.36-glibc, 1-glibc, stable-glibc, glibc
Architectures: amd64
amd64-Directory: latest/glibc/amd64

Tags: 1.36.1-uclibc, 1.36-uclibc, 1-uclibc, stable-uclibc, uclibc
Architectures: amd64
amd64-Directory: latest/uclibc/amd64

Tags: 1.36.1-musl, 1.36-musl, 1-musl, stable-musl, musl
Architectures: amd64, arm32v6
amd64-Directory: latest/musl/amd64
arm32v6-Directory: latest/musl/arm32v6

Tags: 1.36.1, 1.36, 1, stable, latest
Architectures: amd64, arm32v6
amd64-Directory: latest/glibc/amd64
arm32v6-Directory: latest/musl/arm32v6

Tags: 1.35.0-glibc, 1.35-glibc
Architectures: amd64
amd64-Directory: latest-1/glibc/amd64

Tags: 1.35.0-uclibc, 1.35-uclibc
Architectures: amd64
amd64-Directory: latest-1/uclibc/amd64

Tags: 1.35.0-musl, 1.35-musl
Architectures: amd64, arm32v6
amd64-Directory: latest-1/musl/amd64
arm32v6-Directory: latest-1/musl/arm32v6

Tags: 1.35.0, 1.35
Architectures: amd64, arm32v6
amd64-Directory: latest-1/glibc/amd64
arm32v6-Directory: latest-1/musl/arm32v6
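
To make the layout of that file a bit more concrete, here is a rough sketch of how the per-architecture `Directory` fields could be looked up. This is not bashbrew's parser and ignores most of the real format's rules (for example, defaults inherited from the first stanza). The function names `parse_library` and `directory_for` are made up for illustration, and the sample is a trimmed excerpt of the file above.

```python
SAMPLE = """\
Maintainers: Tianon Gravi <admwiggin@gmail.com> (@tianon),
             Joseph Ferguson <yosifkit@gmail.com> (@yosifkit)
Builder: oci-import
File: index.json
# https://github.com/infosiftr/busybox/tree/dist-amd64
amd64-GitFetch: refs/heads/dist-amd64

Tags: 1.36.1, 1.36, 1, stable, latest
Architectures: amd64, arm32v6
amd64-Directory: latest/glibc/amd64
arm32v6-Directory: latest/musl/arm32v6
"""


def parse_library(text):
    """Split the text into blank-line-separated stanzas of key/value fields."""
    stanzas, current, last_key = [], {}, None
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment lines do not end a stanza
        if not line.strip():
            if current:
                stanzas.append(current)
            current, last_key = {}, None
            continue
        if line[0].isspace() and last_key:
            # An indented line continues the previous field's value.
            current[last_key] += " " + line.strip()
            continue
        key, _, value = line.partition(":")
        last_key = key.strip()
        current[last_key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def directory_for(stanzas, tag, arch):
    """Return the directory for `tag` on `arch`, or None if no stanza matches."""
    for entry in stanzas:
        tags = [t.strip() for t in entry.get("Tags", "").split(",")]
        archs = [a.strip() for a in entry.get("Architectures", "").split(",")]
        if tag in tags and arch in archs:
            # Prefer the architecture-specific field, fall back to a plain one.
            return entry.get(arch + "-Directory", entry.get("Directory"))
    return None


if __name__ == "__main__":
    print(directory_for(parse_library(SAMPLE), "latest", "arm32v6"))
```

The point of the sketch is just that one tag can map to a different directory per architecture, as `latest` does here with glibc on amd64 and musl on arm32v6.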

tianon commented 6 months ago

Notably, this actually commits all but the tarballs directly in the master branch of this repository. The theory here is that with reproducible tarballs, that becomes a lot more interesting (we can meaningfully track their change over time, for example).

The main concern I would've had with this is multiple build jobs all trying to push to the same branch (and resolving merge conflicts between them as we rebase / re-push over and over). With each architecture using a dedicated directory, though, that should be mostly reasonable. These also don't run on an automated trigger, so there's little chance of the way things build changing while a build is in progress.

tianon commented 6 months ago

It's going to be a bit more complex to fix our explicit debian:unstable builds to work correctly now that I'm using docker buildx build explicitly in build.sh, so I need to spend a bit more time thinking about the best way to inject that correctly. :thinking:

(probably just a sed instead of a pull+tag, TBH)

tianon commented 6 months ago

This also now makes our CI verify the reproducibility. :eyes:

tianon commented 6 months ago

(So it's written down somewhere explicitly: one end goal of this is to have something with lower stakes / fewer DOI child images than Ubuntu to test https://github.com/docker-library/meta-scripts/pull/20 against :eyes:)

tianon commented 5 months ago

https://github.com/docker-library/busybox/commit/e177f73eeaf1688b87c31e2989e46c552c64480b (meta-arm32v6) :eyes:

https://github.com/docker-library/busybox/commit/830066c7869fcd55cb4a9e9b36d10e571a04ab51 (dist-arm32v6) :eyes: