ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

Reproducible builds #28987

Open holiman opened 6 months ago

holiman commented 6 months ago

Reproducible builds

This is a little investigation into "do we have reproducible builds in geth?".

A reproducible build means that one can locally replicate a build made on e.g. a build server, that is, produce an exactly matching binary. This is very useful for verifying the integrity of the build servers: any remote machine can be used to watch over the builds.

The Go compiler is, supposedly, reproducible. However, go-ethereum is not pure Go.

Testing

First, I downloaded the latest build from our downloads page. The downloads page lists the checksum as 8d5e138dc3eb7b08cde48966aee0ea79 (note: md5 is not a secure cryptographic hash, but we also provide detached signatures, which offer much better security when verifying integrity).

[user@work go-ethereum]$ md5sum geth-linux-amd64-1.13.13-unstable-fe91d476.tar.gz
8d5e138dc3eb7b08cde48966aee0ea79  geth-linux-amd64-1.13.13-unstable-fe91d476.tar.gz
[user@work go-ethereum]$ md5sum geth-linux-amd64-1.13.13-unstable-fe91d476/geth
1a372833c2a63c95a2f855524eb5fcd9  geth-linux-amd64-1.13.13-unstable-fe91d476/geth
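
The checksum comparison above can be scripted. This is a minimal sketch, assuming GNU coreutils' md5sum is installed; the function name matches_published_md5 is hypothetical and not part of geth's tooling. As noted above, md5 only guards against accidental corruption, so the detached signatures remain the stronger check.

```shell
#!/bin/sh
# matches_published_md5 FILE EXPECTED  (hypothetical helper name)
# Succeeds only if FILE exists and its md5 digest equals EXPECTED.
matches_published_md5() (
    [ -f "$1" ] || { echo "missing file: $1" >&2; exit 2; }
    actual=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$actual" = "$2" ]; then
        echo "OK   $1"
    else
        echo "FAIL $1: got $actual, want $2" >&2
        exit 1
    fi
)
```

For the tarball above, this would be called as: matches_published_md5 geth-linux-amd64-1.13.13-unstable-fe91d476.tar.gz 8d5e138dc3eb7b08cde48966aee0ea79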

I then tried to create a Docker container replicating the environment used. Details gleaned from the downloaded file:

$ ./geth-linux-amd64-1.13.13-unstable-fe91d476/geth version
Geth
Version: 1.13.13-unstable
Git Commit: fe91d476ba3e29316b6dc99b6efd4a571481d888
Git Commit Date: 20240213
Architecture: amd64
Go Version: go1.21.6
Operating System: linux
GOPATH=/home/user/go
GOROOT=/usr/local/go

The .travis.yml also gives us some hints:

      dist: bionic
      go: 1.21.x

Dockerfile attempt

Using a dockerfile like this:

FROM ubuntu:bionic

RUN apt-get update && apt-get install gcc-multilib git ca-certificates wget -yq --no-install-recommends
RUN git clone --branch master https://github.com/ethereum/go-ethereum.git

RUN wget https://go.dev/dl/go1.21.6.linux-amd64.tar.gz && \
    rm -rf /usr/local/go && \
    tar -C /usr/local -xzf go1.21.6.linux-amd64.tar.gz
# An `export` inside RUN does not persist into later layers; ENV does.
ENV PATH=$PATH:/usr/local/go/bin

RUN cd go-ethereum && git checkout fe91d476ba3e29316b6dc99b6efd4a571481d888 && \
    CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install ./cmd/geth/
# Each RUN starts in /, so the binary path must include the clone directory.
RUN md5sum ./go-ethereum/build/bin/geth

In order to make the Docker build embed the git data, we set the TRAVIS and CI environment variables. See internal/build/env.go for the reasons.


The two builds are not exactly alike in size:

root@208cb9fcfa68:/go-ethereum# ls -l ./build/bin/geth         
-rwxr-xr-x 1 root root 58129760 Feb 14 09:53 ./build/bin/geth
$ ls -la ./geth-linux-amd64-1.13.13-unstable-fe91d476/geth
-rwxr-xr-x 1 user user 58129968 Feb 13 14:55 ./geth-linux-amd64-1.13.13-unstable-fe91d476/geth

Content-wise:

root@208cb9fcfa68:/go-ethereum# strings ./build/bin/geth | head    
/lib64/ld-linux-x86-64.so.2
RAMLiBUAnrbn5zHLQ2v2/WlYmiboMK5ddsyu5qL-z/zajlwZgTLCfStG3HorG6/Utx6Jmui4qlzsokyGBwE
D %$
DD@ 
#@ $
@@  j
k(dB0
0    b
ljI^
q6-p

VS

$ strings ./geth-linux-amd64-1.13.13-unstable-fe91d476/geth | head
/lib64/ld-linux-x86-64.so.2
JHKPXlVR27nUe4y9sY68/WlYmiboMK5ddsyu5qL-z/zajlwZgTLCfStG3HorG6/QrO6sKmnFVHR7U-WHF3U
x3vo
D %$
DD@
#@ $
@@  j
k(dB0
0    b
ljI^
holiman commented 6 months ago

Actually, never mind reproducing the same build as the Travis builder: we don't even reproduce the same build on the same system.

root@208cb9fcfa68:/go-ethereum# rm ./build/bin/geth 

root@208cb9fcfa68:/go-ethereum# CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install -dlgo ./cmd/geth
gotool.go:96: -dlgo version matches active Go version 1.21.6, skipping download.
>>> /usr/local/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=fe91d476ba3e29316b6dc99b6efd4a571481d888 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240213 -extldflags '-Wl,-z,stack-size=0x800000'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth

root@208cb9fcfa68:/go-ethereum# md5sum ./build/bin/geth
1337ffaed216a31fa9a77caf138f642f  ./build/bin/geth

root@208cb9fcfa68:/go-ethereum# rm ./build/bin/geth 

root@208cb9fcfa68:/go-ethereum# CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install -dlgo ./cmd/geth
gotool.go:96: -dlgo version matches active Go version 1.21.6, skipping download.
>>> /usr/local/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=fe91d476ba3e29316b6dc99b6efd4a571481d888 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240213 -extldflags '-Wl,-z,stack-size=0x800000'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth

root@208cb9fcfa68:/go-ethereum# md5sum ./build/bin/geth
4e5180c9678db91d506e223c9a25838a  ./build/bin/geth
holiman commented 6 months ago

If we disable the C building (CGO_ENABLED=0), then we get reproducible builds on a single machine:

root@208cb9fcfa68:/go-ethereum# rm ./build/bin/geth; CGO_ENABLED=0 CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install -dlgo ./cmd/geth; md5sum ./build/bin/geth
gotool.go:96: -dlgo version matches active Go version 1.21.6, skipping download.
>>> /usr/local/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=fe91d476ba3e29316b6dc99b6efd4a571481d888 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240213 -extldflags '-Wl,-z,stack-size=0x800000'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth
9f99056d1537a6f00704e25cc77e8a3f  ./build/bin/geth

root@208cb9fcfa68:/go-ethereum# rm ./build/bin/geth; CGO_ENABLED=0 CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install -dlgo ./cmd/geth; md5sum ./build/bin/geth
gotool.go:96: -dlgo version matches active Go version 1.21.6, skipping download.
>>> /usr/local/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=fe91d476ba3e29316b6dc99b6efd4a571481d888 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240213 -extldflags '-Wl,-z,stack-size=0x800000'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth
9f99056d1537a6f00704e25cc77e8a3f  ./build/bin/geth
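
The "build twice, compare digests" procedure used above can be wrapped in a small helper. This is a sketch only: check_deterministic is a hypothetical name, and it assumes sha256sum from GNU coreutils. It takes the output path and the build command as arguments, so it does not depend on any particular build tool.

```shell
#!/bin/sh
# check_deterministic OUTPUT CMD [ARGS...]  (hypothetical helper name)
# Runs CMD twice, hashing OUTPUT after each run.
# Succeeds only if both runs produced a byte-identical OUTPUT.
check_deterministic() (
    out=$1; shift
    rm -f "$out"; "$@" || exit 2
    first=$(sha256sum "$out" | cut -d' ' -f1)
    rm -f "$out"; "$@" || exit 2
    second=$(sha256sum "$out" | cut -d' ' -f1)
    echo "run 1: $first"
    echo "run 2: $second"
    [ -n "$first" ] && [ "$first" = "$second" ]
)
```

Inside the container it could be invoked as: check_deterministic ./build/bin/geth env CGO_ENABLED=0 go run ./build/ci.go install ./cmd/geth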
holiman commented 4 months ago

Got a report that these paths are present in the output:

│ -/home/travis/gopath/pkg/mod/github.com/karalabe/usb@v0.0.2/libusb/libusb/os/linux_usbfs.c
[user@work hid]$ go build ./demo.go && strings demo | grep home
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/libusbi.h
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/linux_usbfs.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/events_posix.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/linux_netlink.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/core.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/hotplug.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/io.c
/home/user/go/src/github.com/karalabe/hid/wchar.go
/home/user/go/src/github.com/karalabe/hid/hid_enabled.go
/home/user/go/src/github.com/karalabe/hid/demo.go

[user@work hid]$ go build  -ldflags="-w -s" ./demo.go && strings demo | grep home
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/libusbi.h
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/linux_usbfs.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/events_posix.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/os/linux_netlink.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/core.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/hotplug.c
/home/user/go/src/github.com/karalabe/hid/libusb/libusb/io.c
/home/user/go/src/github.com/karalabe/hid/wchar.go
/home/user/go/src/github.com/karalabe/hid/hid_enabled.go
/home/user/go/src/github.com/karalabe/hid/demo.go

[user@work hid]$ go build -trimpath ./demo.go && strings demo | grep home
[user@work hid]$

This works when imported as a library too

[user@work go-ethereum]$ go build ./cmd/geth  &&  strings ./geth | grep "home/user" | head -n 5
/home/user/go/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/libusbi.h
/home/user/go/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/linux_usbfs.c
/home/user/go/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/events_posix.c
/home/user/go/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/linux_netlink.c
/home/user/go/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/core.c
[user@work go-ethereum]$ go build -trimpath ./cmd/geth  &&  strings ./geth | grep "home/user" | head -n 5
[user@work go-ethereum]$ 

I don't see these paths in the output binary

[user@work go-ethereum]$ CI=true TRAVIS=true TRAVIS_COMMIT="fe91d476ba3e29316b6dc99b6efd4a571481d888" go run ./build/ci.go install -dlgo ./cmd/geth &&  strings ./build/bin/geth | grep "home/user" | head -n 5
/home/user/.cache/go1.22.2.linux-amd64.tar.gz is up-to-date
>>> /home/user/.cache/geth-go-1.22.2-linux-amd64/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=fe91d476ba3e29316b6dc99b6efd4a571481d888 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240213 -extldflags '-Wl,-z,stack-size=0x800000'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /home/user/go/src/github.com/ethereum/go-ethereum/build/bin/geth ./cmd/geth
vivi365 commented 4 months ago

Hi,

Running this:

wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.13.15-c5ba367e.tar.gz
tar -xvf geth-linux-amd64-1.13.15-c5ba367e.tar.gz
cd geth-linux-amd64-1.13.15-c5ba367e
grep -a 'home/travis' geth | strings

I get four occurrences of full Travis paths in the bundle:

/home/travis/gopath/pkg/mod/github.com/karalabe/usb@v0.0.2/libusb/libusb/os/linux_netlink.c
/home/travis/gopath/pkg/mod/github.com/karalabe/usb@v0.0.2/libusb/libusb/os/linux_usbfs.c
/home/travis/gopath/pkg/mod/github.com/karalabe/usb@v0.0.2/libusb/libusb/io.c
/home/travis/gopath/pkg/mod/github.com/ethereum/c-kzg-4844@v0.4.0/bindings/go/../../src/c_kzg_4844.c

These are part of the read-only data, as shown by readelf -p .rodata geth | grep 'travis'.

-trimpath seems to work so perhaps there is something else going on -- looking into it.

Here are some files to reproduce more descriptive diffs using diffoscope.
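
The search for leaked build paths can be done without binutils. A rough sketch, using only tr, grep and sort; leaked_paths is a hypothetical helper name, and it treats any run of printable bytes as a string, which is cruder than what strings or readelf do.

```shell
#!/bin/sh
# leaked_paths BINARY PREFIX  (hypothetical helper name)
# Prints every printable string in BINARY that contains PREFIX (e.g. /home/).
# Exit status 0 means at least one match was found, non-zero means none.
leaked_paths() {
    # Turn every non-printable byte into a newline, then filter for PREFIX.
    found=$(LC_ALL=C tr -c '[:print:]' '\n' < "$1" | grep -F -- "$2" | sort -u)
    [ -n "$found" ] && printf '%s\n' "$found"
}
```

For example, leaked_paths geth /home/travis lists the Travis paths, and a non-zero exit status could serve as a CI check that -trimpath took effect.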

holiman commented 4 months ago

Right. And here's how it looks against a newer binary (1.14.0)

/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/libusbi.h
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/events_posix.c
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/linux_netlink.c
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/os/linux_usbfs.c
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/core.c
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/hotplug.c
/home/travis/gopath/pkg/mod/github.com/karalabe/hid@v1.0.1-0.20240306101548-573246063e52/libusb/libusb/io.c
/home/travis/gopath/pkg/mod/github.com/ethereum/c-kzg-4844@v1.0.0/bindings/go/../../src/c_kzg_4844.c
holiman commented 4 months ago

@vivi365 made a great finding here: https://github.com/golang/go/issues/67011, trimpath is broken in ubuntu bionic.

Following that example, I did the same (but with hid, to reduce the build time)

First Dockerfile: bionic (18.04), which is an ESM release:

FROM ubuntu:bionic
RUN apt-get update && apt-get install gcc-multilib git ca-certificates wget -yq --no-install-recommends
RUN git clone --branch master --depth 1 https://github.com/karalabe/hid
RUN wget https://go.dev/dl/go1.21.6.linux-amd64.tar.gz && \
    rm -rf /usr/local/go && \
    tar -C /usr/local -xzf go1.21.6.linux-amd64.tar.gz && \
    export PATH=$PATH:/usr/local/go/bin

RUN cd hid && CGO_ENABLED=1 /usr/local/go/bin/go build -trimpath ./demo.go
RUN mv /hid/demo /demo && readelf -p .rodata demo | tee 1.txt | grep /hid/libusb | tee 2.txt

results in

#10 0.139   [ 3a745]  /hid/libusb/libusb/libusbi.h
#10 0.139   [ 3a890]  /hid/libusb/libusb/core.c
#10 0.139   [ 3aade]  /hid/libusb/libusb/hotplug.c
#10 0.139   [ 3ab5c]  /hid/libusb/libusb/io.c
#10 0.139   [ 3b9b8]  /hid/libusb/libusb/os/events_posix.c
#10 0.139   [ 3b9e0]  /hid/libusb/libusb/os/linux_netlink.c
#10 0.139   [ 3ba08]  /hid/libusb/libusb/os/linux_usbfs.c

For

#9 0.470   [ 3b950]  /_/github.com/karalabe/hid/libusb/libusb/libusbi.h
#9 0.470   [ 3b988]  /_/github.com/karalabe/hid/libusb/libusb/os/events_posix.c
#9 0.470   [ 3b9c8]  /_/github.com/karalabe/hid/libusb/libusb/os/linux_netlink.c
#9 0.470   [ 3ba08]  /_/github.com/karalabe/hid/libusb/libusb/os/linux_usbfs.c
#9 0.470   [ 3ba48]  /_/github.com/karalabe/hid/libusb/libusb/core.c
#9 0.470   [ 3baa8]  /_/github.com/karalabe/hid/libusb/libusb/hotplug.c
#9 0.470   [ 3bae0]  /_/github.com/karalabe/hid/libusb/libusb/io.c

This is good: now the /hid/ path is stripped, and all we see are the package-internal paths.

So it seems that particular bug is only present in the older Ubuntu release (bionic). We should bump the CI builders.

vivi365 commented 3 months ago

Thanks for bumping to noble, it seems to have fixed the trimpath issue!

And a heads-up: the dist being used might be Ubuntu focal, not noble.

wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.14.3-ab48ba42.tar.gz
tar -xvf geth-linux-amd64-1.14.3-ab48ba42.tar.gz && cd geth-linux-amd64-1.14.3-ab48ba42
readelf -p .comment geth
String dump of section '.comment':
  [     0]  GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
cat /etc/os-release

NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

This might be because Travis does not officially support noble (yet): https://docs.travis-ci.com/user/reference/linux/.

holiman commented 2 weeks ago

To follow up a bit: I tried building with Docker focal, using the latest commit (but still remaining on master, so as not to wind up on a detached head), and also specifying --build-id=none (https://stackoverflow.com/a/15316448). Build command:

/usr/local/go/bin/go build -ldflags "-X github.com/ethereum/go-ethereum/internal/version.gitCommit=710c3f32ac8e4e5829a6a631dcfb1e0e13a49220 -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240816 -extldflags '-Wl,-z,stack-size=0x800000,--build-id=none'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth/

Now, if I build twice in a row, as geth.3 and geth.4, then do

$ readelf -a ./geth.3 > geth3.txt
$ readelf -a ./geth.4 > geth4.txt
root@73feee364289:/go-ethereum/build/bin# diff geth3.txt geth4.txt
67942c67942
<  66184: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS /tmp/go-link-3247583145/0
---
>  66184: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS /tmp/go-link-2258969311/0
70176c70176
<    description data: 30 36 68 6d 42 68 41 59 5a 32 49 63 75 54 66 6a 63 6f 2d 45 2f 59 7a 65 50 32 4e 63 30 74 32 47 49 4f 39 4e 56 52 30 66 70 2f 45 69 5f 62 38 64 73 66 5a 75 6c 56 31 4d 70 65 6c 53 64 61 2f 45 6d 34 34 77 4f 34 64 65 42 6e 71 57 4c 44 59 73 42 69 59 
---
>    description data: 30 36 68 6d 42 68 41 59 5a 32 49 63 75 54 66 6a 63 6f 2d 45 2f 59 7a 65 50 32 4e 63 30 74 32 47 49 4f 39 4e 56 52 30 66 70 2f 45 69 5f 62 38 64 73 66 5a 75 6c 56 31 4d 70 65 6c 53 64 61 2f 76 65 2d 4b 77 51 6f 75 47 7a 57 46 32 79 67 33 77 6d 6d 66 
root@73feee364289:/go-ethereum/build/bin# 

Looking at it in more detail, there are two causes for the diffs. The first has to do with the trusted setup of kzg, which somehow includes a temporary path:

*** 67937,67947 ****
   66179: 00000000015d25d0   945 FUNC    LOCAL  DEFAULT   14 compute_kzg_proof_impl
   66180: 00000000020774a0    17 OBJECT  LOCAL  DEFAULT   16 __PRETTY_FUNCTION__.4839
   66181: 00000000015d3710   721 FUNC    LOCAL  DEFAULT   14 load_trusted_setup.part.0
   66182: 0000000002077500  1024 OBJECT  LOCAL  DEFAULT   16 SCALE2_ROOT_OF_UNITY
   66183: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS _cgo_export.c
!  66184: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS /tmp/go-link-3247583145/0
   66185: 00000000015edb40     0 OBJECT  LOCAL  DEFAULT   14 K256
   66186: 00000000015eef84     0 NOTYPE  LOCAL  DEFAULT   14 ct_inverse_mod_383$1
   66187: 00000000015f0260   207 FUNC    LOCAL  DEFAULT   14 __ab_approximation_31
   66188: 00000000015eff80   451 FUNC    LOCAL  DEFAULT   14 __smulx_383_n_shift_by_31
   66189: 00000000015efe20   335 FUNC    LOCAL  DEFAULT   14 __smulx_383x63
--- 67937,67947 ----
   66179: 00000000015d25d0   945 FUNC    LOCAL  DEFAULT   14 compute_kzg_proof_impl
   66180: 00000000020774a0    17 OBJECT  LOCAL  DEFAULT   16 __PRETTY_FUNCTION__.4839
   66181: 00000000015d3710   721 FUNC    LOCAL  DEFAULT   14 load_trusted_setup.part.0
   66182: 0000000002077500  1024 OBJECT  LOCAL  DEFAULT   16 SCALE2_ROOT_OF_UNITY
   66183: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS _cgo_export.c
!  66184: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS /tmp/go-link-2258969311/0
   66185: 00000000015edb40     0 OBJECT  LOCAL  DEFAULT   14 K256
   66186: 00000000015eef84     0 NOTYPE  LOCAL  DEFAULT   14 ct_inverse_mod_383$1
   66187: 00000000015f0260   207 FUNC    LOCAL  DEFAULT   14 __ab_approximation_31
   66188: 00000000015eff80   451 FUNC    LOCAL  DEFAULT   14 __smulx_383_n_shift_by_31
   66189: 00000000015efe20   335 FUNC    LOCAL  DEFAULT   14 __smulx_383x63
***************

The second is some sort of description data related to buildid.

*** 70171,70176 ****
      OS: Linux, ABI: 3.2.0

  Displaying notes found in: .note.go.buildid
    Owner                Data size  Description
    Go                   0x00000053 Unknown note type: (0x00000004)
!    description data: 30 36 68 6d 42 68 41 59 5a 32 49 63 75 54 66 6a 63 6f 2d 45 2f 59 7a 65 50 32 4e 63 30 74 32 47 49 4f 39 4e 56 52 30 66 70 2f 45 69 5f 62 38 64 73 66 5a 75 6c 56 31 4d 70 65 6c 53 64 61 2f 45 6d 34 34 77 4f 34 64 65 42 6e 71 57 4c 44 59 73 42 69 59 
--- 70171,70176 ----
      OS: Linux, ABI: 3.2.0

  Displaying notes found in: .note.go.buildid
    Owner                Data size  Description
    Go                   0x00000053 Unknown note type: (0x00000004)
!    description data: 30 36 68 6d 42 68 41 59 5a 32 49 63 75 54 66 6a 63 6f 2d 45 2f 59 7a 65 50 32 4e 63 30 74 32 47 49 4f 39 4e 56 52 30 66 70 2f 45 69 5f 62 38 64 73 66 5a 75 6c 56 31 4d 70 65 6c 53 64 61 2f 76 65 2d 4b 77 51 6f 75 47 7a 57 46 32 79 67 33 77 6d 6d 66 

The upper hex data decodes to 06hmBhAYZ2IcuTfjco-E/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/Em44wO4deBnqWLDYsBiY, the lower to 06hmBhAYZ2IcuTfjco-E/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/ve-KwQouGzWF2yg3wmmf.
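
The hex-to-text step can be reproduced with plain POSIX printf. A small sketch; hex_to_ascii is a hypothetical helper name, and it expects the byte values exactly as readelf prints them in the "description data" line.

```shell
#!/bin/sh
# hex_to_ascii HEX...  (hypothetical helper name)
# Decodes space-separated hex byte values to text, followed by a newline.
hex_to_ascii() {
    for h in "$@"; do
        # hex -> octal digits -> byte, via the \0ddd escape understood by %b
        printf '%b' "\\0$(printf '%o' "0x$h")"
    done
    printf '\n'
}

# The first eight bytes of the note shown above:
hex_to_ascii 30 36 68 6d 42 68 41 59   # prints 06hmBhAY
```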


So, I made four builds. The first two do not have --build-id=none, but the last two do.

root@73feee364289:/go-ethereum# /usr/local/go/bin/go tool buildid ./build/bin/geth.1 
nv8PTd3agd4XFW4vnUJa/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/-SCF5VYWCOmgSZpqkCnE
root@73feee364289:/go-ethereum# /usr/local/go/bin/go tool buildid ./build/bin/geth.2 
AZOD1_AyAlWO8tgfSiDv/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/H4XpoRz0cdajIW7r1bdN
root@73feee364289:/go-ethereum# /usr/local/go/bin/go tool buildid ./build/bin/geth.3
06hmBhAYZ2IcuTfjco-E/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/Em44wO4deBnqWLDYsBiY
root@73feee364289:/go-ethereum# /usr/local/go/bin/go tool buildid ./build/bin/geth.4
06hmBhAYZ2IcuTfjco-E/YzeP2Nc0t2GIO9NVR0fp/Ei_b8dsfZulV1MpelSda/ve-KwQouGzWF2yg3wmmf

And the last two build-ids are almost identical, differing only in the last (fourth) part.

holiman commented 2 weeks ago

More docs about the buildid: https://go.dev/src/cmd/go/internal/work/buildid.go

The precise form is

    actionID/[.../]contentID
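
To see at a glance which parts of two build IDs differ, the IDs can be split on "/" and compared position by position. A minimal sketch using only shell parameter expansion; buildid_diff is a hypothetical helper name, not a Go tool.

```shell
#!/bin/sh
# buildid_diff ID_A ID_B  (hypothetical helper name)
# Prints the 1-based positions of the slash-separated parts that differ.
buildid_diff() (
    a=$1; b=$2; i=1; out=
    while [ -n "$a" ] || [ -n "$b" ]; do
        pa=${a%%/*}; pb=${b%%/*}
        [ "$pa" = "$pb" ] || out="$out $i"
        # Drop the part just compared (and its trailing slash, if any).
        case $a in */*) a=${a#*/} ;; *) a= ;; esac
        case $b in */*) b=${b#*/} ;; *) b= ;; esac
        i=$((i + 1))
    done
    echo "${out# }"
)
```

Fed the IDs of geth.1 and geth.2 from above it reports positions 1 and 4; for geth.3 and geth.4 it reports only position 4.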
holiman commented 2 weeks ago

This was a tough one! But after searching around, I found that I can get around the spurious path by adding the linker directive --strip-all, winding up with -extldflags '-Wl,-z,stack-size=0x800000,--build-id=none,--strip-all'.

It shaves off 20 MB of data too, but it probably makes all stack traces suck (it might be that it only makes the C-side stack traces suck, I'm not sure).

root@73feee364289:/go-ethereum# ls -la ./build/bin/ -h
total 518M
drwxr-xr-x 2 root root 4.0K Aug 19 09:12 .
drwxr-xr-x 1 root root 4.0K Aug 19 07:16 ..
-rwxr-xr-x 1 root root  61M Aug 19 07:16 geth.1
-rwxr-xr-x 1 root root  61M Aug 19 07:19 geth.2
-rwxr-xr-x 1 root root  61M Aug 19 07:20 geth.3
-rwxr-xr-x 1 root root  61M Aug 19 07:21 geth.4
-rwxr-xr-x 1 root root  61M Aug 19 07:58 geth.5
-rwxr-xr-x 1 root root  61M Aug 19 07:59 geth.6
-rwxr-xr-x 1 root root  43M Aug 19 09:05 geth.7
-rwxr-xr-x 1 root root  43M Aug 19 09:07 geth.8
-rwxr-xr-x 1 root root  43M Aug 19 09:12 geth.9
-rw-r--r-- 1 root root 5.7M Aug 19 07:43 geth1.txt
-rw-r--r-- 1 root root 5.7M Aug 19 07:43 geth2.txt
-rw-r--r-- 1 root root 5.7M Aug 19 07:26 geth3.txt
-rw-r--r-- 1 root root 5.7M Aug 19 07:26 geth4.txt
-rw-r--r-- 1 root root 5.7M Aug 19 09:08 geth6.txt
-rw-r--r-- 1 root root 164K Aug 19 09:08 geth7.txt
-rw-r--r-- 1 root root 164K Aug 19 09:08 geth8.txt
-rw-r--r-- 1 root root 164K Aug 19 09:12 geth9.txt
root@73feee364289:/go-ethereum# md5sum ./build/bin/geth8.txt 
0ffe4441e63e6a402656d6eea0a6b983  ./build/bin/geth8.txt
root@73feee364289:/go-ethereum# md5sum ./build/bin/geth9.txt 
0ffe4441e63e6a402656d6eea0a6b983  ./build/bin/geth9.txt

(7, 8 and 9 are using --strip-all, 8 and 9 are also using --build-id=none).

holiman commented 2 weeks ago

I tested ctrl-c:ing it to yield a stack trace, and the (Go-side) stacks do indeed appear to be readable. So using --strip-all seems to be a good way forward!

^CINFO [08-19|09:15:18.613] Got interrupt, shutting down...
INFO [08-19|09:15:18.613] HTTP server stopped                      endpoint=127.0.0.1:8551
INFO [08-19|09:15:18.614] IPC endpoint closed                      url=/root/.ethereum/geth.ipc
INFO [08-19|09:15:18.614] Ethereum protocol stopped
INFO [08-19|09:15:18.614] Transaction pool stopped
INFO [08-19|09:15:18.663] Persisting dirty state to disk           root=d7f897..0f0544 layers=0
INFO [08-19|09:15:18.668] Persisted dirty state to disk            size=69.00B elapsed=5.181ms
INFO [08-19|09:15:18.682] Blockchain stopped
^CWARN [08-19|09:15:19.276] Already shutting down, interrupt more to panic. times=9
^CWARN [08-19|09:15:19.321] Already shutting down, interrupt more to panic. times=8
^CWARN [08-19|09:15:19.360] Already shutting down, interrupt more to panic. times=7
^CWARN [08-19|09:15:19.403] Already shutting down, interrupt more to panic. times=6
^CWARN [08-19|09:15:19.454] Already shutting down, interrupt more to panic. times=5
^CWARN [08-19|09:15:19.488] Already shutting down, interrupt more to panic. times=4
^CWARN [08-19|09:15:19.528] Already shutting down, interrupt more to panic. times=3
^CWARN [08-19|09:15:19.582] Already shutting down, interrupt more to panic. times=2
^CWARN [08-19|09:15:19.612] Already shutting down, interrupt more to panic. times=1
^Cpanic: boom

goroutine 4323 [running]:
github.com/ethereum/go-ethereum/internal/debug.LoudPanic(...)
    github.com/ethereum/go-ethereum/internal/debug/loudpanic.go:24
github.com/ethereum/go-ethereum/cmd/utils.StartNode.func1.1()
    github.com/ethereum/go-ethereum/cmd/utils/cmd.go:106 +0x15c
github.com/ethereum/go-ethereum/cmd/utils.StartNode.func1()
    github.com/ethereum/go-ethereum/cmd/utils/cmd.go:121 +0x2b2
created by github.com/ethereum/go-ethereum/cmd/utils.StartNode in goroutine 1
    github.com/ethereum/go-ethereum/cmd/utils/cmd.go:81 +0xbb
holiman commented 1 week ago

So close now... Everything matches except for the actionID part of the build ID (which is a hash of the inputs to the action that produced the packages or binary).

Comparing a locally built binary (Docker on Debian) with the downloaded binary:

user@debian-work:~/workspace/reproducible$ go tool buildid ./host/geth 
JqxlVGe6y5HEZJuwpzng/Ox960Em4qOFG2qWqf-Nj/a1eh6QOr0GbndRDtt2ue/lHJXq44hcBT4fMnqfooK
user@debian-work:~/workspace/reproducible$ go tool buildid ./geth-linux-amd64-1.14.9-unstable-30824faf/geth
_7qhZtHulg0Ahj3TbJHQ/Ox960Em4qOFG2qWqf-Nj/a1eh6QOr0GbndRDtt2ue/lHJXq44hcBT4fMnqfooK
holiman commented 1 week ago

Downloading the travis-built binary:

user@debian-work:~/workspace/reproducible$ wget https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz
--2024-08-23 09:55:37--  https://gethstore.blob.core.windows.net/builds/geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz
Resolving gethstore.blob.core.windows.net (gethstore.blob.core.windows.net)... 20.60.40.164
Connecting to gethstore.blob.core.windows.net (gethstore.blob.core.windows.net)|20.60.40.164|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16632223 (16M) [application/octet-stream]
Saving to: ‘geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz’

geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz       100%[===========================================================================================================================>]  15.86M  8.63MB/s    in 1.8s

2024-08-23 09:55:39 (8.63 MB/s) - ‘geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz’ saved [16632223/16632223]

user@debian-work:~/workspace/reproducible$ tar -xzf geth-linux-amd64-1.14.9-unstable-ada20c09.tar.gz

user@debian-work:~/workspace/reproducible$ md5sum ./geth-linux-amd64-1.14.9-unstable-ada20c09/geth
35c3d295b8d99159196b7b0c5e8f4bb6  ./geth-linux-amd64-1.14.9-unstable-ada20c09/geth

Winding up with 35c3d295b8d99159196b7b0c5e8f4bb6.

Now, using this Dockerfile

FROM ubuntu:focal

RUN apt-get update && apt-get install gcc-multilib git ca-certificates wget -yq --no-install-recommends

RUN wget https://go.dev/dl/go1.23.0.linux-amd64.tar.gz && \
    rm -rf /usr/local/go && \
    tar -C /usr/local -xzf go1.23.0.linux-amd64.tar.gz && \
    export PATH=$PATH:/usr/local/go/bin

RUN git clone --branch  master https://github.com/ethereum/go-ethereum.git

ADD build.sh /

With build.sh looking like below, I build from source and copy it to the host:

commit="ada20c09dcc73149769f8c578f53c8dd71c47a2c"
export PATH=$PATH:/usr/local/go/bin
cd go-ethereum && \
    git checkout $commit && \
    CI=true TRAVIS=true TRAVIS_COMMIT=$commit go run ./build/ci.go install ./cmd/geth/ && \
    cd ..

md5sum ./go-ethereum/build/bin/geth
readelf -a ./go-ethereum/build/bin/geth > repro.elf
cp ./go-ethereum/build/bin/geth /host/
cp repro.elf /host/

Run the docker, mount the /host/ folder so it's accessible, run the bash-script...

Lo and behold:

user@debian-work:~/workspace/reproducible$ md5sum ./geth-linux-amd64-1.14.9-unstable-ada20c09/geth
35c3d295b8d99159196b7b0c5e8f4bb6  ./geth-linux-amd64-1.14.9-unstable-ada20c09/geth
user@debian-work:~/workspace/reproducible$ md5sum ./host/geth
35c3d295b8d99159196b7b0c5e8f4bb6  ./host/geth

I think we're done! Thanks @vivi365

Next step would be to set up a secondary system which performs builds, and sounds an alarm if the builds fail to reproduce.
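
The core of such a watcher could be as small as a byte-for-byte comparison with a non-zero exit status on mismatch. This is a sketch under stated assumptions: alarm_on_mismatch is a hypothetical name, and fetching the published binary, rebuilding it, and delivering the actual alert are left to the surrounding system.

```shell
#!/bin/sh
# alarm_on_mismatch PUBLISHED REBUILT  (hypothetical helper name)
# Exit 0 if the files are byte-identical, 1 on mismatch, 2 on missing input.
alarm_on_mismatch() (
    [ -f "$1" ] && [ -f "$2" ] || { echo "ALARM: missing input" >&2; exit 2; }
    if cmp -s "$1" "$2"; then
        echo "OK: $2 reproduces $1"
    else
        echo "ALARM: $2 does not reproduce $1" >&2
        exit 1
    fi
)
```

With the files from this comment: alarm_on_mismatch ./geth-linux-amd64-1.14.9-unstable-ada20c09/geth ./host/geth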

fjl commented 1 week ago

Great result! Do you think we could make it reproduce without TRAVIS=true? That should not make a difference.

holiman commented 1 week ago

> Do you think we could make it reproduce without TRAVIS=true?

You are right, go run ./build/ci.go install ./cmd/geth/ is sufficient:

root@25eea4e82bf9:/go-ethereum# cd go-ethereum && rm ./build/bin/geth; /usr/local/go/bin/go run ./build/ci.go install ./cmd/geth/ && md5sum ./build/bin/geth
bash: cd: go-ethereum: No such file or directory
>>> /usr/local/go/bin/go build -ldflags "--buildid=none -X github.com/ethereum/go-ethereum/internal/version.gitCommit=ada20c09dcc73149769f8c578f53c8dd71c47a2c -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240823 -extldflags '-Wl,-z,stack-size=0x800000,--build-id=none,--strip-all'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth/
35c3d295b8d99159196b7b0c5e8f4bb6  ./build/bin/geth
holiman commented 1 week ago
| command | result | expected |
| --- | --- | --- |
| go run ./build/ci.go install ./cmd/geth/ | 35c3d295b8d99159196b7b0c5e8f4bb6 | 35c3d295b8d99159196b7b0c5e8f4bb6 |
| go run build/ci.go install -dlgo -arch 386 | 2cbb4ad8ee367cfb95903ce42907ea63 | be86b88d62ba2f41721b7ace4510b37c |

For the 386, the readelf -a outputs are identical, and the file sizes are identical.

Ah, it seems there's still something not quite right. I thought this was fixed by https://github.com/ethereum/go-ethereum/pull/30325 . Comparing the xxd dumps of the binaries:

user@debian-work:~/workspace/reproducible$ diff 386.have.dump 386.want.dump 
1795454,1795456c1795454,1795456
< 01b657d0: 0976 6373 2e6d 6f64 6966 6965 643d 6661  .vcs.modified=fa
< 01b657e0: 6c73 650a f932 4331 8618 2072 0082 4210  lse..2C1.. r..B.
< 01b657f0: 4116 d8f2 0000 0000 0000 0000 0000 0000  A...............
---
> 01b657d0: 0976 6373 2e6d 6f64 6966 6965 643d 7472  .vcs.modified=tr
> 01b657e0: 7565 0af9 3243 3186 1820 7200 8242 1041  ue..2C1.. r..B.A
> 01b657f0: 16d8 f200 0000 0000 0000 0000 0000 0000  ................
2540575c2540575
< 026c41e0: 20b1 ba09 d426 0000 00fc b909 7900 0000   ....&......y...
---
> 026c41e0: 20b1 ba09 d326 0000 00fc b909 7900 0000   ....&......y...
2552303c2552303
< 026f1ee0: 0867 6f31 2e32 332e 30d4 4d30 77af 0c92  .go1.23.0.M0w...
---
> 026f1ee0: 0867 6f31 2e32 332e 30d3 4d30 77af 0c92  .go1.23.0.M0w...
2552923,2552924c2552923,2552924
< 026f45a0: 6d6f 6469 6669 6564 3d66 616c 7365 0af9  modified=false..
< 026f45b0: 3243 3186 1820 7200 8242 1041 16d8 f200  2C1.. r..B.A....
---
> 026f45a0: 6d6f 6469 6669 6564 3d74 7275 650a f932  modified=true..2
> 026f45b0: 4331 8618 2072 0082 4210 4116 d8f2 0000  C1.. r..B.A.....
holiman commented 1 week ago

Make it unclean

root@25eea4e82bf9:/go-ethereum# touch fooo
root@25eea4e82bf9:/go-ethereum# git status . --porcelain
?? fooo
root@25eea4e82bf9:/go-ethereum# /usr/local/go/bin/go run build/ci.go install -dlgo -arch 386 ./cmd/geth && md5sum ./build/bin/geth
gotool.go:96: -dlgo version matches active Go version 1.23.0, skipping download.
>>> /usr/local/go/bin/go build -ldflags "--buildid=none -X github.com/ethereum/go-ethereum/internal/version.gitCommit=ada20c09dcc73149769f8c578f53c8dd71c47a2c -X github.com/ethereum/go-ethereum/internal/version.gitDate=20240823 -extldflags '-Wl,-z,stack-size=0x800000,--build-id=none,--strip-all'" -tags urfave_cli_no_docs,ckzg -trimpath -v -o /go-ethereum/build/bin/geth ./cmd/geth
github.com/ethereum/go-ethereum/cmd/geth
be86b88d62ba2f41721b7ace4510b37c  ./build/bin/geth

Bingo. The question is why the Travis build detects the VCS state as modified.
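
To check this setting on any binary without diffing hex dumps, the embedded vcs.modified string can be pulled out directly. A rough sketch; vcs_modified is a hypothetical helper name, and it simply scans printable strings rather than parsing the Go build info properly.

```shell
#!/bin/sh
# vcs_modified BINARY  (hypothetical helper name)
# Prints the distinct vcs.modified=... settings embedded in a Go binary.
vcs_modified() {
    # Break the file into printable runs, then keep only the setting itself.
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" |
        sed -n 's/.*\(vcs\.modified=[a-z]*\).*/\1/p' | sort -u
}
```

Running it on both the downloaded and the locally built binary shows at once whether one of them was built from a dirty tree.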

holiman commented 1 week ago

That did it

user@debian-work:~/workspace/reproducible$ bash fetch.sh 2>/dev/null
e3ccf8c2b20d1eb9f29b520854cfff2c  ./geth-linux-386-1.14.9-unstable-1d006bd5/geth
2808d46bb015b24e51962bcb57f23e66  ./geth-linux-amd64-1.14.9-unstable-1d006bd5/geth
user@debian-work:~/workspace/reproducible$ docker run -it  -v /home/user/workspace/reproducible/host/:/host holiman/repro
root@6834bc1a5424:/# bash build.sh 
...
amd64
2808d46bb015b24e51962bcb57f23e66  ./build/bin/geth
...
386
e3ccf8c2b20d1eb9f29b520854cfff2c  ./build/bin/geth
holiman commented 1 week ago

The rest (linux) also checks out now. The want:

4d1ac1d76f8415bfcc6577e061ce6fdd  ./geth-linux-arm5-1.14.9-unstable-1d006bd5/geth
fd85afcf9a1e7a38db02827a87e1f656  ./geth-linux-arm6-1.14.9-unstable-1d006bd5/geth
4579b82124b98dd67fd139ee223a90e4  ./geth-linux-arm64-1.14.9-unstable-1d006bd5/geth
a834a7bb127a8f60b3577661f1d7163e  ./geth-linux-arm7-1.14.9-unstable-1d006bd5/geth

And the "have"

arm5
4d1ac1d76f8415bfcc6577e061ce6fdd  ./build/bin/geth
arm6
fd85afcf9a1e7a38db02827a87e1f656  ./build/bin/geth
arm64
4579b82124b98dd67fd139ee223a90e4  ./build/bin/geth
arm7
a834a7bb127a8f60b3577661f1d7163e  ./build/bin/geth
holiman commented 1 week ago

Hm, oddly, though, sometimes when I build, the 386 one is off by one byte, b2 instead of e0.

user@debian-work:~/workspace/reproducible$ diff want.hex have.hex 
1162731c1162731
< 011bdea0: 83ec 0c68 e000 0000 e8c3 46e5 fe83 c410  ...h......F.....
---
> 011bdea0: 83ec 0c68 b200 0000 e8c3 46e5 fe83 c410  ...h......F.....

user@debian-work:~/workspace/reproducible$ md5sum ./geth-linux-386-1.14.9-unstable-1d006bd5/geth 
e3ccf8c2b20d1eb9f29b520854cfff2c  ./geth-linux-386-1.14.9-unstable-1d006bd5/geth
user@debian-work:~/workspace/reproducible$ md5sum ./host/geth 
73c9bda6ce4f4b421b9c51cdf57f7212  ./host/geth

readelf output is identical. I used ghidriff to diff them, report here: https://gist.github.com/holiman/cc3c7e35926b7aaef0bd17ffce806547 . Doesn't really tell me a whole lot.
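
When only a handful of bytes differ, listing them directly is quicker than diffing full hex dumps. A small sketch built on cmp -l and awk; diff_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# diff_bytes FILE_A FILE_B  (hypothetical helper name)
# Lists each differing byte as "<hex offset>: <byte in A> -> <byte in B>".
diff_bytes() {
    cmp -l "$1" "$2" | awk '
        # cmp -l prints a 1-based decimal offset and both byte values in octal
        function oct(s,   i, n) {
            n = 0
            for (i = 1; i <= length(s); i++) n = n * 8 + substr(s, i, 1)
            return n
        }
        { printf "%08x: %02x -> %02x\n", $1 - 1, oct($2), oct($3) }'
}
```

For the two 386 binaries above, this should report a single line at offset 011bdea4, matching the hex diff.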

Ah, ok, the problem occurs if I install the arm-gcc stuff too early. This "ruins" the regular 386-builder:

RUN apt-get -yq --no-install-suggests --no-install-recommends install gcc-arm-linux-gnueabi libc6-dev-armel-cross gcc-arm-linux-gnueabihf libc6-dev-armhf-cross gcc-aarch64-linux-gnu libc6-dev-arm64-cross
RUN ln -s /usr/include/asm-generic /usr/include/asm