golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/go: Build information embedded by Go 1.18 impairs build reproducibility with cgo flags #52372

Closed jefferyto closed 2 years ago

jefferyto commented 2 years ago

What version of Go are you using (go version)?

$ go version
go version go1.18.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Not relevant

What did you do?

Reproducible builds are something that OpenWrt, and many other Linux distributions, would like to achieve. We use -trimpath to remove build host-specific paths from binaries built by Go.

We also set -I options in CGO_CPPFLAGS, and -L options in -ldflags (passed to -extldflags) and CGO_LDFLAGS.

What did you expect to see?

A way to either sanitize the embedded build information or omit it entirely.

What did you see instead?

There are build host-specific paths in CGO_CPPFLAGS (the -I options), -ldflags and CGO_LDFLAGS (the -L options). There is also a build host-specific path as part of -ffile-prefix-map in CGO_CFLAGS and CGO_CXXFLAGS ~(I believe this is set as a result of -trimpath)~.

go version -m Output
$ go version -m obfs4proxy
obfs4proxy: go1.18.1
        path    gitlab.com/yawning/obfs4.git/obfs4proxy
        mod     gitlab.com/yawning/obfs4.git    (devel)
        dep     filippo.io/edwards25519 v1.0.0-rc.1.0.20210721174708-390f27c3be20       h1:iJoUgXvhagsNMrJrvavw7vu1eG8+hm6jLOxlLFcoODw=
        dep     git.torproject.org/pluggable-transports/goptlib.git     v1.0.0  h1:ElTwFFPKf/tA6x5nuIk9g49JZzS4T5WN+eTQTjqd00A=
        dep     github.com/dchest/siphash       v1.2.1  h1:4cLinnzVJDKxTCl9B01807Yiy+W7ZzVHj/KIroQRvT4=
        dep     gitlab.com/yawning/edwards25519-extra.git       v0.0.0-20211229043746-2f91fcc9fbdb      h1:qRSZHsODmAP5qDvb3YsO7Qnf3TRiVbGxNG/WYnlM4/o=
        dep     golang.org/x/crypto     v0.0.0-20210711020723-a769d52b0f97      h1:/UOmuWzQfxxo9UtlXMwuQU8CMgg1eZXqTRwkSQJWKOI=
        dep     golang.org/x/net        v0.0.0-20210226172049-e18ecbb05110      h1:qWPm9rbaAMKs8Bq/9LRpbMqxWRVUAQwMI9fVrssnTfw=
        build   -compiler=gc
        build   -ldflags="all=-buildid '1649548598' -linkmode external -extldflags '-L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/lib -L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/lib -Wl,-z,now -Wl,-z,relro'"
        build   CGO_ENABLED=1
        build   CGO_CFLAGS="-Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -ffile-prefix-map=/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/obfs4-obfs4proxy-0.0.13=obfs4-obfs4proxy-0.0.13 -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro"
        build   CGO_CPPFLAGS="-I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/include -I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/include/fortify -I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/include"
        build   CGO_CXXFLAGS="-Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -ffile-prefix-map=/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/obfs4-obfs4proxy-0.0.13=obfs4-obfs4proxy-0.0.13 -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro"
        build   CGO_LDFLAGS="-L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/lib -L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/lib -znow -zrelro"
        build   GOARCH=arm
        build   GOOS=linux
        build   GOARM=7

seankhliao commented 2 years ago

what was the env / command used to build the binary?

jefferyto commented 2 years ago

what was the env / command used to build the binary?

This is roughly the command (split into multiple lines):

GOOS="linux" \
GOARCH="arm" \
GO386="" \
GOAMD64="" \
GOARM="7" \
GOMIPS="" \
GOMIPS64="" \
GOPPC64="" \
CGO_ENABLED=1 \
CC="arm-openwrt-linux-muslgnueabi-gcc" \
CXX="arm-openwrt-linux-muslgnueabi-g++" \
CGO_CFLAGS="-Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -ffile-prefix-map=/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/obfs4-obfs4proxy-0.0.13=obfs4-obfs4proxy-0.0.13 -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro" \
CGO_CPPFLAGS="-I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/include -I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/include/fortify -I/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/include" \
CGO_CXXFLAGS="-Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -ffile-prefix-map=/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/obfs4-obfs4proxy-0.0.13=obfs4-obfs4proxy-0.0.13 -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro" \
CGO_LDFLAGS="-L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/lib -L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/lib -znow -zrelro" \
GOPATH="/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/obfs4-obfs4proxy-0.0.13/.go_work/build" \
GOCACHE="/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/tmp/go-build" \
GOMODCACHE="/media/jeff/Jekyll/Downloads/openwrt/dl/go-mod-cache" \
GOENV=off \
go install \
-modcacherw \
-v \
-buildvcs=false \
-trimpath \
-ldflags "all=-buildid '1649548598' -linkmode external -extldflags '-L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/usr/lib -L/media/jeff/Jekyll/Downloads/openwrt/testing/armvirt-32/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-11.2.0_musl_eabi/lib -Wl,-z,now -Wl,-z,relro'" \
-installsuffix "v7" \
gitlab.com/yawning/obfs4.git/obfs4proxy

Regarding -ffile-prefix-map, it's actually added by the OpenWrt build system. And I had a (slightly different) duplicate -ldflags in the earlier build; I will update the original post with the correct go version -m output.

jefferyto commented 2 years ago

Minimal example:

$ cat hello.go
package main

import "fmt"

func main() {
    fmt.Printf("hello, world\n")
}
$ CGO_CPPFLAGS=-I/foo/bar go build hello.go
$ go version -m hello
hello: go1.18.1
        path    command-line-arguments
        build   -compiler=gc
        build   CGO_ENABLED=1
        build   CGO_CFLAGS=
        build   CGO_CPPFLAGS=-I/foo/bar
        build   CGO_CXXFLAGS=
        build   CGO_LDFLAGS=
        build   GOARCH=amd64
        build   GOOS=linux
        build   GOAMD64=v1

mpx commented 2 years ago

I don't think it would make sense to modify CGO flags before storing them since the toolchain would end up storing an incorrect representation of the build. It would be better to avoid storing the flags altogether.

Making reproducible binaries with CGO is fundamentally tricky/fraught, but there are some existing workarounds:

  1. Build via a specified reproducible environment (e.g., Docker, a specific OS version/build, ...)
  2. Disable buildvcs (-buildvcs=false).
  3. Disable CGO to avoid storing CGO flags (CGO_ENABLED=0)

Potential toolchain modifications:

  4. Add an option to skip storing non-reproducible buildvcs details? This would be yet another flag, one of many needed to create a reproducible build. Or,
  5. Add a more generic -reproducible flag which makes a best effort to configure the toolchain.

Both (4) and (5) risk disappointing users since some things are outside the control of the toolchain and it may not be possible to reproduce the original build. Fundamentally, builds using CGO will require the same or a very similar environment to reproduce binaries (library versions, build options,...). This effectively requires (1) to guarantee reproducibility with CGO, or (3) to remove CGO from the equation.

Perhaps the best option is for users desiring reproducible builds to use -buildvcs=false? This is effectively what earlier Go versions provided - so no additional loss.

Reproducible builds are hard but desirable. Perhaps the answer is documentation? It would help if the project maintained docs highlighting best practices.

Cc @bcmills @matloob

jefferyto commented 2 years ago

Making reproducible binaries with CGO is fundamentally tricky/fraught

AFAIK our (OpenWrt) CGO binaries had been reproducible before 1.18.

Disable buildvcs (-buildvcs=false).

This only removes VCS information, not build information. (See #50501 for some background.)

Disable CGO to avoid storing CGO flags (CGO_ENABLED=0)

-ldflags is also captured in the build information, not just CGO flags. (You can see this in the go version -m output in the original post.)

Perhaps the best option is for users desiring reproducible builds to use -buildvcs=false? This is effectively what earlier Go versions provided - so no additional loss.

As it stands currently, using -buildvcs=false is not equivalent to what earlier Go versions provided.

mpx commented 2 years ago

Making reproducible binaries with CGO is fundamentally tricky/fraught

AFAIK our (OpenWrt) CGO binaries had been reproducible before 1.18.

Yes, I was referring to reproducible builds in general - there are many things that can break them. Reproducible builds are possible, but environments need to be tightly controlled (I agree this is now harder than before).

For reference, Restic is an example of a project that provides specific instructions on how to reproduce their builds (including specific paths).

It would be useful for the toolchain to make this easier where reasonably practical. I expect this will be a somewhat never-ending process with the inherent complexity in C/C++ environments (beyond the regression here).

-ldflags is also captured in the build information, not just CGO flags.

If CGO is disabled the -ldflags tag can be much simpler and more reproducible. I know that CGO_ENABLED=0 isn't practical in many cases so this isn't a general solution.

As it stands currently, using -buildvcs=false is not equivalent to what earlier Go versions provided.

Ah, thanks. I misremembered about -buildinfo. The original argument for removing -buildinfo was that it did not result in differences between builds -- that doesn't seem to be the case due to -ldflags. I agree this would be a good regression to fix.

bcmills commented 2 years ago

Add a more generic -reproducible flag which makes a best effort to configure the toolchain.

That is essentially what -trimpath is supposed to do. I think the solution for now is probably to omit the CGO_ settings when -trimpath is used.

Perhaps we can do something more nuanced in the future — like stamping only the non-path portions of those flags — but that would require that we know exactly which flags may contain paths. (And that seems like a whole can of worms!)
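
To illustrate why stamping only the non-path portions is hard, here is a naive sketch. It is not what the toolchain does: the function name stripPathFlags and the prefix list are assumptions made for illustration, and the list is deliberately incomplete.

```go
package main

import (
	"fmt"
	"strings"
)

// pathFlagPrefixes is an assumed, incomplete list of flag prefixes that
// usually carry a filesystem path.
var pathFlagPrefixes = []string{"-I", "-L", "-ffile-prefix-map="}

// stripPathFlags drops every whitespace-separated flag that starts with one
// of pathFlagPrefixes and joins the rest with single spaces. It does not
// understand shell quoting or the two-token form ("-I /dir").
func stripPathFlags(flags string) string {
	var kept []string
	for _, f := range strings.Fields(flags) {
		drop := false
		for _, p := range pathFlagPrefixes {
			if strings.HasPrefix(f, p) {
				drop = true
				break
			}
		}
		if !drop {
			kept = append(kept, f)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	// The joined form is filtered as intended.
	fmt.Println(stripPathFlags("-Os -pipe -I/foo/bar -L/foo/lib -Wl,-z,now"))
	// The two-token form leaks the path: only "-I" is dropped.
	fmt.Println(stripPathFlags("-I /foo/bar"))
}
```

The second call shows the problem: any flag spelling the filter does not know about lets a host path through, so a correct filter would need a complete model of every compiler's and linker's options.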

The point of stamping the cgo settings is to allow someone with a cgo-enabled binary to reproduce the build, but that's already not the case to begin with: we don't record the C compiler version, and we also don't stamp the versions of the headers in the user's C include path, nor do we stamp information about any C libraries that may have been statically linked into the binary.

bcmills commented 2 years ago

@gopherbot, please backport to Go 1.18. This is an unexpected side-effect of a change in Go 1.18, and interferes with build reproducibility.

gopherbot commented 2 years ago

Backport issue(s) opened: #53119 (for 1.18).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

gopherbot commented 2 years ago

Change https://go.dev/cl/409174 mentions this issue: cmd/go: omit CGO_*FLAGS variables from build metadata when -trimpath is set

jefferyto commented 2 years ago

Minimal example with -ldflags:

$ cat hello.go
package main

import "fmt"

func main() {
    fmt.Printf("hello, world\n")
}
$ go build -ldflags "-linkmode external -extldflags '-L/foo/bar'" hello.go 
$ go version -m hello
hello: go1.18.1
        path    command-line-arguments
        build   -compiler=gc
        build   -ldflags="-linkmode external -extldflags '-L/foo/bar'"
        build   CGO_ENABLED=1
        build   CGO_CFLAGS=
        build   CGO_CPPFLAGS=
        build   CGO_CXXFLAGS=
        build   CGO_LDFLAGS=
        build   GOARCH=amd64
        build   GOOS=linux
        build   GOAMD64=v1

gopherbot commented 2 years ago

Change https://go.dev/cl/414794 mentions this issue: [release-branch.go1.18] cmd/go: omit build metadata that may contain system paths when -trimpath is set