golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.15k stars 17.69k forks source link

cmd/go: build ID depends on the full path of Go files passed on the command line #48557

Closed abarisani closed 3 years ago

abarisani commented 3 years ago

When failing to create reproducible builds using go1.17.1, despite our best effort in ensuring that compiler and compilation flags are identical as well as GOROOT, we identified that the Build ID is dependent on the output path.

What version of Go are you using (go version)?

$ go version 1.17.1

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lcars/.cache/go-build"
GOENV="/home/lcars/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/lcars/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/lcars/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.1"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/mnt/git/public/crucible/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1855656758=/tmp/go-build -gno-record-gcc-switches"

What did you do?

$  cd /tmp/foo
$  CGO_ENABLED=0 go build -trimpath caam_tool.go && sha256sum caam_tool.go && sha256sum caam_tool
5ca6fadd863de23e3b6282d81d0af53ea328e0cdd60308299f0c6264e7de9236  caam_tool.go
8ce70082d6a861b9897a3d8b8ed6f98b18d71b764d8ca30109d016834d6e2486  caam_tool
$ cd /tmp/bar
$  CGO_ENABLED=0 go build -trimpath caam_tool.go && sha256sum caam_tool.go && sha256sum caam_tool
5ca6fadd863de23e3b6282d81d0af53ea328e0cdd60308299f0c6264e7de9236  caam_tool.go
47725ce3a6096352c9ec116a72bb423ccdb1652477bc871f3b5bb6b5076cdbf4  caam_tool

What did you expect to see?

We do not expect the build ID to be affected by the output path when building identical code under the same compiler version and flags.

What did you see instead?

The output binaries are identical with the sole exception of the build ID:

251,254c251,254
< 00000fa0: 5300 0000 0400 0000 476f 0000 4d56 7235  S.......Go..MVr5
< 00000fb0: 657a 497a 4f5a 334c 7959 5170 7748 4351  ezIzOZ3LyYQpwHCQ
< 00000fc0: 2f6d 3563 5a49 6265 2d72 5249 5938 5865  /m5cZIbe-rRIY8Xe
< 00000fd0: 4249 3846 452f 7a68 4e4d 646e 4b38 5045  BI8FE/zhNMdnK8PE
---
> 00000fa0: 5300 0000 0400 0000 476f 0000 337a 445a  S.......Go..3zDZ
> 00000fb0: 6a67 6a46 6e5a 5f62 4353 764e 4b58 704b  jgjFnZ_bCSvNKXpK
> 00000fc0: 2f49 5f59 6645 6470 4f34 796d 4937 776f  /I_YfEdpO4ymI7wo
> 00000fd0: 4e4b 6248 662f 7a68 4e4d 646e 4b38 5045  NKbHf/zhNMdnK8PE
ALTree commented 3 years ago

Previously: #33772.

You can use -ldflags=-buildid= (sets buildid to the empty string) as a temporary workaround, in the meantime.

dr2chase commented 3 years ago

Are you doing this with a go.mod file in the directory? And can you show the output of go tool buildid caam_tool, both ways?

@bcmills is this one for you?

abarisani commented 3 years ago

This happens regardless of go.mod or not, this specific example is not using any external packages (see caam_tool.go).

The buildid is the same shown in my report (the binary diff), anyway here it is:

$ cd /tmp/foo && go tool buildid caam_tool
3zDZjgjFnZ_bCSvNKXpK/I_YfEdpO4ymI7woNKbHf/zhNMdnK8PE2EMdNQVWjC/4co-SLWfCN7tlCcwHIGV

$ cd /tmp/bar &&  go tool buildid caam_tool
MVr5ezIzOZ3LyYQpwHCQ/m5cZIbe-rRIY8XeBI8FE/zhNMdnK8PE2EMdNQVWjC/4co-SLWfCN7tlCcwHIGV
seankhliao commented 3 years ago

I think it;s the way the arguments are passed (list of files), building a directory/package within a module results in consistent build ids

abarisani commented 3 years ago

The presence of a go.mod or not does not change the outcome.

If you are talking about modules installed with go install does do not enforce trimpath, so they would not get the same build ID (across different machines) for other reasons.

seankhliao commented 3 years ago

no, but go build -trimpath . (which is only possible when you have a go.mod) will result in the same build IDs

abarisani commented 3 years ago

With a go.mod'ed project I get different build IDs depending on the location:

$ cd /tmp/foo
$ git clone https://github.com/f-secure-foundry/crucible && cd crucible
$ make CGO_ENABLED=0 BUILD=test crucible
go build -v \
  -trimpath -ldflags "-s -w -X 'main.Revision=ac44e57' -X 'main.Build=test'" \
  cmd/crucible/crucible.go
command-line-arguments
compiled crucible ac44e57 (test)
$ sha256sum crucible 
64b97114614fa3315742fd61f3a0ae2ab5f65bb91550e1232650440d2b267981  crucible

$ cd /tmp/bar
$ git clone https://github.com/f-secure-foundry/crucible && cd crucible
$ make CGO_ENABLED=0 BUILD=test crucible
go build -v \
  -trimpath -ldflags "-s -w -X 'main.Revision=ac44e57' -X 'main.Build=test'" \
  cmd/crucible/crucible.go
command-line-arguments
compiled crucible ac44e57 (test)
$ sha256sum crucible 
b51b76f53279df695722ce54bbf079243414215b045174955c9f02584dab4e4a  crucible

However the earlier caam_tool.go example using go build -trimpath . after go mod init does lead to the same build.

The rules are not clear to me however, what's the difference between the two cases. In my mind, they should behave identically.

FiloSottile commented 3 years ago

We did some debugging with @abarisani and we realized it's not the output path that ends up hashed in the Build ID, but the full path of the input files, because go build cmd/crucible/crucible.go uses explicit Go files on the command line (the synthetic command-line-arguments package).

With -trimpath, the paths to these files should be made relative to the main module (assuming the files are within it). In fact, it looks like they are relative as embedded in the binary, because otherwise we'd see other differences, so it's probably a BuildID-specific issue.

(I wish we were more aggressive in discouraging the use of Go files on the command line. A lot of things, from build tags to multi-file packages if a file is left out, subtly breaks.)

FiloSottile commented 3 years ago

It would make this debugging much easier if go tool buildid had a -v flag that listed all that gets hashed. Should I open an issue for it?

jayconrod commented 3 years ago

Setting GODEBUG=gocachehash=1 when building a package should cause the go command to print everything that goes into cache keys. go tool buildid won't have that information. go help cache says a bit more about this.

jayconrod commented 3 years ago

I tried this in a quick example. Here's the cache key I'm seeing:

HASH[build command-line-arguments]
HASH[build command-line-arguments]: "devel go1.18-8ff94bbecb Wed Sep 22 13:48:21 2021 -0700"
HASH[build command-line-arguments]: "compile\n"
HASH[build command-line-arguments]: "goos linux goarch amd64\n"
HASH[build command-line-arguments]: "import \"command-line-arguments\"\n"
HASH[build command-line-arguments]: "omitdebug false standard false local true prefix \"_/home/jay/Code/test/a\"\n"
HASH[build command-line-arguments]: "trimpath\n"
HASH[build command-line-arguments]: "modinfo \"path\\tcommand-line-arguments\\n\"\n"
HASH[build command-line-arguments]: "compile tdLvclvHAV-OCkdMX9DV [] []\n"
HASH[build command-line-arguments]: "GOAMD64=v1\n"
HASH /home/jay/Code/test/a/main.go: 55a60bb97151b2b4b680462447ce60ec34511b14fa10d77440c97b9777101566
HASH[build command-line-arguments]: "file main.go VaYLuXFRsrS2gEYkR85g\n"
HASH[build command-line-arguments]: "import runtime -vV7n2SyuxUHQNgBVBri\n"
HASH[build command-line-arguments]: 7b6cdecf56c1e3618c9976560d07fa292928f32ba7b821fa31d29891f8114128

The bold string above is an absolute path to the package's local prefix, which is used to resolve relative imports like:

import "../foo"

Those are only allowed in packages specified as a list of files on the command line. I agree with @FiloSottile: I wish these weren't a thing.

I can't think of a good reason for the local prefix to be part of the cache key, and it doesn't really seem like it should be a distinct field from the package directory anyway. By the time the go command computes the cache key, it should already have resolved imports, and the output ids from those imports will be part of the cache key.

We do pass the local prefix to the compiler in gc.go. Not sure why the compiler needs it or if it affects the output.

bcmills commented 3 years ago

We do pass the local prefix to the compiler in gc.go. Not sure why the compiler needs it or if it affects the output.

Let's try not doing that and see what breaks! 😁

dr2chase commented 3 years ago

The flag is consumed here: https://cs.opensource.google/go/go/+/refs/tags/go1.17.1:src/cmd/compile/internal/noder/import.go;l=160 Any absolute paths are certainly visible to the compiler; w/o them, debugging gets a little harder.

bcmills commented 3 years ago

Yeah. Perhaps we can start by just not passing that flag when -trimpath is set.

jayconrod commented 3 years ago

@dr2chase Does the -D flag have any influence on the output, or is it only used in error messages?

I'd expect that cmd/go could pass in an importcfg file that just maps import strings to .a file paths. Ideally the compiler wouldn't need -D at all when -importcfg is used.

dr2chase commented 3 years ago

I am not sure. There's a lot going on. The fact that debugging seems to be able to find the source file (in the default case, at least) suggests that at least some absolute paths are embedded in the debugging information. E.g.:

Breakpoint 1 (enabled) set at 0x1054c8a for main.main() /Users/drchase/work/wtf/wtf.go:8

(this in a an executable that I copied into /tmp/bar, also debugged in that directory)

dr2chase commented 3 years ago

But with -go build -trimpath ., it declares that the file is in wtf/wtf.go

dr2chase commented 3 years ago

The leading wtf is the directory prefix, not the module name. I am not sure why, if we put a directory there, wtf is any better than ..

Debugging in the directory where I built it, delve is kinda puzzled:

cd $wtf
# drchase-macbookpro1 ~/work/wtf 21-09-23 17:47:52
dlv exec ./wtf
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 (enabled) set at 0x1054c8a for main.main() wtf/wtf.go:8
(dlv) c
> main.main() wtf/wtf.go:8 (hits goroutine(1):1 total:1) (PC: 0x1054c8a)
Warning: debugging optimized function
(dlv) list
> main.main() wtf/wtf.go:8 (hits goroutine(1):1 total:1) (PC: 0x1054c8a)
Warning: debugging optimized function
Command failed: open wtf/wtf.go: not a directory
gopherbot commented 3 years ago

Change https://golang.org/cl/352810 mentions this issue: cmd/go: do not pass a local prefix to the compiler in module mode

bcmills commented 3 years ago

CL 352810 should fix this in module mode.

A fix in GOPATH mode would be much more invasive and doesn't seem worth doing: as far as I can tell this isn't a regression from previous Go releases, and triggering the problematic behavior would require the combination of GOPATH mode and local imports — each of which should be rare and becoming rarer.

bcmills commented 3 years ago

This is now fixed in module mode, and I do not plan to address it in GOPATH mode, so closing as “as fixed as it's going to be”.