golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.32k stars 17.47k forks source link

cmd/dist: cross-libc same os/arch build failing since go 1.10 #25177

Open anisse opened 6 years ago

anisse commented 6 years ago

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.1 linux/amd64

I also have reproduced this with the master branch from a few days ago.

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64" GOBIN="" GOCACHE="/home/anisse/.cache/go-build" GOEXE="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOOS="linux" GOPATH="/home/anisse/bin/prefix/gopath" GORACE="" GOROOT="/home/anisse/dev/buildroot/musl_min/host/lib/go" GOTMPDIR="" GOTOOLDIR="/home/anisse/dev/buildroot/musl_min/host/lib/go/pkg/tool/linux_amd64" GCCGO="gccgo" CC="/usr/bin/gcc" CXX="/usr/bin/g++" CGO_ENABLED="0" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build290222412=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Since go 1.10, cross-building go for a different libc on the same arch and os, stopped working. This is due to the new build caching infrastructure calling "go list" during the build.

Technically, go made the decision long ago (why?) to always use $CC even if $CC_FOR_TARGET is available when gohostos=goos and gohostarch=goarch. This worked until the new go caching infrastructure changed way go is built, and "go list" started to be called during build.

Note: obviously, this only happens when building with CGO_ENABLED=1.

I was able to workaround this by removing the legacy build behaviour: https://github.com/golang/go/blob/master/src/cmd/dist/build.go#L258-L260

What did you expect to see?

Building go works.

What did you see instead?

Building go failed.

This can be seen in the buildroot autobuild logs for musl and uclibc x86_64 targets: http://autobuild.buildroot.net/results/2ba/2ba1eee95bdad910549ef64fc9a075d2136b4936/build-end.log

It tries calling "go list" built using the cross-libc compiler and fails with "no such file or directory" since it's not the same ld linker.

You can find other uclibc/musl failed logs here: http://autobuild.buildroot.net/?start=100&reason=host-go%

As well as the buildroot host-go package: https://git.buildroot.net/buildroot/tree/package/go/go.mk (requires knowledge of buildroot).

anisse commented 6 years ago

Cc @angeloc @tpetazzoni

andybons commented 6 years ago

@ianlancetaylor

ianlancetaylor commented 6 years ago

Your log (http://autobuild.buildroot.net/results/2ba/2ba1eee95bdad910549ef64fc9a075d2136b4936/build-end.log) seems to show that you are building the same version of Go with the same GOROOT and GOBIN with different values of CC_FOR_TARGET and CXX_FOR_TARGET. That doesn't seem useful. What am I missing?

The error I see in that log is

fork/exec /accts/mlweber1/rclinux/rc-buildroot-test/scripts/instance-2/output/build/host-go-1.10/bin/go: no such file or directory

Does that file exist? Or does the error indicate that it is trying to use a dynamic linker that does not exist?

ianlancetaylor commented 6 years ago

Have you considered adding go clean -cache in between your two invocation of make.bash?

anisse commented 6 years ago

I think it's built twice as a workaround to precisely this issue: https://github.com/buildroot/buildroot/commit/60c5c96ae109dcea48e0997b5e6e6645ac9b8629#diff-bcd31d4b5367d8b956f86a366dae86f5

The file exists but fork/exec can't run it because of a dynamic linker that does not exists.

I thought of building with GOCACHE=off, but not to clean between the two builds, I'll try that.

anisse commented 6 years ago

go clean -cache doesn't help. The second time we build is in fact what we really need: a go binary that can run on the host, but build binaries aimed for the target.

We could obtain this in one pass if $CC_FOR_TARGET wasn't used for $CC when gohostos=goos and gohostarch=goarch here: https://github.com/golang/go/blob/master/src/cmd/dist/build.go#L249-L260

Instead, the buildroot maintainers did this dance where the go bin and pkg/tool are built once for the host (the first time), and then the lib, pkg/linux_* are built for the target. This workaround used to work until the caching infrastructure started calling the go binary for the current build when building packages.

If we ignore this workaround, we see that if we run only the "second" pass from a clean environment, it doesn't even complete (it used to, before go 1.10); and that is the problem I'm reporting here. If we could also get rid of the workaround, it would be great.

ianlancetaylor commented 6 years ago

It sounds like you are doing a native build using a C compiler that doesn't generate working executables. I would not expect that to work. I understand that perhaps it used to work, but I think that was by accident.

I think the code in cmd/dist is correct: if you are doing a native build, then CC_FOR_TARGET is the C compiler that is correct for the native build. You are suggesting that we draw a distinction for a native build between CC and CC_FOR_TARGET, but I don't see anything to base that on. We're going to go ahead and run native tests, and at that point we expect to have a working C compiler.

It's not clear to me that this has anything to do with the caching infrastructure as such.

Perhaps I misunderstand but it seems to me that you are asking for a new feature (that used to work accidentally): the ability to do a native build with a non-working C compiler with CGO_ENABLED=1. Something like GO_ASSUME_CROSSCOMPILING=1.

anisse commented 6 years ago

I think we've reached the crux of the issue. What you call a native build, is only a native build in a static-only world; here it is cross-compilation since the go bootstrap relies on the C toolchain (when CGO_ENABLED=1), which targets a different machine; this machine only happens to be running the same OS/arch and a different libc. This type of cross compilation already works with hundreds of packages in buildroot, and even more in the wild.

Note that we call make.bash here, not all.bash, so the tests don't run.

I did not mean to say it had specifically to do with the caching infrastructure; I wanted to say its introduction brought a regression.

We can agree to call this a feature request instead of a regression; the reason I opened this bug was to make sure the Go team is aware of how people use it in the wild, and then can make a decision to reject or accept this (already existing) use case. This specific issue might be worked around with a four-lines patch.

Whether to add another environment variable or not is a matter of taste; I think it would be better to remove the specific of handling of gohostos=goos and gohostarch=goarch to detect a native build.

ianlancetaylor commented 6 years ago

Thanks, but I don't see how it could be correct to remove the special handling of a native build. A native build really is special. I think the issue here is that in your case, which is an unusual one, simply testing GOHOSTOS == GOOS && GOHOSTARCH == GOARCH is not sufficient to detect whether this is a native build or not.

gopherbot commented 6 years ago

Change https://golang.org/cl/112156 mentions this issue: build.go: explicit option for crosscompilation

ianlancetaylor commented 6 years ago

CL 112156 fixes this by adding a new environment variable which can be set to indicate that we are cross-compiling. Marking as needs-decision.

rsc commented 6 years ago

The build understands at a fairly fundamental level that cross builds happen due to GOHOSTOS != GOOS or GOHOSTARCH != GOARCH. You're asking for a third condition, and I'm not convinced that Ian's CL is enough with the cache. Honestly the best workaround might be to set GOHOSTARCH=386, since then the go command will understand that you are cross-compiling (assuming your 64-bit system can run 32-bit binaries).

The fact that this "used to work" before build caching was accidental at best. I'm not sure what we should do moving forward. If you have a different workaround, please feel free to use that for now too.

anisse commented 6 years ago

@rsc I don't understand what's missing from @angeloc's CL that would not be enough with the cache ? This seemed enough to generate a proper cross-toolchain, and then use it to compile working binaries.

Any test we might run to see what's missing ?

ianlancetaylor commented 6 years ago

@anisse @rsc's point is that the go tool is generally designed on the assumption that if you have the same GOROOT and GOARCH then you have the same kind of system. Since that assumption is part of the design of the go tool, we don't know all the places where changing that assumption might fail.