Closed qiulaidongfeng closed 11 months ago
CC @golang/windows, @bcmills.
I suspect that this is a bug in work.Builder.gccToolID
.
We invoke it to compute the cache ID, but if it fails we currently omit that ID from the cache inputs instead of erroring out or caching the error text: https://cs.opensource.google/go/go/+/master:src/cmd/go/internal/work/exec.go;l=292-296;drc=4956c3437bd2f4448bcec51321f123d03731ddfc
That seems like a bug.
I'm not able to reproduce. gotip test -v cmd/go -run=TestScript/cgo_stale_precompiled -count 1
passes for me with go version devel go1.22-7ccddf040a Tue Nov 28 16:40:32 2023 +0000 windows/amd64
using winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3
.
Maybe the output of gcc -### -x c -c -
with LC_ALL=C
on the machine is unexpected due to some other configuration/env that isn't accounted for?
Just after posting that, I realized I forgot to configure CC=clang
. Now it repros! It looks like it can't find version
hidden far in the first line.
$ clang -### -x c -c -
(built by Brecht Sanders) clang version 14.0.6
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/bin
(in-process)
"C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/bin/clang.exe" "-cc1" "-triple" "x86_64-w64-windows-gnu" "-emit-obj" "-mrelax-all" "--mrelax-relocations" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "pic" "-pic-level" "2" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-mms-bitfields" "-funwind-tables=2" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3" "-resource-dir" "C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/lib/clang/14.0.6" "-internal-isystem" "C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/lib/clang/14.0.6/include" "-internal-isystem" "C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/x86_64-w64-mingw32/sys-root/mingw/include" "-internal-isystem" "C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/x86_64-w64-mingw32/include" "-internal-isystem" "C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3/mingw64/include" "-fdebug-compilation-dir=C:/tools/mingw/winlibs/winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3" "-ferror-limit" "19" "-fmessage-length=210" "-fno-use-cxa-atexit" "-fgnuc-version=4.2.1" "-exception-model=seh" "-fcolor-diagnostics" "-faddrsig" "-o" "-.o" "-x" "c" "-"
Yep, it's not expecting the (built by Brecht Sanders)
prefix there. š¤
I think https://github.com/llvm/llvm-project/blob/3a6f02a6581b49b269710eea944dc114166403ed/clang/lib/Basic/Version.cpp#L95-L97 means that any build of clang that sets VENDOR
will have a bespoke prefix, not great for this type of detection. š
An official clang is just clang version 17.0.1
. Since LLVM provides builds for Windows, it may be rare to see VENDOR being used here.
On Mac, https://stackoverflow.com/a/70697937 says its clang (Xcode's build of clang?) may give you Apple LLVM version 9.1.0 (clang-902.0.39.1)
, and another answer mentions Apple clang version 11.0.0 (clang-1100.0.33.16)
. I'm curious if this test failure would also repro on a Mac, but I don't have one. I'm surprised this wouldn't have already been reported, so maybe this is all wrong.
I haven't traced the code down to know for certain that this is the code path being used, it just seems reasonable.
I assume the buildid.go loop only checks fields 1 and 2 to avoid false positives. I'm not sure what a good fix would be without risk. Keep track of a loose match in the loop so if no perfect match is found, it falls back to the first line with a version
field anywhere?
I'm curious if this test failure would also repro on a Mac, but I don't have one. I'm surprised this wouldn't have already been reported, so maybe this is all wrong.
It appears to be passing on the Go project's macOS builders ā I see a passing run in https://results.usercontent.cr.dev/invocations/build-8763160122012974529/tests/cmd%2Fgo.TestScript%2Fcgo_stale_precompiled/results/bc3b68df-24354/artifacts/output?token=AXsiX2kiOiIxNzAxMjA3OTY1NjgwIiwiX3giOiIzNjAwMDAwIn2mPsMH_r8cNPO3GylSLoU3W2t2sCM_9iRp8sl88xW4jQ, for example.
I suspect that's because strings.Fields
treats Apple
as only a single token. Probably we should be counting tokens from the end of the line instead of the beginning?
Ah, yep, I got mixed up, it makes sense that it works in those cases. Also didn't realize those builders used clang!
Probably we should be counting tokens from the end of the line instead of the beginning?
Maybe, but I've also seen a varying number of spaces from getClangFullRepositoryVersion
:
clang version 3.8.0 (http://llvm.org/git/clang.git e2c22771566f2f28db1f4a9b2b0191ff9f2ae90c) (http://llvm.org/git/llvm.git 7a2d52ce5de3372b030c296bbfe70dedb29b46d3)
, and using multiple repos seems to still be supported in that code.Apple LLVM version 6.0 (clang-600.0.34.4) (based on LLVM 3.5svn)
. That's old, but it seems possible someone's still putting extra spaces into LLVM_REVISION
somewhere to try to make it human readable.However, could chain these together: look for version
as field 1 or 2, or len-2, len-3, len-4, then pick any line that has version
as a field.
I noticed that you can already force a false positive with the current logic by putting the compiler in a directory with <space>version<space>
in its name, like this:
$ clang.exe -### -x c -c -
(built by Brecht Sanders) clang version 14.0.6
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: C:/tools/mingw/winlibs/my version selection/[...]
(in-process)
[...]
The test passes, but the logic seems like it must be picking up the InstalledDir line. The full compiler path is a decent amount of information, so maybe it's ok for this purpose anyway.
I'm wondering now if there's something different that can be done here, like run the clang preprocessor to unambiguously grab __clang_version__
. (Although that specifically misses the vendor, which might be important to trigger rebuilds when using multiple C compilers on the same machine.)
I'm wondering now if there's something different that can be done here, like run the clang preprocessor to unambiguously grab
__clang_version__
. (Although that specifically misses the vendor[...]
Ah, of course we could get the full string if we get __clang_version__
by preprocessing and run clang.exe -### -x c -c -
, then look for the first line that includes version <__clang_version__>
, if it's worth accounting for that.
Huh. I can also reproduce this failure with Clang on Linux:
~/go-review/src$ go env
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/usr/local/google/home/bcmills/.cache/go-build'
GOENV='/usr/local/google/home/bcmills/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/usr/local/google/home/bcmills/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/usr/local/google/home/bcmills'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/google/home/bcmills/go-review'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/usr/local/google/home/bcmills/go-review/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.22-818de275c0 Mon Dec 4 19:27:31 2023 +0000'
GCCGO='/usr/bin/gccgo'
GOAMD64='v1'
AR='ar'
CC='/usr/bin/clang'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/usr/local/google/home/bcmills/go-review/src/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build593432539=/tmp/go-build -gno-record-gcc-switches'
~/go-review/src$ $(go env CC) --version
Debian clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
~/go-review/src$ go test cmd/go -run=TestScript/cgo_stale_precompiled -count=1
vcs-test.golang.org rerouted to http://127.0.0.1:41485
https://vcs-test.golang.org rerouted to https://127.0.0.1:43539
go test proxy running at GOPROXY=http://127.0.0.1:42183/mod
--- FAIL: TestScript (0.03s)
--- FAIL: TestScript/cgo_stale_precompiled (4.60s)
script_test.go:132: 2023-12-05T17:50:42Z
script_test.go:134: $WORK=/tmp/cmd-go-test-3204175725/tmpdir82752789/cgo_stale_precompiled3093312852
script_test.go:156:
# Regression test for https://go.dev/issue/47215 and https://go.dev/issue/50183:
# A mismatched $GOROOT_FINAL or missing $CC caused the C dependencies of the net
# package to appear stale, and it could not be rebuilt due to a missing $CC. (0.000s)
# This test may start with the runtime/cgo package already stale.
# Explicitly rebuild it to ensure that it is cached.
# (See https://go.dev/issue/50892.)
#
# If running in non-short mode, explicitly vary CGO_CFLAGS
# as a control case (to ensure that our regexps do catch rebuilds). (4.363s)
# https://go.dev/issue/50183: a mismatched GOROOT_FINAL caused net to be stale. (0.103s)
# https://go.dev/issue/47215: a missing $(go env CC) caused the precompiled net
# to be stale. But as of https://go.dev/cl/452457 the precompiled libraries are
# no longer installed anyway! Since we're requiring a C compiler in order to
# build and use cgo libraries in the standard library, we should make sure it
# matches what's in the cache. (0.102s)
> [!abscc] env CGO_ENABLED=1
> [!abscc] [!GOOS:plan9] env PATH='' # Guaranteed not to include $(go env CC)!
> [!abscc] [GOOS:plan9] env path=''
[condition not met]
> [!abscc] ! go build -x runtime/cgo
[stderr]
WORK=/tmp/cmd-go-test-3204175725/tmpdir82752789/cgo_stale_precompiled3093312852/tmp/go-build4109618470
script_test.go:156: FAIL: testdata/script/cgo_stale_precompiled.txt:35: go build -x runtime/cgo: unexpected success
FAIL
FAIL cmd/go 4.742s
FAIL
Now I'm wondering why it's not failing on the linux-amd64-clang
builder. š
That's why: its clang
is very old.
root@buildlet-linux-amd64-clang-rn5d265e6:~# clang --version
clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Debian clang version 14.0.6
Shouldn't this have worked because version
is fields[2]
? Or is this a different root cause?
Either way, I guess it indicates a better general solution is needed.
Marking as release-blocker because this test failure indicates a substantial compatibility problem with modern versions of a widely-used C toolchain.
I think the full path in CC
might be the cause in that case--here's what I get on Linux:
$ bin/go version
go version devel go1.22-af5d544b6d Tue Dec 5 18:18:13 2023 +0000 linux/amd64
$ clang --version
clang version 15.0.7 (Fedora 15.0.7-2.fc37)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ CC=gcc bin/go test cmd/go -run=TestScript/cgo_stale_precompiled -count=1
ok cmd/go 2.403s
$ CC=clang bin/go test cmd/go -run=TestScript/cgo_stale_precompiled -count=1
ok cmd/go 2.702s
$ CC=/usr/bin/clang bin/go test cmd/go -run=TestScript/cgo_stale_precompiled -count=1
script_test.go:156: FAIL: testdata/script/cgo_stale_precompiled.txt:35: go build -x runtime/cgo: unexpected success
...
Ah, that one is a test bug: we were forgetting to remove an explicit $CC
from the environment. That seems to be separate from the cache-key bug, which I can reproduce by injecting a prefix to the version string using a -toolexec
wrapper.
I have a fix, but it's hard to write a regression test because of other bugs (#64580). š
Change https://go.dev/cl/547998 mentions this issue: cmd/go: accept clang versions with vendor prefixes
Change https://go.dev/cl/547997 mentions this issue: cmd/go: unset CC when we remove it from PATH in TestScript/cgo_stale_precompiled
Change https://go.dev/cl/548120 mentions this issue: cmd/go: relax version regexp from CL 547998
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
clang get from https://winlibs.com/ GCC 12.1.0 + LLVM/Clang/LLD/LLDB 14.0.6 + MinGW-w64 10.0.0 (MSVCRT) - release 3
In cmd cd $GOROOT/src
go test -v cmd/go -run=TestScript/cgo_stale_precompiled
What did you expect to see?
test pass.
What did you see instead?
Output