Open bjohnso5 opened 3 weeks ago
I got the error when building an arm64 image on an amd64 host machine with buildx:
docker buildx create --use --name=baker --driver docker-container --platform=linux/amd64 --platform=linux/arm64
docker buildx build --builder baker --platform=linux/amd64 --platform=linux/arm64 -t {tag} --push .
Then I tried to run it manually with docker run -it --rm --platform linux/arm64 {tag}
After running the unzip command, I got the same error: can't start telemetry child process: fork/exec /usr/local/go/bin/go: invalid argument
But when I exec chmod a+x ${GOROOT}/bin/* in the container, it works, even though no permissions actually change. However, after adding this command to the Dockerfile, the error was not resolved.
Dockerfile example:
FROM almalinux:9.4-20240530
ENV GOROOT=/usr/local/go \
GOLANG_VERSION=1.23.0 \
GOPATH=/go
ENV PATH=$GOPATH/bin:$PATH:$GOROOT/bin
RUN set -eox pipefail \
&& dnf install -y curl \
&& mkdir -p "${GOROOT}" "$GOPATH/src" "$GOPATH/bin" && chmod -R 1777 "$GOPATH" \
&& curl -sSL "https://go.dev/dl/go${GOLANG_VERSION}.linux-$(cat < /etc/arch).tar.gz" | tar -zxvC ${GOROOT} --strip-components=1 \
# && chmod a+x ${GOROOT}/bin/* \
&& go version
WORKDIR $GOPATH
With the circleci dockerfile, I get a segfault in gcc cc1 rather than something in go directly:
> [linux/arm64 5/5] RUN GO install "golang.org/x/vuln/cmd/govulncheck@v1.1.3" && go clean -cache -modcache && rm -rf "/home/circleci/go/pkg":
0.120 + go install golang.org/x/vuln/cmd/govulncheck@v1.1.3
0.528 go: downloading golang.org/x/vuln v1.1.3
1.116 go: downloading golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7
1.120 go: downloading golang.org/x/mod v0.19.0
1.120 go: downloading golang.org/x/tools v0.23.0
1.171 go: downloading golang.org/x/sync v0.7.0
50.74 # net
50.74 gcc: internal compiler error: Segmentation fault signal terminated program cc1
50.74 Please submit a full bug report,
50.74 with preprocessed source if appropriate.
50.74 See <file:///usr/share/doc/gcc-11/README.Bugs> for instructions.
------
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
2 warnings found (use docker --debug to expand):
- LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 27)
- LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 28)
Dockerfile:48
--------------------
46 | USER circleci
47 |
48 | >>> RUN go install "golang.org/x/vuln/cmd/govulncheck@v${GOVULNCHECK_VERSION}" && go clean -cache -modcache && rm -rf "${GOPATH}/pkg"
49 |
--------------------
ERROR: failed to solve: process "/dev/.buildkit_qemu_emulator /bin/bash -exo pipefail -c go install \"golang.org/x/vuln/cmd/govulncheck@v${GOVULNCHECK_VERSION}\" && go clean -cache -modcache && rm -rf \"${GOPATH}/pkg\"" did not complete successfully: exit code: 1
Might this have something to do with apparmor or other host "controls"? I was able to run the multiarch build on an Ubuntu 22.04 host using Docker 24.0.7 (from Ubuntu's packages) without errors, and inside the resulting arm64 container was able to run without errors:
Could you run the failing command under strace -F so we can see exactly which system call is failing?
cc @golang/telemetry
CC @matloob
Independent of the root cause, a failure to start the telemetry child process shouldn't prevent the go command from being used.
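To illustrate the point above, here is a minimal sketch of a best-effort child start that reports the failure and lets the parent carry on. This is not the actual x/telemetry or cmd/go code; the function name startBestEffort and the path /nonexistent/helper are hypothetical and chosen only to force a start failure.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// startBestEffort tries to launch an optional helper process.
// On failure it logs the error and returns false instead of exiting,
// so the caller can keep running without the helper.
func startBestEffort(path string, args ...string) bool {
	cmd := exec.Command(path, args...)
	if err := cmd.Start(); err != nil {
		fmt.Fprintf(os.Stderr, "can't start helper process: %v (continuing)\n", err)
		return false
	}
	// Reap the child in the background so it does not linger as a zombie.
	go cmd.Wait()
	return true
}

func main() {
	// Hypothetical path that does not exist, so Start fails.
	started := startBestEffort("/nonexistent/helper")
	fmt.Println("helper started:", started)
	fmt.Println("main work still runs")
}
```

The design choice is simply that an optional side process is treated as optional: its start error is surfaced on stderr but never turned into a fatal exit of the main command.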
Could you run the failing command under strace -F so we can see exactly which system call is failing?
It appears the ptrace function(s) aren't implemented in the emulation environment:
Not sure if this is helpful, but I'm attaching two strace -f output files from the linux/arm64 golang:1.23.0 and golang:1.22.6 official images running go env. Note that these are in the successful case, but I'm hoping it might help with comparison if required.
Moved to Go1.24 milestone since this needs to be fixed on the main branch first (for Go 1.24), before being considered for backporting. Please use the usual process (https://go.dev/wiki/MinorReleases) to create a separate backport tracking issue in the Go1.23.1 milestone.
@findleyr It's important that issues in the minor milestones are the backport kind with a CherryPickCandidate label, otherwise we might miss them in our release meeting review. Thanks.
Thanks again @dmitshur.
@gopherbot please backport this issue to 1.23: it is a regression that breaks the go command in certain environments.
Backport issue(s) opened: #68995 (for 1.23).
Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.
Change https://go.dev/cl/607595 mentions this issue: telemetry: do not crash parent if child could not be started
Change https://go.dev/cl/609195 mentions this issue: gopls: update x/telemetry to pick up recent bug fixes
Change https://go.dev/cl/609196 mentions this issue: [gopls-release-branch.0.16] gopls: update x/telemetry to pick up recent bug fixes
Change https://go.dev/cl/609256 mentions this issue: cmd: vendor golang.org/x/telemetry@e553cd4b
Change https://go.dev/cl/609136 mentions this issue: [internal-branch.go1.23-vendor] telemetry: do not crash parent if child could not be started
Change https://go.dev/cl/609237 mentions this issue: cmd: vendor golang.org/x/telemetry@a797f33
Might this have something to do with apparmor or other host "controls"?
I can reproduce with both apparmor and seccomp explicitly disabled (this is on Debian Stable with Debian's qemu-user-static package installed):
$ docker run --rm --pull=always --platform linux/arm64/v8 --security-opt seccomp=unconfined --security-opt apparmor=unconfined golang:1.23 go version
1.23: Pulling from library/golang
Digest: sha256:613a108a4a4b1dfb6923305db791a19d088f77632317cfc3446825c54fb862cd
Status: Image is up to date for golang:1.23
WARNING: image with reference golang was found but does not match the specified platform: wanted linux/arm64/v8, actual: linux/amd64
can't start telemetry child process: fork/exec /usr/local/go/bin/go: invalid argument
Could you run the failing command under strace -F so we can see exactly which system call is failing?
It appears the ptrace function(s) aren't implemented in the emulation environment:
I've attached a full log with QEMU_STRACE=1 set (which is apparently the way to strace these QEMU calls correctly):
For comparison, here's the same log but on the (working) 1.22 release: go-version-strace.log
Thanks for the logs. The offending call is:
1 pidfd_open(1,0) = 9
1 pidfd_send_signal(9,0,NULL,0) = 0
1 clone(CLONE_VM|CLONE_VFORK|0x1011,child_stack=0x0000000000000000,parent_tidptr=0x00000040002872b8,tls=0x0000000000000000,child_tidptr=0x0000000000000000) = -1 errno=22 (Invalid argument)
Flag 0x1000 is CLONE_PIDFD. I'd assume that is the flag QEMU is complaining about.
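As a worked example of reading that flags argument, the sketch below decodes the value from the failing clone call. The constants are the Linux clone(2) values as I understand them (CLONE_VM 0x100, CLONE_PIDFD 0x1000, CLONE_VFORK 0x4000, exit signal in the low byte), with SIGCHLD assumed to be 17 as on x86-64 and arm64; decodeCloneFlags is a hypothetical helper, not part of any Go package.

```go
package main

import (
	"fmt"
	"strings"
)

// Linux clone(2) flag values, defined locally so the sketch builds on any OS.
const (
	csignal    = 0x000000ff // low byte: signal sent to the parent when the child exits
	cloneVM    = 0x00000100
	clonePidfd = 0x00001000
	cloneVfork = 0x00004000
	sigchld    = 17 // assumption: SIGCHLD number on Linux x86-64 and arm64
)

// decodeCloneFlags renders a clone flags word as a symbolic string.
func decodeCloneFlags(flags uint64) string {
	var parts []string
	named := []struct {
		bit  uint64
		name string
	}{
		{cloneVM, "CLONE_VM"},
		{clonePidfd, "CLONE_PIDFD"},
		{cloneVfork, "CLONE_VFORK"},
	}
	for _, f := range named {
		if flags&f.bit != 0 {
			parts = append(parts, f.name)
			flags &^= f.bit
		}
	}
	// Whatever is left in the low byte is the exit signal.
	if sig := flags & csignal; sig != 0 {
		if sig == sigchld {
			parts = append(parts, "SIGCHLD")
		} else {
			parts = append(parts, fmt.Sprintf("signal(%d)", sig))
		}
		flags &^= csignal
	}
	// Report any bits this sketch does not know about.
	if flags != 0 {
		parts = append(parts, fmt.Sprintf("0x%x", flags))
	}
	return strings.Join(parts, "|")
}

func main() {
	// CLONE_VM|CLONE_VFORK|0x1011, as printed in the log above.
	fmt.Println(decodeCloneFlags(cloneVM | cloneVfork | 0x1011))
	// prints "CLONE_VM|CLONE_PIDFD|CLONE_VFORK|SIGCHLD"
}
```

So the unexplained 0x1011 splits into CLONE_PIDFD (0x1000) plus the SIGCHLD exit signal (0x11), which matches the reading above.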
What version of QEMU are you using? CLONE_PIDFD support appears to be added in https://github.com/qemu/qemu/commit/895ce8bb534e66ca418dea62ae67a92dccafb2e1 (QEMU 8.0).
We test that syscalls pidfd_open and pidfd_send_signal work before attempting to use CLONE_PIDFD. In Linux, support of those guarantees support for CLONE_PIDFD, but perhaps not in QEMU?
It looks like pidfd_open, etc. were added in https://github.com/qemu/qemu/commit/cc054c6f139cf54ce8fbefd6fd536f50b4cba694 (QEMU 7.2), prior to CLONE_PIDFD...
@tianon Could you try cherry-picking https://go.dev/cl/592077 and https://go.dev/cl/592078 to see if they fix the issue?
You can get a cherry pick command from Gerrit from the upper right "..." menu -> "Download patch".
Change https://go.dev/cl/609355 mentions this issue: [release-branch.go1.23] cmd: vendor golang.org/x/telemetry@internal-branch.go1.23-vendor
What version of QEMU are you using?
For my working report, I've got Ubuntu 22.04's build 1:6.2+dfsg-2ubuntu6.22 (i.e. 6.x), so before any of the pidfd support from the sounds of it.
The CircleCI folks can confirm (I'm just an interested user/customer of theirs) but it looks like their build process is using multiarch/qemu-user-static:latest which hasn't been updated in 2 years (!) and reports running version 7.2.0 (Debian 1:7.2+dfsg-1~bpo11+2), which is in that "inverted support ordering" window.
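The window described above can be sketched as a small version check. The thresholds (pidfd_open from QEMU 7.2, CLONE_PIDFD from QEMU 8.0) are taken from the two QEMU commits cited earlier in this thread; the type and function names are hypothetical, and distribution backports could shift the real boundaries, so treat this as an approximation rather than a reliable detector.

```go
package main

import "fmt"

// support records which pidfd features a QEMU user-mode emulator is
// expected to implement, based only on its upstream version number.
type support struct {
	pidfdOpen  bool // pidfd_open / pidfd_send_signal emulated
	clonePidfd bool // clone(CLONE_PIDFD) emulated
}

// qemuPidfdSupport applies the thresholds quoted in the thread:
// pidfd_open in 7.2, CLONE_PIDFD in 8.0.
func qemuPidfdSupport(major, minor int) support {
	atLeast := func(wantMajor, wantMinor int) bool {
		return major > wantMajor || (major == wantMajor && minor >= wantMinor)
	}
	return support{
		pidfdOpen:  atLeast(7, 2),
		clonePidfd: atLeast(8, 0),
	}
}

// probeMisleads reports whether a check that only tests pidfd_open
// would wrongly conclude that CLONE_PIDFD is usable.
func (s support) probeMisleads() bool {
	return s.pidfdOpen && !s.clonePidfd
}

func main() {
	// Versions mentioned in this thread: 6.2 (works), 7.2 (fails), 8.1 (works).
	for _, v := range [][2]int{{6, 2}, {7, 2}, {8, 1}} {
		s := qemuPidfdSupport(v[0], v[1])
		fmt.Printf("QEMU %d.%d: pidfd_open=%t CLONE_PIDFD=%t misleading=%t\n",
			v[0], v[1], s.pidfdOpen, s.clonePidfd, s.probeMisleads())
	}
}
```

Only the 7.2 line reports misleading=true, which is consistent with the reports here: 6.x fails the probe outright and 8.x supports both features.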
Change https://go.dev/cl/609596 mentions this issue: [gopls-release-branch.0.16] update telemetry to match Go 1.23.1
https://github.com/golang/go/commit/4f852b9734249c063928b34a02dd689e03a8ab2c definitely doesn't fix this issue.
@tianon can you say more? That change should have avoided failing the Go binary when the telemetry child process fails to start. I agree it doesn't fix the underlying issue. Is that what you meant?
Change https://go.dev/cl/609635 mentions this issue: gopls: update x/telemetry dependency
Yes, I mean this issue shouldn't be closed by that. This issue affects more than just telemetry.
On the machine where I can reproduce, I'm on QEMU version 7.2.11 (Debian Bookworm's 1:7.2+dfsg-7+deb12u6). On another machine, I'm running Docker Desktop which apparently is using version 8.1.5 and works fine (which I think all tracks with your digging).
I've tried building 9e8ea567c838574a0f14538c0bbbd83c3215aa55 (which is current HEAD at the time of this testing), even using Go 1.22 as bootstrap, and it fails almost right away, as expected (go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument).
After cherry-picking both those CLs:
$ git fetch https://go.googlesource.com/go refs/changes/77/592077/2
$ git cherry-pick FETCH_HEAD
Auto-merging src/os/pidfd_other.go
[detached HEAD 918ee84746] os: error if CLONE_PIDFD does not work properly
Author: Michael Pratt <mpratt@google.com>
Date: Tue Jun 11 11:10:00 2024 -0400
4 files changed, 18 insertions(+), 8 deletions(-)
$ git fetch https://go.googlesource.com/go refs/changes/78/592078/2
$ git cherry-pick FETCH_HEAD
[detached HEAD 2574d2e27ed] os: add clone(CLONE_PIDFD) check to pidfd feature check
Author: Michael Pratt <mpratt@google.com>
Date: Tue Jun 11 16:34:38 2024 -0400
2 files changed, 97 insertions(+), 3 deletions(-)
The build worked great right up until "Building Go toolchain3 using go_bootstrap and Go toolchain2.", at which point it seemed to be stuck, so I left it all night and now hours and hours later it's still just sitting there. :disappointed:
Tried again, because I noticed I was setting GOHOST* :upside_down_face:
Got a successful build with those two CLs (cherry-picked exactly as described in my comment above), and the build appears to work correctly:
$ /usr/local/go/bin/linux_arm64/go version
go version devel go1.24-d53c47bd71a Thu Aug 29 16:55:15 2024 +0000 linux/arm64
@tianon It sounds like this is fixed? I'll mark this closed.
No, only if both https://go.dev/cl/592077 and https://go.dev/cl/592078 are merged (that's what I was testing, per request from @prattmic above in https://github.com/golang/go/issues/68976#issuecomment-2316288980).
Ah, sorry for the misunderstanding.
For comparison, here's a fully stock build (i.e., NOT cherry-picking those two CLs) of 00c48ad6155a209841dbfb6154f650c622aaa10b (again, current HEAD at the time of my testing):
$ /usr/local/go/bin/linux_arm64/go version
can't start telemetry child process: fork/exec /usr/local/go/bin/linux_arm64/go: invalid argument
go version devel go1.24-00c48ad Thu Aug 29 18:03:48 2024 +0000 linux/arm64
So the telemetry doesn't make go version totally fail anymore, but attempting to do anything more complex then fails at the first exec, such as trying to invoke compile:
$ /usr/local/go/bin/linux_arm64/go install golang.org/x/vuln/cmd/govulncheck@latest
go: downloading golang.org/x/vuln v1.1.3
go: downloading golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7
go: downloading golang.org/x/mod v0.19.0
go: downloading golang.org/x/tools v0.23.0
go: downloading golang.org/x/sync v0.7.0
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
Yeah, I didn't think closely enough about this initially. I thought the code starting the telemetry subprocess was doing something special, but the real bug here is that any os.StartProcess / os/exec is broken.
Hi everyone!
I have the same issue. Is there any way to switch off the telemetry fork process for a while?
Turning off telemetry doesn't fix it -- it's literally os/exec itself that's broken (so if it isn't telemetry triggering it, it'll be forking the compile process instead), as noted in the last few comments before yours. :see_no_evil:
Is it possible to use a lower version of Go to avoid this error?
This is a 1.23 / dev 1.24 bug. Yes, if one is running into this (as I did), they could consider downgrading to 1.22 until 1.23/dev 1.24 is fixed.
FWIW, Debian bookworm has QEMU 7.2 in its repository, which is probably why so many people have hit this.
To reproduce this issue:
go.mod:
module example.com/app
go 1.23
main.go:
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Child mode: print a greeting and exit.
	if os.Getenv("TEST_SUBPROCESS") == "1" {
		fmt.Println("Hello from child")
		return
	}

	exe, err := os.Executable()
	if err != nil {
		panic(err)
	}

	// Parent mode: re-exec this binary as a child via os/exec.
	cmd := exec.Command(exe)
	cmd.Env = append(cmd.Environ(), "TEST_SUBPROCESS=1")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
Dockerfile:
FROM debian:bookworm
RUN apt-get update && apt-get install -y qemu-user-static
COPY app app
CMD ["qemu-x86_64-static", "app"]
$ go build # outside the container
$ docker build -t issue68976_qemu .
$ docker run --rm -it issue68976_qemu
Without https://go.dev/cl/592078:
panic: fork/exec /app: invalid argument
goroutine 1 [running]:
main.main()
/usr/local/google/home/mpratt/Downloads/issue68976_qemu/main.go:25 +0x17a
With https://go.dev/cl/592078:
Hello from child
Having a similar issue here: https://github.com/stakwork/sphinx-tribes/actions/runs/10724756996/job/29741116565
Will update if we find a workaround.
Update: we fixed it with this Dockerfile change: https://github.com/stakwork/sphinx-tribes/commit/acca2f6def2160f78cd1d77f16d6174d35d3677e
We're also having this issue with Go 1.23.1, using a Bookworm-based image. This includes both the "can't start telemetry child process" and "error obtaining buildID for go tool compile" errors. It would be much appreciated to have this fix backported to Go 1.23.x.
Same here. Building on a Debian host, but the image was an Alpine-based one. I have had this error since 1.23.1; I did not have it with 1.23.0.
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
@alicethorne-ab I'd like to confirm: are you seeing a fatal "can't start telemetry child process" error? In 1.23.1 we still log the error, but it shouldn't be a fatal error anymore.
I have the same problem. I can see the error "can't start telemetry child process" in my log, but as you said, it is not fatal, so it is working properly.
Unfortunately, now I see the same error @the-hotmann is referring to when I try to use go install or go run. That error is fatal (exit code 1).
$ go run ./test.go
can't start telemetry child process: fork/exec /usr/local/go/bin/go: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
$ cat test.go
package main

import "fmt"

func main() {
	fmt.Println("Hello world");
}
$ arch
aarch64
Here is the output of go install:
$ go install github.com/amacneil/dbmate/v2@v2.4.0
go: downloading github.com/amacneil/dbmate/v2 v2.4.0
go: downloading github.com/joho/godotenv v1.5.1
go: downloading github.com/urfave/cli/v2 v2.25.5
go: downloading github.com/lib/pq v1.10.9
go: downloading github.com/go-sql-driver/mysql v1.7.1
go: downloading github.com/ClickHouse/clickhouse-go/v2 v2.10.0
go: downloading github.com/ClickHouse/ch-go v0.56.0
go: downloading github.com/andybalholm/brotli v1.0.5
go: downloading github.com/pkg/errors v0.9.1
go: downloading go.opentelemetry.io/otel/trace v1.16.0
go: downloading go.opentelemetry.io/otel v1.16.0
go: downloading github.com/google/uuid v1.3.0
go: downloading github.com/paulmach/orb v0.9.2
go: downloading github.com/shopspring/decimal v1.3.1
go: downloading gopkg.in/yaml.v3 v3.0.1
go: downloading github.com/cpuguy83/go-md2man/v2 v2.0.2
go: downloading github.com/xrash/smetrics v0.0.0-20201216005158-039620a65673
go: downloading github.com/russross/blackfriday/v2 v2.1.0
go: downloading github.com/go-faster/city v1.0.1
go: downloading github.com/go-faster/errors v0.6.1
go: downloading github.com/klauspost/compress v1.16.5
go: downloading github.com/pierrec/lz4/v4 v4.1.17
go: downloading github.com/segmentio/asm v1.2.0
go: downloading golang.org/x/sys v0.8.0
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
go: error obtaining buildID for go tool compile: fork/exec /usr/local/go/pkg/tool/linux_arm64/compile: invalid argument
To reproduce:
$ podman run -it --rm --platform linux/arm64 golang:1.23.1
# printf "package main\n\nimport \"fmt\"\n\nfunc main() {\n\tfmt.Println(\"Hello world\");\n}\n" > test.go
# go run ./test.go
EDIT: added the missing --platform linux/arm64 to the command (lost in the copy-paste)
@prattmic, it looks like that patch has a conflict? Any idea if this will make it into Go 1.23.2?
BTW, I tried updating qemu to a newer version on the Debian 12 image but that didn't seem to work either.
It will likely be in 1.23.2.
What is the newer version of QEMU that did not work?
Change https://go.dev/cl/612218 mentions this issue: [release-branch.go1.23] os: add clone(CLONE_PIDFD) check to pidfd feature check
Go version
go version 1.23.0 linux/arm64
Output of go env in your module/workspace:
What did you do?
Our automated image build process fails to perform any step that invokes the go binary with the following error:
The Dockerfile is here, and is being built via a script that invokes docker buildx with multiple platforms, like:
It seems that there is something inherent in the qemu arm64 environment that renders go unable to fork itself to complete the telemetry setup. I'm fairly confident it's something specific to the 1.23 release as 1.22.6 builds successfully using the same setup today.
What did you see happen?
Failures to invoke any go command
What did you expect to see?
A successful install and configuration of go 1.23.0 in a multi-arch docker build.