golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: endless sigtramp in the stack of coredump generated by panic on arm64 #58998

Open ppggff opened 1 year ago

ppggff commented 1 year ago

What version of Go are you using (go version)?

$ go version
go version go1.20.2 linux/arm64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

CentOS Linux release 7.9.2009 (AltArch) (qemu on apple m2 host)

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/gpadmin/.cache/go-build"
GOENV="/home/gpadmin/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/gpadmin/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/gpadmin/go"
GOPRIVATE=""
GOPROXY="https://goproxy.io"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.20.2"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2696082919=/tmp/go-build -gno-record-gcc-switches"

What did you do?

test.go:

package main

import (
    "fmt"
    "runtime/debug"
)

func main() {
    debug.SetTraceback("crash")
    fmt.Println("xxx")
    panic("xxx")
}

Then build it, run it, and inspect the resulting core dump with dlv:

go build test.go
./test
dlv core ./test /tmp/core-test-6-1000-1000-4693-1678678851

What did you expect to see?

A normal stack trace that unwinds past runtime.sigtramp.

What did you see instead?

An endless run of runtime.sigtramp frames, for example:

dlv core ./test /tmp/core-test-6-1000-1000-4693-1678678851

 0  0x000000000006c298 in runtime.raise
    at /usr/local/go/src/runtime/sys_linux_arm64.s:158
 1  0x0000000000054094 in runtime.dieFromSignal
    at /usr/local/go/src/runtime/signal_unix.go:879
 2  0x0000000000054598 in runtime.sigfwdgo
    at /usr/local/go/src/runtime/signal_unix.go:1092
 3  0x00000000000530ac in runtime.sigtrampgo
    at /usr/local/go/src/runtime/signal_unix.go:432
 4  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
 5  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
 6  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
 7  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
 8  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
 9  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
10  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
11  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
12  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
13  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
14  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
15  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
16  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
17  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
18  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
19  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
20  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
21  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
22  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
23  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
24  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
25  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
26  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467
27  0x000000000006c644 in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_arm64.s:467

mknyszek commented 1 year ago

In triage, we think this might be a bug in Delve, but it could be something in our DWARF information.

CC @aarzilli

aarzilli commented 1 year ago

Will look into it.

aarzilli commented 1 year ago

As far as I can tell it isn't Delve: gdb can't get past runtime.sigtramp either:

(gdb) bt
#0  runtime.raise () at /home/ubuntu/goroot120/src/runtime/sys_linux_arm64.s:158
#1  0x0000000000054094 in runtime.dieFromSignal (sig=6) at /home/ubuntu/goroot120/src/runtime/signal_unix.go:879
#2  0x0000000000054598 in runtime.sigfwdgo (sig=6, info=<optimized out>, ctx=<optimized out>, ~r0=<optimized out>)
    at /home/ubuntu/goroot120/src/runtime/signal_unix.go:1092
#3  0x00000000000530ac in runtime.sigtrampgo (sig=0, info=0x11c8, ctx=0x6)
    at /home/ubuntu/goroot120/src/runtime/signal_unix.go:432
#4  0x000000000006c644 in runtime.sigtramp () at /home/ubuntu/goroot120/src/runtime/sys_linux_arm64.s:467
(gdb)

Gdb has a bunch of special code to get past sigtramp, but I think it only works with glibc's sigtramp. One way to fix this would be to cheat and add a workaround on Delve's side that loads the registers from the ctx argument of runtime.sigtrampgo. Another way would be to change the linker to emit an FDE (frame description entry) for runtime.sigtrampgo (or runtime.sigtramp) that instructs the debugger to load registers from the ctx argument. Both are annoying because they have to be done for every architecture. Doing it in Delve would be easier purely because it supports fewer architectures, but it isn't going to make gdb work.
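
As a rough illustration of the Delve-side workaround described above, here is a minimal sketch of pulling the interrupted registers out of the raw bytes that ctx points to. It is not Delve's code. The byte offsets are assumptions derived from the linux/arm64 ucontext and sigcontext layout as declared in runtime/defs_linux_arm64.go, and the names decodeUcontext and savedRegs are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Assumed byte offsets into the linux/arm64 ucontext (uc_mcontext is
// taken to start at byte 176, with fault_address first). Verify these
// against the runtime or kernel headers before relying on them.
const (
	offRegs         = 184 // uc_mcontext.regs[0], i.e. x0
	offSP           = 432 // uc_mcontext.sp
	offPC           = 440 // uc_mcontext.pc
	ucontextMinSize = 448 // bytes needed to cover everything read here
)

// savedRegs (hypothetical name) holds what an unwinder needs in order
// to continue at the frame that was interrupted by the signal.
type savedRegs struct {
	PC, SP, FP, LR uint64
}

// decodeUcontext (hypothetical name) extracts PC, SP, x29 (frame
// pointer) and x30 (link register) from raw ucontext bytes read out
// of the core file at the address held in ctx.
func decodeUcontext(mem []byte) (savedRegs, error) {
	if len(mem) < ucontextMinSize {
		return savedRegs{}, fmt.Errorf("ucontext too short: %d bytes", len(mem))
	}
	reg := func(off int) uint64 { return binary.LittleEndian.Uint64(mem[off:]) }
	return savedRegs{
		PC: reg(offPC),
		SP: reg(offSP),
		FP: reg(offRegs + 29*8),
		LR: reg(offRegs + 30*8),
	}, nil
}

func main() {
	// Synthetic bytes for illustration only; a debugger would read
	// them from the core file instead.
	mem := make([]byte, ucontextMinSize)
	binary.LittleEndian.PutUint64(mem[offPC:], 0x6c298)
	binary.LittleEndian.PutUint64(mem[offSP:], 0x4000031f00)

	regs, err := decodeUcontext(mem)
	if err != nil {
		panic(err)
	}
	fmt.Printf("pc=%#x sp=%#x fp=%#x lr=%#x\n", regs.PC, regs.SP, regs.FP, regs.LR)
}
```

With those registers in hand, a debugger could restart unwinding from the interrupted frame instead of asking the DWARF information how to step out of runtime.sigtramp. The offsets differ per architecture, which is the per-architecture cost mentioned above.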

ianlancetaylor commented 1 year ago

For what it's worth, in runtime/sys_linux_amd64.s we carefully set up the signal handler so that gdb and gcc could backtrace through it: https://go.googlesource.com/go/+/refs/heads/master/src/runtime/sys_linux_amd64.s#460.

On arm64 the equivalent instructions are

 0xd2801168         movz x8, #0x8b
 0xd4000001         svc  0x0

(from https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/config/aarch64/linux-unwind.h;h=00eba866049b5b08deb90b840c3a52f6f52968a1;hb=HEAD#l68).

Might be interesting to change sigreturn in runtime/sys_linux_arm64.s to precisely that and see whether that fixes gdb.
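
To make the two encodings quoted above easier to check, here is a small sketch that rebuilds them from the A64 instruction formats and tests whether a byte sequence starts with that pair, which is the kind of pattern match the linked libgcc unwinder performs. The helper names (encodeMOVZ64, encodeSVC, isSigreturnTrampoline) are hypothetical, and whether gdb matches exactly the same bytes is an assumption this sketch does not verify.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// nrRtSigreturn is the value loaded into x8 in the sequence quoted
// above; 0x8b (139) is the rt_sigreturn syscall number on linux/arm64.
const nrRtSigreturn = 0x8b

// encodeMOVZ64 encodes "movz x<rd>, #imm16" (64-bit form, no shift).
func encodeMOVZ64(rd, imm16 uint32) uint32 {
	return 0xd2800000 | (imm16&0xffff)<<5 | (rd & 0x1f)
}

// encodeSVC encodes "svc #imm16".
func encodeSVC(imm16 uint32) uint32 {
	return 0xd4000001 | (imm16&0xffff)<<5
}

// isSigreturnTrampoline reports whether code begins with
// "movz x8, #0x8b; svc 0x0". A64 instructions are stored as
// little-endian 32-bit words.
func isSigreturnTrampoline(code []byte) bool {
	if len(code) < 8 {
		return false
	}
	first := binary.LittleEndian.Uint32(code[0:4])
	second := binary.LittleEndian.Uint32(code[4:8])
	return first == encodeMOVZ64(8, nrRtSigreturn) && second == encodeSVC(0)
}

func main() {
	fmt.Printf("movz x8, #0x8b -> %#x\n", encodeMOVZ64(8, nrRtSigreturn))
	fmt.Printf("svc  0x0       -> %#x\n", encodeSVC(0))

	// The same two instructions as they would sit in memory.
	code := make([]byte, 8)
	binary.LittleEndian.PutUint32(code[0:], encodeMOVZ64(8, nrRtSigreturn))
	binary.LittleEndian.PutUint32(code[4:], encodeSVC(0))
	fmt.Println("matches trampoline pattern:", isSigreturnTrampoline(code))
}
```

An unwinder that relies on this check only recognizes a signal frame when the return address points at exactly these bytes, which is why changing the instruction sequence of sigreturn could matter.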