golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.87k stars 17.52k forks source link

runtime: frame pointer stored past the stack pointer on arm64, resulting in inaccurate backtraces in LLDB #69032

Open smoofra opened 3 weeks ago

smoofra commented 3 weeks ago

Go version

go version go1.22.6 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/larry/Library/Caches/go-build'
GOENV='/Users/larry/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/larry/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/larry/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.22.6/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.22.6/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.6'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/Users/larry/src/go/src/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/zs/c12p66m91mq_vm_88c68lwh00000gp/T/go-build3146866747=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Built a hello world program.

➜  ~ go build -o hello hello.go 

run it in a debugger.

➜  ~ lldb hello
(lldb) target create "hello"
Current executable set to '/Users/larry/hello' (arm64).

Breakpoint on a system function which is called by the go runtime before main.

(lldb) b notify_is_valid_token 
Breakpoint 1: where = libsystem_notify.dylib`notify_is_valid_token, address = 0x0000000183ce75e8
(lldb) r
Process 82185 launched: '/Users/larry/hello/hello' (arm64)
Process 82185 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001889b75e8 libsystem_notify.dylib`notify_is_valid_token
Target 0: (hello) stopped.

Take a backtrace

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001889b75e8 libsystem_notify.dylib`notify_is_valid_token
    frame #1: 0x0000000100064db4 hello`runtime.osinit_hack_trampoline.abi0 + 20
    frame #2: 0x00000001000637f8 hello`runtime.asmcgocall.abi0 + 200
    frame #3: 0x00000001000523fc hello`runtime.libcCall + 92

Take a look at osinit_hack_trampoline. It saves the frame pointer past the end of the stack.

(lldb) up
frame #1: 0x0000000100064db4 hello`runtime.osinit_hack_trampoline.abi0 + 20
(lldb) disas
hello`runtime.osinit_hack_trampoline.abi0:
    0x100064da0 <+0>:  str    x30, [sp, #-0x10]!
    0x100064da4 <+4>:  stur   x29, [sp, #-0x8]
    0x100064da8 <+8>:  sub    x29, sp, #0x8
    0x100064dac <+12>: mov    x0, xzr
    0x100064db0 <+16>: bl     0x10008ce2c              
->  0x100064db4 <+20>: bl     0x10008ce38               
    0x100064db8 <+24>: ldp    x29, x30, [sp, #-0x8]
    0x100064dbc <+28>: add    sp, sp, #0x10
    0x100064dc0 <+32>: ret    
    0x100064dc4 <+36>: udf    #0x0
    0x100064dc8 <+40>: udf    #0x0
    0x100064dcc <+44>: udf    #0x0

Take a look at the C function it calls. It's function prelude clobbers the frame pointer that go's prelude saved.

(lldb) down
frame #0: 0x00000001889b75e8 libsystem_notify.dylib`notify_is_valid_token
libsystem_notify.dylib`notify_is_valid_token:
(lldb) disas
libsystem_notify.dylib`notify_is_valid_token:
->  0x1889b75e8 <+0>:   pacibsp 
    0x1889b75ec <+4>:   sub    sp, sp, #0x70
    0x1889b75f0 <+8>:   stp    x22, x21, [sp, #0x40]
    0x1889b75f4 <+12>:  stp    x20, x19, [sp, #0x50]
    0x1889b75f8 <+16>:  stp    x29, x30, [sp, #0x60]
    0x1889b75fc <+20>:  add    x29, sp, #0x60
    0x1889b7600 <+24>:  mov    x19, x0
    0x1889b7604 <+28>:  adrp   x8, 398022
    0x1889b7608 <+32>:  ldr    x8, [x8, #0x320]
    0x1889b760c <+36>:  ldr    x8, [x8]
   ....

Continue past the prelude.

(lldb) b -a 0x1889b7600
Breakpoint 2: where = libsystem_notify.dylib`notify_is_valid_token + 24, address = 0x00000001889b7600
(lldb) c
Process 82185 resuming
Process 82185 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
    frame #0: 0x00000001889b7600 libsystem_notify.dylib`notify_is_valid_token + 24
libsystem_notify.dylib`notify_is_valid_token:
->  0x1889b7600 <+24>: mov    x19, x0
    0x1889b7604 <+28>: adrp   x8, 398022
    0x1889b7608 <+32>: ldr    x8, [x8, #0x320]
    0x1889b760c <+36>: ldr    x8, [x8]
Target 0: (hello) stopped.

Now the backtrace is no longer valid. It has lost frames because the FP got clobbered.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
  * frame #0: 0x00000001889b7600 libsystem_notify.dylib`notify_is_valid_token + 24
    frame #1: 0x0000000100064db4 hello`runtime.osinit_hack_trampoline.abi0 + 20
(lldb) 

What did you see happen?

Backtraces were not accurate because go's function prelude stored the frame pointer past the edge of the stack.

What did you expect to see?

Accurate backtraces.

gabyhelp commented 3 weeks ago

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

cherrymui commented 3 weeks ago

The ARM64 calling convention is to store the frame pointer below the link register. As the LR is stored at 0(SP), we had to store FP at -8(SP). It is unfortunate, but we cannot change the calling convention at this point.

However, we probably can change the libc function trampolines to use a different convention. They already don't follow the Go calling convention and the Go stack unwinder probably never looks at them.

smoofra commented 3 weeks ago

Yea, but couldn't the (FP, LR) pair be stored at the end (highest address) of the stack frame, like a normal C function would do? That wouldn't alter the calling convention would it?

cherrymui commented 3 weeks ago

couldn't the (FP, LR) pair be stored at the end (highest address) of the stack frame

User assembly code depend on LR saved at 0(SP). We can't change it now.

Note that for regular Go stacks, it is also unnecessary: Go functions know this and would not write to the very top word in the frame to clobber the caller's FP. The only time it matters is to call C functions.

smoofra commented 3 weeks ago

Ugh, that's unfortunate.

Well if y'all ever make an ABI1 I guess it could be fixed there :/

smoofra commented 3 weeks ago

Note that backtraces will be also be wrong when taken with sample, stackshot, ktrace, Instruments, or just about any debugging or perf tool on Mac OS.

randall77 commented 2 weeks ago

We could conceivably change ABIinternal and leave things as is for ABI0. Not sure how tricky that would be.

smoofra commented 2 weeks ago

It seems to me that the thing blocking fixing this is not the calling convention, but the expectations of assembly code.

couldn't we add a flag to the TEXT declaration similar to NOFRAME, that informs the assembler that it is allowed to place the FP,LR pair in the correct place? The resulting code would still be valid ABI0, and also valid according to platform backtracers. functions without the flag would get the old behavior, so nothing should break.

cherrymui commented 2 weeks ago

I don't think we need to change the convention for regular Go functions and Go assembly code. The system ABI requires FP be saved at a word below the LR, but it does not require it to be saved at a specific place in the stack frame, except the general assumption that the space below SP is scratch. Most tools generally don't care about the SP. Go generated code already doesn't clobber the saved frame pointer. It only matters when calling into C code, which can be handled specifically.

smoofra commented 2 weeks ago

Yes, the system ABI does not require the FP,LR be saved at a specific place, but it does require that it be saved inside the functions actual stack frame, not in the red zone. The red zone can be used for scratch, yes, but it must be assumed to be clobbered every time a function is called.

Yes, it only matters when caling into C (or other non-go) code. Another way to mitigate would be to adjust SP down by 8 before calling any C function.