llvm / llvm-project


LLDB loses breakpoints across `run` on MacOS for externally-linked Go binaries #66013

Open prattmic opened 1 year ago

prattmic commented 1 year ago

Breakpoints set in a Go program prior to the initial run never trigger. Similarly, if the program is running, any breakpoints that were set are lost across another run (i.e., they never fire, although they are still listed in breakpoint list).

This seems to only apply to "externally-linked" Go binaries. By default, Go binaries are "internally-linked", meaning Go's linker does the full final linking of the binary. In some cases, most notably with applications using cgo (C FFI from Go), binaries are "externally-linked", meaning that we pass objects to the system linker to do the final link. It does not reproduce on Linux.

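I have tested this with:

* Go 1.21.1
* lldb -v: lldb-1400.0.38.13, Apple Swift version 5.7.1 (swiftlang-5.7.1.135.3 clang-1400.0.29.51)
* ld -v: ld64-820.1 (LTO support using LLVM version 14.0.0, clang-1400.0.29.202; TAPI support using Apple TAPI version 14.0.0, tapi-1400.0.11)
* MacOS 13.0

Apologies, I have not tested this with HEAD LLVM, as I don't have a build environment set up on this MacOS dev machine, but I can get it set up if necessary.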

Reproducer

Setup

If you don't have Go, you can fetch and extract a tar from https://go.dev/dl. The go binary below is at ./go/bin/go in the tar.

$ mkdir /tmp/example
$ cd /tmp/example
$ go mod init example
$ cat > main.go <<EOF
package main

import "fmt"

func main() {
        fmt.Println("Hello World!")
}
EOF

Working version ("internally-linked")

$ go build
$ lldb ./example
(lldb) target create "./example"
Current executable set to '/tmp/buildlet/lldb/example' (x86_64).
(lldb) breakpoint set -r main.main
Breakpoint 1: where = example`main.main, address = 0x0000000001086e00
(lldb) run
Process 9165 launched: '/tmp/buildlet/lldb/example' (x86_64)
Process 9165 stopped
* thread #1, stop reason = breakpoint 1.1
    frame #0: 0x0000000001086e00 example`main.main
example`main.main:
->  0x1086e00 <+0>: cmpq   0x10(%r14), %rsp
    0x1086e04 <+4>: jbe    0x1086e52                 ; <+82>
    0x1086e06 <+6>: pushq  %rbp
    0x1086e07 <+7>: movq   %rsp, %rbp
Target 0: (example) stopped.

(Note the use of breakpoint set -r here. A simple b main.main doesn't properly find the main.main symbol for some reason, but that is not this bug.)

Broken version ("externally-linked")

$ go build "-ldflags=-linkmode=external -v"
# example
HEADER = -H1 -T0x1001000 -R0x1000
host link: "clang" "-arch" "x86_64" "-m64" "-Wl,-headerpad,1144" "-Wl,-no_pie" "-o" "/var/folders/wh/9yc0j5w50z97w3528z_7qvr40000gn/T/go-build3740925102/b001/exe/a.out" "-Qunused-arguments" "/var/folders/wh/9yc0j5w50z97w3528z_7qvr40000gn/T/go-link-2866632490/go.o" "-lresolv" "-no-pie"
90105 symbols, 25881 reachable
        40055 package symbols, 34120 hashed symbols, 12116 non-package symbols, 3814 external symbols
92049 liveness data

Note: -v is not required, but it shows how we invoke clang for the final link.

Note: Typically a user would end up with external linking automatically by using cgo. Invoking it directly keeps this repro simpler.

$ lldb ./example
(lldb) target create "./example"
Current executable set to '/tmp/buildlet/lldb/example' (x86_64).
(lldb) breakpoint set -r main.main
Breakpoint 1: where = example`main.main, address = 0x00000001000892c0
(lldb) run
Process 9215 launched: '/tmp/buildlet/lldb/example' (x86_64)
Hello World!
Process 9215 exited with status = 0 (0x00000000) 

The breakpoint never fired. Running process launch --stop-at-start and then setting the breakpoint is not sufficient either. What does work is waiting for the dynamic linker to finish loading and then setting the breakpoint:

(lldb) b dyld`start
Breakpoint 2: where = dyld`start, address = 0x00007ff814000990
(lldb) run
Process 9225 launched: '/tmp/buildlet/lldb/example' (x86_64)
Process 9225 stopped
* thread #1, stop reason = breakpoint 2.1
    frame #0: 0x0000000100184990 dyld`start
dyld`start:
->  0x100184990 <+0>: pushq  %rbp
    0x100184991 <+1>: movq   %rsp, %rbp
    0x100184994 <+4>: pushq  %r15
    0x100184996 <+6>: pushq  %r14
Target 0: (example) stopped.
(lldb) disassemble
<snip long output>
<look for the `call *%r15` near the end of the function where `start` calls into the loaded application, and set a breakpoint there>
(lldb) b 0x10018530d
Breakpoint 3: where = dyld`start + 2429, address = 0x000000010018530d
(lldb) c
Process 9225 resuming
Process 9225 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
    frame #0: 0x00007ff81400130d dyld`start + 2429
dyld`start:
->  0x7ff81400130d <+2429>: callq  *%r15
    0x7ff814001310 <+2432>: movl   %eax, %ebx
    0x7ff814001312 <+2434>: movq   0x8(%r14), %rax
    0x7ff814001316 <+2438>: movl   0x44(%rax), %edi
Target 0: (example) stopped.
(lldb) breakpoint set -r main.main
Breakpoint 4: where = example`main.main, address = 0x00000001000892c0
(lldb) c
Process 9225 resuming
Process 9225 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
    frame #0: 0x00000001000892c0 example`main.main
example`main.main:
->  0x1000892c0 <+0>: cmpq   0x10(%r14), %rsp
    0x1000892c4 <+4>: jbe    0x100089312               ; <+82>
    0x1000892c6 <+6>: pushq  %rbp
    0x1000892c7 <+7>: movq   %rsp, %rbp
Target 0: (example) stopped.

At this point, if I execute run again, none of the breakpoints will work except for the dyld`start breakpoint, and I will have to go through the same process. Even the explicit PC breakpoint at the end of dyld`start stops working, even though the PC does not change across runs (the Go PCs don't change either).
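To check the first-run failure without typing commands by hand, the reproducer can be driven non-interactively. The following is a minimal sketch, not part of the original report: it assumes lldb is on PATH, uses lldb's --batch and -o options to run the same two commands as the transcripts above, and treats the string "stop reason = breakpoint" in the output as evidence that the breakpoint fired. The helper names lldbArgs and breakpointHit are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// lldbArgs builds a non-interactive lldb invocation that sets the same
// regex breakpoint used in this report and then launches the binary.
func lldbArgs(binary string) []string {
	return []string{
		"--batch",
		"-o", "breakpoint set -r main.main",
		"-o", "run",
		binary,
	}
}

// breakpointHit reports whether lldb's output contains a breakpoint stop.
// Matching on this string is an assumption based on the transcripts above.
func breakpointHit(output string) bool {
	return strings.Contains(output, "stop reason = breakpoint")
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: repro <path-to-binary>")
		return
	}
	// Run lldb and capture everything it prints (stdout and stderr).
	out, err := exec.Command("lldb", lldbArgs(os.Args[1])...).CombinedOutput()
	if err != nil {
		fmt.Println("lldb failed:", err)
	}
	fmt.Println("breakpoint hit:", breakpointHit(string(out)))
}
```

Running it once against the internally-linked build and once against the externally-linked build should, if the behaviour described here holds, report a hit only for the former.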

Clearly there is some difference between our internal and external linking that is confusing lldb, but I'm not sure where it might be. In fact, I'm rather surprised that it is the internally-linked binary that works and the externally-linked one that is broken. I would have expected the opposite.

Both the internally and externally-linked binaries are dynamically linked.

llvmbot commented 1 year ago

@llvm/issue-subscribers-lldb
