golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.41k stars 17.71k forks source link

runtime/trace: frame pointer unwinding during C-to-Go calls can crash on amd64 #59830

Closed nsrip-dd closed 1 year ago

nsrip-dd commented 1 year ago

While attempting to enable frame pointer unwinding for execution tracing on arm64, I ran the TestTraceUnwindCGO test. This test crashes on arm64 during frame pointer unwinding. The test, unaltered, does not crash on amd64. However, I think amd64 is susceptible to the same issue, namely that there's a trace event during a C-to-Go call which currently gets a call stack via frame pointer unwinding when it probably shouldn't.

We have the following logic in place to determine whether we're in a C-to-Go call:

https://github.com/golang/go/blob/715d53c090ea02dbd73c301684ecbd09b476989e/src/runtime/proc.go#L860-L862

If so, we fall back to regular unwinding, to avoid possibly following bad frame pointers coming from the C code. However, there is a point during the C-to-Go call where the "incgocallback" check is false, because incgo == true, yet we will still emit a trace event, during reentersyscall:

https://github.com/golang/go/blob/715d53c090ea02dbd73c301684ecbd09b476989e/src/runtime/cgocall.go#L241-L250

I can make TestTraceUnwindCGO crash on amd64 by deliberately putting junk in RBP and confirming that the tracer attempts to unwind through the junk address:

diff --git a/src/runtime/testdata/testprogcgo/trace_unix.c b/src/runtime/testdata/testprogcgo/trace_unix.c
index 0fa55c7215..220fe2d4e3 100644
--- a/src/runtime/testdata/testprogcgo/trace_unix.c
+++ b/src/runtime/testdata/testprogcgo/trace_unix.c
@@ -19,6 +19,7 @@ static void* cCalledFromCThread(void *p) {
 }

 void cCalledFromGo(void) {
+       asm ("mov $0xdeadbeef, %rbp\n");
        goCalledFromC();

        pthread_t thread;

Then go build ./testdata/testprogcgo, then gdb ./testprogcgo:

(gdb) r Trace
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/ec2-user/repos/go/src/runtime/testprogcgo Trace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffd03e2700 (LWP 18813)]
[New Thread 0x7fffcf9e1700 (LWP 18814)]
[New Thread 0x7fffcefe0700 (LWP 18815)]
[New Thread 0x7fffce5df700 (LWP 18816)]
[New Thread 0x7fffcdb9e700 (LWP 18817)]
[New Thread 0x7fffcd19d700 (LWP 18818)]

Thread 1 "testprogcgo" received signal SIGSEGV, Segmentation fault.
runtime.traceStackID (mp=<optimized out>, pcBuf=[]uintptr = {...}, skip=<optimized out>, ~r0=<optimized out>) at /home/ec2-user/repos/go/src/runtime/trace.go:916
916                             nstk += 1 + fpTracebackPCs(unsafe.Pointer(curgp.sched.bp), skip, pcBuf[2:])
(gdb) bt
#0  runtime.traceStackID (mp=<optimized out>, pcBuf=[]uintptr = {...}, skip=<optimized out>, ~r0=<optimized out>) at /home/ec2-user/repos/go/src/runtime/trace.go:916
#1  0x000000000045a865 in runtime.traceEventLocked (extraBytes=<optimized out>, mp=0x85f0e0 <runtime.m0>, pid=<optimized out>, bufp=0xc00002bbd0, ev=28 '\034', stackID=0, skip=1, 
    args=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at /home/ec2-user/repos/go/src/runtime/trace.go:759
#2  0x000000000045a569 in runtime.traceEvent (ev=28 '\034', skip=1, args=[]uint64) at /home/ec2-user/repos/go/src/runtime/trace.go:691
#3  0x000000000045d225 in runtime.traceGoSysCall () at /home/ec2-user/repos/go/src/runtime/trace.go:1573
#4  0x000000000046a42a in runtime.systemstack () at /home/ec2-user/repos/go/src/runtime/asm_amd64.s:509
#5  0x00007fffffffe268 in ?? ()
#6  0x000000000085f0e0 in ?? ()
#7  0x00007fffffffe250 in ?? ()
#8  0x00000000004ae39d in crosscall2 () at /home/ec2-user/repos/go/src/runtime/cgo/asm_amd64.s:30
#9  0x000000000051780e in goCalledFromC () at _cgo_export.c:212
(gdb) disas
... elided ...
=> 0x000000000045b297 <+535>:   mov    0x8(%rsi),%r11
   0x000000000045b29b <+539>:   mov    %r11,(%r9,%rax,8)
   0x000000000045b29f <+543>:   mov    (%rsi),%rsi
   0x000000000045b2a2 <+546>:   inc    %rax
   0x000000000045b2a5 <+549>:   cmp    %r10,%rax
(gdb) p/x $rsi
$2 = 0xdeadbeef

I think we need something like an incgocallback boolean that's set to be true for the entirety of cgocallbackg, or at least something that covers the reentersyscall call at the end of cgocallbackg.

cc @felixge @mknyszek @prattmic

gopherbot commented 1 year ago

Change https://go.dev/cl/488755 mentions this issue: runtime/trace: avoid frame pointer unwinding for events during cgocallbackg