golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.46k stars 17.59k forks source link

runtime: libunwind is unable to unwind CGo to Go's stack #40044

Open steeve opened 4 years ago

steeve commented 4 years ago

What version of Go are you using (go version)?

master as of the build

$ go version
devel +dd150176c3 Fri Jul 3 03:31:29 2020 +0000

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/steeve/Library/Caches/go-build"
GOENV="/Users/steeve/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/steeve/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/steeve/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/steeve/code/github.com/znly/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/steeve/code/github.com/znly/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/steeve/code/github.com/znly/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bs/51dlb_nn5k35xq9qfsxv9wc00000gr/T/go-build842228435=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Following @cherrymui's comment on #39524, I figured I tried to check why lots of our backtraces on iOS stop at runtime.asmcgocall. Since I wanted to reproduce it on my computer and lldb manges to properly backtrace, I figured I'd give libunwind a try, since this is was iOS uses when a program crashes.

Unfortunately libunwind didn't manage to walk the stack past CGo generated _Cfunc_ functions.

Given this program:

package main

/*
#cgo CFLAGS: -O0

#include <libunwind.h>
#include <stdio.h>

void backtrace() {
    unw_cursor_t cursor;
    unw_context_t context;

    // Initialize cursor to current frame for local unwinding.
    unw_getcontext(&context);
    unw_init_local(&cursor, &context);

    // Unwind frames one by one, going up the frame stack.
    while (unw_step(&cursor) > 0) {
        unw_word_t offset, pc;
        unw_get_reg(&cursor, UNW_REG_IP, &pc);
        if (pc == 0) {
            break;
        }
        printf("0x%llx:", pc);
        char sym[256];
        if (unw_get_proc_name(&cursor, sym, sizeof(sym), &offset) == 0) {
            printf(" (%s+0x%llx)\n", sym, offset);
        } else {
            printf(" -- error: unable to obtain symbol name for this frame\n");
        }
    }
}

void two() {
    printf("two\n");
    backtrace();
}

void one() {
    printf("one\n");
    two();
}
*/
import "C"

//go:noinline
func goone() {
    C.one()
}

func main() {
    goone()
}

It prints:

one1
two2
0x40617fe: (two+0x1e)
0x406182e: (one+0x1e)
0x406168b: (_cgo_7c45d1c2feef_Cfunc_one+0x1b)

I tried doing Go(1) -> C(1) -> Go(2) -> C(2) and backtrace, and it only unwinds C(2).

Also, I tried to make set asmcgocall to have a 16 bytes stack, hoping that the generated frame pointer would help, but it didn't.

What did you expect to see?

The complete backtrace.

What did you see instead?

A backtrace for C functions only.

cherrymui commented 4 years ago

This is different from #39524 . We switch stacks at Go/C boundaries. Go code runs on goroutine stacks (typically small), whereas C code runs on system stacks (typically large). Since they are not on the same stack, I would not expect any stack unwinding tool to work. Not sure if there is anything we could do.

Maybe we could use the frame pointer to "fake" it? Not sure this is a good idea...

steeve commented 4 years ago

Indeed. Thinking about it however, it doesn't feel that hard to do. Either by locally modifying asmcgocall, or, in a more ambitious way, via go:systemstack. Do you think that could work, at least in theory?

steeve commented 4 years ago

Also, weirdly enough, lldb is able to do it, without dwarf.

steeve commented 4 years ago

I am also realizing this is very different from #39524 indeed. But in some way, since runtime.libCCall uses asmcgocall, it could also allow for unwinding of gourtines blocked in things like pthread functions.

ianlancetaylor commented 4 years ago

If you want to unwind from C++ back into Go you may want to try github.com/ianlancetaylor/cgosymbolizer. Although that will only help from the Go side, not the C++ side.

In principle we could hand write unwind information for asmcgocall. The unwind information is basically DWARF, and it should be powerful enough to express what asmcgocall does.

steeve commented 4 years ago

@ianlancetaylor thank you. The issue, on the iOS side, is that unwinding is done locally, on the device (presumably with libunwind), without DWARF. DWARF is only added later to symbolicate the crashes.

That said, it could be useful for Android (which uses breakpad with minidumps)

@cherrymui I tried that forsaken piece of code to, in order to call the backtrace method without cgo, and alas, the unwinding still stops before it somehow. This is based on the rustgo article:

TEXT ·backtracetrampoline(SB),0,$2048
    MOVQ SP, BX        // Save SP in a callee-saved registry
    ADDQ $2048, SP     // Rollback SP to reuse this function's frame
    ANDQ $~15, SP      // Align the stack to 16-bytes
    CALL backtrace(SB)
    MOVQ BX, SP        // Restore SP
    RET
cherrymui commented 4 years ago

@steeve Sorry, I'm not sure exactly what you're planning to do, and why go:systemstack is relevant here. go:systemstack only enforces the marked function must run on the system stack (i.e. cannot run on a goroutine stack). It doesn't change how stack switches work.

Also, on what architecture? You mentioned iOS (presumably ARM64), but also AMD64 in your go env.

That said, does CL https://go-review.googlesource.com/c/go/+/241080 makes any difference (on ARM64)? Thanks.

steeve commented 4 years ago

@cherrymui Thank you for the CL, I wasn't hoping as much. Will definitely try and let you know.

My ultimate target is indeed iOS (and Android, to an extent). Before trying to fix it on iOS though, I figure it'd be easier to reproduce on my computer (darwin/amd64), and since iOS uses libunwind, try to investigate it myself. My other experiment in which I tried to call the method directly in the Go stack, is to try and narrow down if the frame pointer was trashed because of the stack switch itself.

gopherbot commented 4 years ago

Change https://golang.org/cl/241158 mentions this issue: runtime: adjust frame pointer on stack copy on ARM64

ianlancetaylor commented 4 years ago

@steeve libunwind unwinds the stack using the unwind information, which is not DWARF but is approximately the same format as a subset of DWARF. That's what I was referring to when I suggested that we could write unwind information for asmcgocall. (You can see the horrible details at https://www.airs.com/blog/archives/460).

steeve commented 4 years ago

I just tried your CL @cherrymui on a real device, and unfortunately, when I pause inside XCode's, I only see the stack up to asmcgocall.

In my case I did put a time.Sleep() and the backtrace in XCode's lldb was:

    frame #0: 0x00000001889f2bc0 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x00000001889151e4 libsystem_pthread.dylib`_pthread_cond_wait + 680
    frame #2: 0x000000010785d5e8 Zenly`runtime.pthread_cond_wait_trampoline + 24
    frame #3: 0x000000010785c4fc Zenly`runtime.asmcgocall + 204
    frame #4: 0x000000010785c4fc Zenly`runtime.asmcgocall + 204
  * frame #5: 0x000000010785c4fc Zenly`runtime.asmcgocall + 204

Note that on amd64, lldb manages to properly unwind.

qmuntal commented 1 year ago

The lack of frame pointer in asmcgocall is also breaking stack unwinding on WinDbg (using the prototype from #57302).

I've tried with gdb, and it's also broken:

Thread 1 hit Breakpoint 1, runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:835
835             MOVQ    (g_sched+gobuf_sp)(SI), SP
(gdb) backtrace
#0  runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:835
#1  0x0000000000402f49 in runtime.cgocall (fn=0x45a320 <runtime.asmstdcall>, arg=0x4daae0, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/cgocall.go:167
#2  0x0000000000455314 in syscall.loadsystemlibrary (filename=0xc00010c000, absoluteFilepath=<optimized out>, handle=<optimized out>, err=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/syscall_windows.go:443
#3  0x000000000045b39c in syscall.loadsystemlibrary (filename=0x45a320 <runtime.asmstdcall>, absoluteFilepath=0x4daae0, handle=<optimized out>, err=<optimized out>) at <autogenerated>:1
... omitted
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) si
runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:840
840             SUBQ    $64, SP
(gdb) backtrace
#0  runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:840
#1  0x00000000007afee8 in ?? ()

Also, I tried to make set asmcgocall to have a 16 bytes stack, hoping that the generated frame pointer would help, but it didn't.

@steeve curiously, WinDbg and gdb can unwind the stack if asmcgocall stack is incremented to 16 bytes, which makes the assembler to introduce a frame pointer:

#0  runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:811
#1  0x000000000042d3a5 in runtime.stdcall (fn=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1074
#2  0x000000000042d4bc in runtime.stdcall1 (fn=0x7ffca80d95d0 <LoadLibraryA>, a0=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1095
#3  0x000000000042ab85 in runtime.loadOptionalSyscalls () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:249
#4  0x000000000042b9b5 in runtime.osinit () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:551
#5  0x0000000000456637 in runtime.rt0_go () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:348
#6  0x00000000a80d26bd in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) si
812             MOVQ    arg+8(FP), BX
(gdb) backtrace
#0  runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:812
#1  0x000000000042d3a5 in runtime.stdcall (fn=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1074
#2  0x000000000042d4bc in runtime.stdcall1 (fn=0x7ffca80d95d0 <LoadLibraryA>, a0=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1095
#3  0x000000000042ab85 in runtime.loadOptionalSyscalls () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:249
#4  0x000000000042b9b5 in runtime.osinit () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:551
#5  0x0000000000456637 in runtime.rt0_go () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:348
#6  0x00000000a80d26bd in ?? ()
qmuntal commented 1 year ago

@ianlancetaylor @cherrymui asmcgocall does not have a frame pointer even it being a non-leaf function due to condition 2 in this code:

https://github.com/golang/go/blob/1ba7341cb21d9edb2a04eb0b24b3af71899b35fc/src/cmd/internal/obj/x86/obj6.go#L609-L626

asmcgocall seems to be doing well with a frame pointer, so it does not fit in the scary-runtime-internals category (at least on Windows, have to try other OSes). IMO we should get rid of that heuristic and just use NOFRAME if a runtime function would be messed up be frame pointer, just as it is happening on arm and arm64.

gopherbot commented 1 year ago

Change https://go.dev/cl/459395 mentions this issue: runtime: use explicit NOFRAME on windows/amd64

cherrymui commented 1 year ago

IMO we should get rid of that heuristic and just use NOFRAME if a runtime function would be messed up be frame pointer

I think that is a good direction. Thanks for looking into it. Are we sure what matters are all assembly functions? We don't have explicit NOFRAME control for compiled functions.

I'm still not sure about stack transition in asmcgocall being "broken". Technically, it is running on two stacks -- the C functions run on a different stack. So if we are unwinding the physical stack, you shouldn't see both Go and C functions. You could argue we're expecting to unwind the logical stack. That is a reasonable argument. But I don't think a decision has been made for whether the unwinding should be the physical stack or the logical stack. Further, at least some debugger is not happy if the stack pointer suddenly changes direction. I don't think it is a good idea if the unwinding only works when the C stack is at a higher address than the Go stack, given that we don't generally control where the stacks are in the address space.

For a (not quite accurate) analogy, what does the debugger do for C longjmp or setcontext or other stack transitions? Does it unwind through it? What machinery does it use?

Thanks.

ianlancetaylor commented 1 year ago

The C longjmp and setcontext functions change the stack entirely and irrevocably (modulo another call to longjmp or setcontext, of course). They are unlike asmcgocall, which switches to a temporary stack for the duration of a function call and then returns to the original stack. So I don't think the same issues arise. For the C functions you get either the original call stack or the new call stack.

Note that we do support unwinding the cgo stack to the Go stack via the runtime.SetCgoTraceback function. However, that only supports tracebacks that use Go functions (runtime.Callers, runtime.Stack). It does not help with libunwind.

What would work for libunwind is for us to write unwind information for asmcgocall that tells libunwind how to load the stack and PC of the calling frame. This is feasible, as the unwind information can use arbitrary expressions, but complicated.

cherrymui commented 1 year ago

I agree that longjmp is not really a good analogy because it could be an actual context switch (although it could be used to implement a temporary stack transition like asmcgocall, but the tools never know). I don't think there is anything in C that does a temporary stack switch?

I think the Go traceback API is mostly showing only the "user frames", e.g. we hide compiler-generated wrappers. So hiding the asmcgocall stack transition when SetCgoTraceback is set seems also reasonable. But I'm not sure we should blindly hide stack transitions for other low-level unwinders.

qmuntal commented 1 year ago

But I'm not sure we should blindly hide stack transitions for other low-level unwinders.

I don't expect Go to hide wrappers and stack transitions to external unwinders, if it is possible at all.

I do expect Go to facilitate unwinding stack transitions, even from C to Go, and vice versa. This is certainly doable using Windows' SEH if a frame pointer is set in the function prologue and the linker emits the appropriate metadata.

qmuntal commented 1 year ago

@donatocayurin please use reactions to show support for an issue instead of posting a new comment with just an emoji. This way we keep the conversation history cleaner.