golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.83k stars 17.65k forks source link

runtime/cgo: bad debug_frame entry for crosscall2 #21569

Open aarzilli opened 7 years ago

aarzilli commented 7 years ago
go version go1.9rc1 linux/amd64
go env:
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/a/n/go/"
GORACE=""
GOROOT="/usr/local/go19rc1"
GOTOOLDIR="/usr/local/go19rc1/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build839946921=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

The debug_frame entry for crosscall2 is wrong in any cgo program.

This is crosscall2:

  asm_amd64.s:12    0x457cc0        4883ec58        SUBQ $0x58, SP      
  asm_amd64.s:16    0x457cc4        48895c2418      MOVQ BX, 0x18(SP)   
  asm_amd64.s:17    0x457cc9        48896c2420      MOVQ BP, 0x20(SP)   
  asm_amd64.s:18    0x457cce        4c89642428      MOVQ R12, 0x28(SP)  
  asm_amd64.s:19    0x457cd3        4c896c2430      MOVQ R13, 0x30(SP)  
  asm_amd64.s:20    0x457cd8        4c89742438      MOVQ R14, 0x38(SP)  
  asm_amd64.s:21    0x457cdd        4c897c2440      MOVQ R15, 0x40(SP)  
  asm_amd64.s:57    0x457ce2        48893424        MOVQ SI, 0(SP)      
  asm_amd64.s:58    0x457ce6        4889542408      MOVQ DX, 0x8(SP)    
  asm_amd64.s:59    0x457ceb        48894c2410      MOVQ CX, 0x10(SP)   
  asm_amd64.s:61    0x457cf0        ffd7            CALL DI         
  asm_amd64.s:64    0x457cf2        488b5c2418      MOVQ 0x18(SP), BX   
  asm_amd64.s:65    0x457cf7        488b6c2420      MOVQ 0x20(SP), BP   
  asm_amd64.s:66    0x457cfc        4c8b642428      MOVQ 0x28(SP), R12  
  asm_amd64.s:67    0x457d01        4c8b6c2430      MOVQ 0x30(SP), R13  
  asm_amd64.s:68    0x457d06        4c8b742438      MOVQ 0x38(SP), R14  
  asm_amd64.s:69    0x457d0b        4c8b7c2440      MOVQ 0x40(SP), R15  
  asm_amd64.s:72    0x457d10        4883c458        ADDQ $0x58, SP      
  asm_amd64.s:76    0x457d14        c3          RET         

The debug_info entry:

 <1><169bb>: Abbrev Number: 2 (DW_TAG_subprogram)
    <169bc>   DW_AT_name        : crosscall2
    <169c7>   DW_AT_low_pc      : 0x457cc0
    <169cf>   DW_AT_high_pc     : 0x457d15
    <169d7>   DW_AT_frame_base  : 1 byte block: 9c      (DW_OP_call_frame_cfa)
    <169d9>   DW_AT_external    : 1
 <2><169da>: Abbrev Number: 0

the debug_frame CIE:

00000000 0000000000000010 ffffffff CIE
  Version:               3
  Augmentation:          ""
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 16

  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset_extended: r16 (rip) at cfa-8
  DW_CFA_nop

and the debug_frame entry for crosscall2:

0000b4d4 000000000000001c 00000000 FDE cie=00000000 pc=0000000000457cc0..0000000000457d15
  DW_CFA_def_cfa_offset_sf: 8
  DW_CFA_advance_loc1: 84 to 0000000000457d14
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop

the cfa offset is 8, it should be 0x58 + 0x8 after the first instruction.

heschi commented 7 years ago

Brief investigation:

.debug_frame is generated from FuncInfo.Pcsp, which is populated in linkpcln using pctospadj, which reads Spadj from Progs, which is populated by the CPU-specific assembler, mostly in preprocess. For X86, it understands explicit push/pop instructions, but not the SUBQ ..., SP that crosscall2 does. I don't know enough assembly to fill a full list of every instruction that could be used to manipulate the stack pointer, but it seems straightforward to fix this specific case for x86 at least.

aarzilli commented 7 years ago

I see that crosscall2 is defined in src/runtime/cgo/asm_*.s, why do all of them do the SUB manually instead of declaring a frame size in the TEXT header? I can see why the one for amd64 would do that (because it has different frame sizes on windows and linux) but why do all the others do it too?

ianlancetaylor commented 7 years ago

There is probably no special reason. It's probably worth trying setting up the frame in the TEXT pseudo-op.

heschi commented 7 years ago

More notes from the clueless (me):

A naive attempt at this fails because the assembler doesn't like unbalanced stack operations, and there are a few functions in the runtime that aren't balanced, like runtime.rt0_go which doesn't bother cleaning up because it never returns. These are fixable, sort of, though it's sort of weird to cater to the subset of instructions I've bothered implementing. We could also just disable the error.

Either way, it points to a bigger question: how realistic is it to expect the assembler to figure this out for an arbitrary function? For example, asmcgocall is switching to the system stack before making a cgo call. I presume the assembler can't automatically generate the right information for something that weird.

It seems that GNU as has directives for emitting call frame information. Without implementing something like that, and using it everywhere, it seems unlikely we'll ever emit perfect .debug_frame information for assembly functions.

@aarzilli: Can you explain a little more why you need this? If all you're trying to do is unwind the stack, aren't the frame pointers supposed to be enough?

aarzilli commented 7 years ago

Frame pointers would be enough if there was no FDE for crosscall2.

Without implementing something like that, and using it everywhere, it seems unlikely we'll ever emit perfect .debug_frame information for assembly functions.

True, but crosscall2 isn't weird like asmcgocall.

heschi commented 7 years ago

True, but crosscall2 isn't weird like asmcgocall.

Sure. But it may show up in a stack trace Delve cares about, e.g. a program that goes Go -> C -> Go, which isn't unheard of. So I think it pays to worry about it.

Dropping the FDEs for assembly functions feels a lot safer and more future-proof than trying to massage debugger-relevant functions to fit into the subset that the assembler understands. Let me see how that would work.

aarzilli commented 7 years ago

That seems like a bit of a overreaction. Most assembly functions are called on goroutine stacks and therefore need to have correct unwind information, otherwise the garbage collector wouldn't work. Even asmcgocall and cgocallback_gofunc are correct at their safe points as long as you limit the unwind to the goroutine stack. crosscall2 is getting away with being wrong because it lives on the cgo stack only.

gopherbot commented 7 years ago

Change https://golang.org/cl/70937 mentions this issue: cgo: fix FDE of crosscall2 on amd64 and 386

heschi commented 7 years ago

Fixed? Or do we want to try to get the other architectures done too?

ianlancetaylor commented 7 years ago

The other architectures matters too, yes.

rsc commented 6 years ago

The x86 fix does not apply to the other architectures, because the assembly in crosscall2 needs to behave like a C function, and using the standard TEXT-induced prologue makes the code behave like a Go function. Those happen to be close enough to the same on x86, but they are not on the other architectures (they differ in where the link register is saved, for example).

I am not convinced this bug is worth fixing on the other architectures. Why does crosscall2's debug info need to be correct at all?

rsc commented 6 years ago

If there is anything left for the other architectures, it will need to happen in a future cycle. (One option would be to add a NODEBUG bit that disables generation of debug info for a function.)

aarzilli commented 6 years ago

Why does crosscall2's debug info need to be correct at all?

I was trying to get delve to print stacktraces "correctly" when the stack contains cgo calls (here "correctly" means "matching the programmer's model").

Having a bad debug_frame entry for crosscall2 was a problem because I couldn't walk the crosscall2 frame and get to the runtime.asmcgocall frame.

I don't think other architectures are a priority since delve doesn't support them and other debuggers aren't in a position to take advantage of it anyway.

aarzilli commented 6 years ago

As far as I am concerned it can go under an 'unscheduled' tag instead of 1.11.