llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.43k stars 11.74k forks source link

__builtin_eh_return is not fully implemented for PPC nor PPC64: no DW_CFA_offset output for scratch registers( r3-r6) [still true of 8.0.0] #27218

Closed markmi closed 2 years ago

markmi commented 8 years ago
Bugzilla Link 26844
Resolution INVALID
Resolved on Jan 25, 2020 18:34
Version 8.0
OS FreeBSD
Blocks llvm/llvm-project#26154
CC @chmeeedalf,@DougGregor,@emaste,@hfinkel,@nemanjai

Extended Description

[This is for FreeBSD's projects/clang380-import-r296011 at the time of submittal.]

In the normal C++ ABI exception handling uses 4 scratch registers: "These scratch registers are reserved for passing arguments between the personality routine and the landing pads". This is for "an unwind context representing a handler frame, for which the personality routine will return _URC_INSTALL_CONTEXT" (_Unwind_SetGR).

Prior to executing in the landing pad the context record is used to restore various registers "to their state in the frame before the call that threw that exception": callee-saved (not altered by the personality routine) and scratch.

For TARGET_ARCH=powerpc (or powerpc64) FreeBSD's system libgcc_s _Unwind_RaiseException gets differing behavior for this between gcc/g++ and clang/clang++ for the scratch registers (and more):

A) gcc/g++ uses the 4 registers r3, r4, r5, r6 as the scratch registers and sets up to save/restore them, doing so being reported on by the .eh_frame information for _Unwind_RaiseException. It also sets up to save/restore R14-r31 and floating point registers as r46-r63. It also sets up to save/restore "r70" (holding the value from mfcr). (The powerpc ABI uses a bit in the cr for floating point usage information in the call standard.)

B) clang/clang++ does not set up to save/restore any scratch registers or a value from mfcr. It does set up to save/restore r14-r31 and floating point registers as r46-r63.

The result is that if things get to the point of the following FreeBSD code as part of handling the exception then the first _Unwind_SetGR gets a SEGV. __builtin_eh_return_data_regno (0) returns 3 to identify r3 as the register context.

678 / For targets with pointers smaller than the word size, we must extend the 679 pointer, and this extension is target dependent. / 680 _Unwind_SetGR (context, builtin_eh_return_data_regno (0), 681 builtin_extend_pointer (ue_header)); 682 _Unwind_SetGR (context, __builtin_eh_return_data_regno (1), 683 handler_switch_value); 684 _Unwind_SetIP (context, landing_pad);

As evidence of the difference for scratch register handling (and "r70"/cr handling). . .

gcc/g++ dwarfdump -v -v -F output extraction for _Unwind_RaiseException:

fde section offset 1104 0x00000450 cie offset for fde: 1108 0x00000454         0 DW_CFA_advance_loc 8  (8 1)         1 DW_CFA_def_cfa_offset 3024         4 DW_CFA_advance_loc1 156         6 DW_CFA_offset r4 -232  (58 -4)         8 DW_CFA_offset r3 -236  (59 -4)        10 DW_CFA_offset r28 -160  (40 -4)        12 DW_CFA_offset r27 -164  (41 -4)        14 DW_CFA_offset r26 -168  (42 -4)        16 DW_CFA_offset r25 -172  (43 -4)        18 DW_CFA_offset r24 -176  (44 -4)        20 DW_CFA_offset r23 -180  (45 -4)        22 DW_CFA_offset r22 -184  (46 -4)        24 DW_CFA_offset r21 -188  (47 -4)        26 DW_CFA_offset r20 -192  (48 -4)        28 DW_CFA_offset r19 -196  (49 -4)        30 DW_CFA_offset r18 -200  (50 -4)        32 DW_CFA_offset r17 -204  (51 -4)        34 DW_CFA_offset r16 -208  (52 -4)        36 DW_CFA_offset r15 -212  (53 -4)        38 DW_CFA_offset r14 -216  (54 -4)        40 DW_CFA_offset r63 -8  (2 -4)        42 DW_CFA_offset r62 -16  (4 -4)        44 DW_CFA_offset r61 -24  (6 -4)        46 DW_CFA_offset r60 -32  (8 -4)        48 DW_CFA_offset r59 -40  (10 -4)        50 DW_CFA_offset r58 -48  (12 -4)        52 DW_CFA_offset r57 -56  (14 -4)        54 DW_CFA_offset r56 -64  (16 -4)        56 DW_CFA_offset r55 -72  (18 -4)        58 DW_CFA_offset r54 -80  (20 -4)        60 DW_CFA_offset r53 -88  (22 -4)        62 DW_CFA_offset r52 -96  (24 -4)        64 DW_CFA_offset r51 -104  (26 -4)        66 DW_CFA_offset r50 -112  (28 -4)        68 DW_CFA_offset r49 -120  (30 -4)        70 DW_CFA_offset r48 -128  (32 -4)        72 DW_CFA_offset r47 -136  (34 -4)        74 DW_CFA_offset r46 -144  (36 -4)        76 DW_CFA_register r70 = r12        79 DW_CFA_offset_extended_sf r65 4  (-1 -4)        82 DW_CFA_advance_loc 32  (32 1)        83 DW_CFA_offset r5 -228  (57 -4)        85 DW_CFA_offset r31 -148  (37 -4)        87 DW_CFA_offset r30 -152  (38 -4)        89 DW_CFA_offset r29 -156  (39 -4)        91 DW_CFA_offset_extended r70 -220  (55 -4)        94 DW_CFA_offset r6 -224  (56 -4)        96 DW_CFA_nop        97 DW_CFA_nop        98 DW_CFA_nop

clang/clang++ 3.8.0 dwarfdump -v -v -F output extraction for _Unwind_RaiseException:

fde section offset 692 0x000002b4 cie offset for fde: 696 0x000002b8         0 DW_CFA_advance_loc 20  (5 4)         1 DW_CFA_def_cfa_offset 2992         4 DW_CFA_offset r31 -148  (37 -4)         6 DW_CFA_offset r30 -152  (38 -4)         8 DW_CFA_offset_extended_sf r65 4  (-1 -4)        11 DW_CFA_advance_loc 4  (1 4)        12 DW_CFA_def_cfa_register r31        14 DW_CFA_offset r14 -216  (54 -4)        16 DW_CFA_offset r15 -212  (53 -4)        18 DW_CFA_offset r16 -208  (52 -4)        20 DW_CFA_offset r17 -204  (51 -4)        22 DW_CFA_offset r18 -200  (50 -4)        24 DW_CFA_offset r19 -196  (49 -4)        26 DW_CFA_offset r20 -192  (48 -4)        28 DW_CFA_offset r21 -188  (47 -4)        30 DW_CFA_offset r22 -184  (46 -4)        32 DW_CFA_offset r23 -180  (45 -4)        34 DW_CFA_offset r24 -176  (44 -4)        36 DW_CFA_offset r25 -172  (43 -4)        38 DW_CFA_offset r26 -168  (42 -4)        40 DW_CFA_offset r27 -164  (41 -4)        42 DW_CFA_offset r28 -160  (40 -4)        44 DW_CFA_offset r29 -156  (39 -4)        46 DW_CFA_offset r30 -152  (38 -4)        48 DW_CFA_offset r31 -148  (37 -4)        50 DW_CFA_offset r46 -144  (36 -4)        52 DW_CFA_offset r47 -136  (34 -4)        54 DW_CFA_offset r48 -128  (32 -4)        56 DW_CFA_offset r49 -120  (30 -4)        58 DW_CFA_offset r50 -112  (28 -4)        60 DW_CFA_offset r51 -104  (26 -4)        62 DW_CFA_offset r52 -96  (24 -4)        64 DW_CFA_offset r53 -88  (22 -4)        66 DW_CFA_offset r54 -80  (20 -4)        68 DW_CFA_offset r55 -72  (18 -4)        70 DW_CFA_offset r56 -64  (16 -4)        72 DW_CFA_offset r57 -56  (14 -4)        74 DW_CFA_offset r58 -48  (12 -4)        76 DW_CFA_offset r59 -40  (10 -4)        78 DW_CFA_offset r60 -32  (8 -4)        80 DW_CFA_offset r61 -24  (6 -4)        82 DW_CFA_offset r62 -16  (4 -4)        84 DW_CFA_offset r63 -8  (2 * -4)        86 DW_CFA_nop

markmi commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#41050

markmi commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#41050

markmi commented 4 years ago

FreeBSD head (13) is now llvm/clang 9.0.1 based and looking at what it produces for powerpc64 shows that:

g++9 has r3-r6 scratch register information in the with_unwind_init_save_restore and with_unwind_init_and_eh_return_save_restore example code, even when based on FreeBSD headers and libraries:

g++9 -std=c++17 -std=c++17 -Wno-psabi -nostdinc -nostdinc++ -I/usr/include/c++/v1 -I/usr/include -nodefaultlibs -lc++ -lcxxrt -lthr -lm -lc -lgcc_s -o normal_and_special_save_restore.gcc normal_and_special_save_restore.cpp

leads to, for example, normal_and_special_save_restore.gcc having:

< 5><0x10000c6c:0x10000d04><><cie offset 0x00000114::cie index 0><fde offset 0x00000110 length: 0x00000030> <eh aug data len 0x0> 0x10000c6c: 0x10000c70: 0x10000c84: 0x10000ca0: <off r7 2=-48(cfa) > 0x10000cc0: . . .

C++ (clang++) does not have any r3, r4, r5, or r6 (scratch register) information in its dwarf information for the example.

However, with the use of llvm's libunwind instead of the old library code in head for powerpc64, the access(es) that expected to find scratch register descriptions do not happen and there is no SIGSEGV.

So, if there is some sort of problem here now, such as a clang vs. gcc ABI mismatch that prevents using one of the compilers, it is a different problem than was originally being reported as far as FeeBSD is concerned. (The original technical detail is still true but the library context changed.)

So a form of "overcome by events". As far as I can tell INVALID is the best-fit for the status, so that is what I'm selecting.

markmi commented 5 years ago

To avoid confusion: my ".rc3" typos were doubly wrong:

1) the notation really has no "c" 2) the .r3 notation is from the devel/llvm80 port

but I was speaking of the system compiler (installed as cc and c++). For -r344896 it reports:

cc -v

FreeBSD clang version 8.0.0 (branches/release_80 355313) (based on LLVM 8.0.0) Target: x86_64-unknown-freebsd13.0 Thread model: posix InstalledDir: /usr/bin

Notes:

Absent the scratch register DW_CFA_offset information, I still do not build FreeBSD itself via clang. This is because I want thrown exceptions to work.

I do build ports and such under FreeBSD via system-clang when the port defaults to the system compiler. For some ports I enable using the system compiler, basically other compilers.

FreeBSD for powerpc64 is still officially not clang based. My context is experimental.

markmi commented 5 years ago

I tried FreeBSD's clang from head -r344896 (so: 8.0.0.rc3) and there still is nothing analogous to:

    83 DW_CFA_offset r3 -320  (40 * -8)
    85 DW_CFA_offset r4 -312  (39 * -8)
    87 DW_CFA_offset r5 -304  (38 * -8)
    89 DW_CFA_offset r6 -296  (37 * -8)

for the scratch registers Ir3-r6) in routines that use __builtin_eh_return .

As such, this leads to segmentation faults when the CFA material is interpreted and tries to put the scratch registers to use.

(I had dwarfdump report for the output of a different compiler to make the list above.)

llvmbot commented 6 years ago

Summary - __builtin_eh_return is not implemented at all in LLVM for neither PPC nor PPC64.

I have a WIP patch at http://vlakno.cz/~rdivacky/ppc64.eh_return.patch if anyone wants to take a look and finish it.

markmi commented 7 years ago

The following short, simple program is intended to be inspected with dwarfdump and objddump and the like, not to be run. It allows basic inspecting for compatibility between the likes of devel/powerpc64-gcc (/usr/local/bin/powerpc64-unknown-freebsd12.0-g++) based compiles and clang++ based compiles for DWCFA<?> material for exception handling.

-O2 notes are about powerpc64-gcc's handling.

Being self contained does limit what it can show for r2 handling in the DWCFA<?> sequences. It does show r3-r6 scratch register handling. More than this test case is needed overall.

more normal_and_special_save_restore.cpp

void normal_save_restore() { } void with_unwind_init_save_restore() { __builtin_unwind_init(); } void with_eh_return_save_restore() { long offset = 0; void *handler = 0;

__builtin_eh_return(offset,handler); } void with_unwind_init_and_eh_return_save_restore() { long offset = 0; void *handler = 0;

builtin_unwind_init(); builtin_eh_return(offset,handler); }

void use_normal_save_restore() { normal_save_restore(); // -O2 inlined the behavior } void use_with_unwind_init_save_restore() { with_unwind_init_save_restore(); // -O2 inlined the behavior } void use_with_eh_return_save_restore() { with_eh_return_save_restore(); } void use_with_unwind_init_and_eh_return_save_restore() { with_unwind_init_and_eh_return_save_restore(); }

volatile int value;

int main(void) { if (0==value) use_normal_save_restore(); // -O2 inlined the behavior if (1==value) use_with_unwind_init_save_restore(); // -O2 inlined the behavior if (2==value) use_with_eh_return_save_restore(); // -O2 actually calls with_eh_return_save_restore() if (3==value) use_with_unwind_init_and_eh_return_save_restore(); // -O2 actually calls with_unwind_init_and_eh_return_save_restore() return 0; }

/usr/local/bin/powerpc64-unknown-freebsd12.0-g++ -std=c++11 -g -O2 -c -o normal_and_special_save_restore.gcc6 normal_and_special_save_restore.cpp

Note: dwarfdump -v -v -F normal_and_special_save_restore.gcc6 shows some storage for which no explicit store or access was generated (or needed). This was for the scratch registers.

For example: with_eh_return_save_restore lists:

    0 DW_CFA_advance_loc 16  (4 * 4)
    1 DW_CFA_offset r3 -32  (4 * -8)
    3 DW_CFA_offset r4 -24  (3 * -8)
    5 DW_CFA_offset r5 -16  (2 * -8)
    7 DW_CFA_offset r6 -8  (1 * -8)
    9 DW_CFA_advance_loc 8  (2 * 4)
   10 DW_CFA_offset_extended_sf r2 40  (-5 * -8)
   13 DW_CFA_nop
   14 DW_CFA_nop

but there are no explicit saves or restores of r3-r6 in the code but space is reserved on the stack for them.

Having arguments to the routines would cause the save/restores to be in the code.

clang++ -std=c++11 -g -O2 -o normal_and_special_save_restore.clang normal_and_special_save_restore.cpp

dwarfdump -v -v -F normal_and_special_save_restore.clang does not work: empty output. Trying just:

dwarfdump normal_and_special_save_restore.clang

shows .debug_frame information and more but not .eh_frame output: it appears that clang omitted it. This was also true with -B /usr/local/powerpc64-freebsd/bin/ being in use.

markmi commented 7 years ago

An update based on looking at what FreeBSD's clang 5 is doing and not doing compared to FreeBSD's devel/powerpc64-gcc port.

It is still broken and thrown C++ exceptions attempt to dereference a NULL pointer and fail.

Overall the following pair are supposed to control the behavior for the save/restores relative to special C++ exception handling routines in FreeBSD library code (such as its libgcc_s.so.1 ):

builtin_unwind_init() builtin_eh_return(offset, handler)

The presence of the 2nd one is supposed to include causing the scratch register handling for the ABI involved as I understand. This is not happening. That is what leads to the lack of information that causes the NULL that ends up being used as I understand.

For powerpc64 (and powerpc) the scratch registers are r2-r6 as visible in what powerpc64-gcc generates. The first attempted _Unwind_SetGR during a throw is for r3 and it fails.

I found the following:

/usr/src/contrib/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp: // Functions which call __builtin_unwind_init get all their registers saved.

More completely:

grep -ir builtin_unwind_init /usr/src/contrib/* | more /usr/src/contrib/gcc/config/ia64/unwind-ia64.c: builtin_unwind_init(); \

/usr/src/contrib/gcc/unwind-dw2.c: builtin_unwind_init (); \ /usr/src/contrib/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp: // Functions which call builtin_unwind_init get all their registers saved. /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp: case Builtin::BIbuiltin_unwind_init: { /usr/src/contrib/llvm/tools/clang/include/clang/Basic/Builtins.def:BUILTIN(builtin_unwind_init, "v", "") /usr/src/contrib/llvm/include/llvm/IR/Intrinsics.td:// builtin_unwind_init is an undocumented GCC intrinsic that causes all /usr/src/contrib/llvm/include/llvm/IR/Intrinsics.td: GCCBuiltin<"builtin_unwind_init">;

and quoting more of that last file:

// builtin_unwind_init is an undocumented GCC intrinsic that causes all // callee-saved registers to be saved and restored (regardless of whether they // are used) in the calling function. It is used by libgcc_eh. def int_eh_unwind_init: Intrinsic<[]>, GCCBuiltin<"builtin_unwind_init">;

That comment is incomplete by not mentioning the scratch registers defined by the exception handling ABI being saved and restored in at least some contexts [r2-r6 for powerpc64 and powerpc] --matching the odd behavior.

void TargetFrameLowering::determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS) const { . . . // Functions which call __builtin_unwind_init get all their registers saved. bool CallsUnwindInit = MF.callsUnwindInit(); const MachineRegisterInfo &MRI = MF.getRegInfo(); for (unsigned i = 0; CSRegs[i]; ++i) { unsigned Reg = CSRegs[i]; if (CallsUnwindInit || MRI.isPhysRegModified(Reg)) SavedRegs.set(Reg); } }

There is no powerpc64 or powerpc equivalent to the gcc code that I found for that I found for __builtin_eh_return. But there is mips code that does something special (not necessarily to what powerpc64 or powerpc should do):

/usr/src/contrib/llvm/lib/Target/Mips/MipsISelLowering.cpp

with:

// An EH_RETURN is the result of lowering llvm.eh.return which in turn is // generated from __builtin_eh_return (offset, handler) // The effect of this is to adjust the stack pointer by "offset" // and then branch to "handler". SDValue MipsTargetLowering::lowerEH_RETURN(SDValue Op, SelectionDAG &DAG) const { MachineFunction &MF = DAG.getMachineFunction(); MipsFunctionInfo *MipsFI = MF.getInfo();

MipsFI->setCallsEhReturn(); SDValue Chain = Op.getOperand(0); SDValue Offset = Op.getOperand(1); SDValue Handler = Op.getOperand(2); SDLoc DL(Op); EVT Ty = ABI.IsN64() ? MVT::i64 : MVT::i32;

// Store stack offset in V1, store jump target in V0. Glue CopyToReg and // EH_RETURN nodes, so that instructions are emitted back-to-back. unsigned OffsetReg = ABI.IsN64() ? Mips::V1_64 : Mips::V1; unsigned AddrReg = ABI.IsN64() ? Mips::V0_64 : Mips::V0; Chain = DAG.getCopyToReg(Chain, DL, OffsetReg, Offset, SDValue()); Chain = DAG.getCopyToReg(Chain, DL, AddrReg, Handler, Chain.getValue(1)); return DAG.getNode(MipsISD::EH_RETURN, DL, MVT::Other, Chain, DAG.getRegister(OffsetReg, Ty), DAG.getRegister(AddrReg, getPointerTy(MF.getDataLayout())), Chain.getValue(1)); }

Hexagon also has code, as does X86.

But nothing special for powerpc64 or powerpc. As I understand what I've seen there should be code tied to __builtin_eh_return use that causes r2-r6 save/restore as well.

markmi commented 7 years ago

[Context: FreeBSD head -r317820 system clang with a couple of patches I've been given to test for folks.]

Admittedly the following is based in part on a couple of patches that I'd been given to test for folks but I figure the following change of status details is likely important relative to various specifics that I wrote originally:

For:

include

int main(void) { try { throw std::exception(); } catch (std::exception& e) {} return 0; }

I now get a different backtrace when it fails. . .

Program received signal SIGSEGV, Segmentation fault. 0x41b355d0 in _Unwind_SetGR (context=, index=, val=1105281072) at unwind-dw2-fde.h:162 162 { Current language: auto; currently minimal (gdb) bt

​0 0x41b355d0 in _Unwind_SetGR (context=, index=, val=1105281072) at unwind-dw2-fde.h:162

​1 0x4192e370 in __gxx_personality_v0 (version=, actions=, exceptionObject=0x41e14030, context=0xffffd590) at /usr/src/contrib/libcxxrt/exception.cc:1203

​2 0x41b36234 in _Unwind_RaiseException_Phase2 (exc=, context=) at unwind.inc:66

​3 0x41b35e10 in _Unwind_RaiseException (exc=0xffffd590) at unwind.inc:135

​4 0x4192d870 in __cxa_throw (thrown_exception=, tinfo=, dest=) at /usr/src/contrib/libcxxrt/exception.cc:774

​5 0x01800954 in main () at exception_test.cpp:5

markmi commented 7 years ago

FreeBSD status update. . .

FYI: FreeBSD has merged clang/llvm 4.0 into stable/11 (-r316423) in addition to head (12) some time ago. release/11.1.0 will be clang/llvm 4.0 based.

TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc still does not handle thrown exceptions in this clang version.

TARGET_ARCH=powerpc still has the R30 vs. tack handling vs. floating point code problem: restored R30 and then generates floating point cleanup code that expects R30 to not have changed yet.

markmi commented 7 years ago

FreeBSD status update. . .

stable/11 from FreeBSD has for a time had as its system compiler its variant of 3.9.1 . So the FreeBSD 11.1 release will have the powerpc64 and powerpc fixes that have been applied plus any others that they merge in before then. And so on for 11.x (1<x).

(Unfortunately powerpc still has stack handling problems relative to r30 usage [26519] and floating point code and C++ exception handling does not work for powerpc or powerpc64.)

I'd guess that head (12) and stable/11 would continue to eventually pick up powerpc and powerpc64 fixes that fit in clang 3.9.1 easily.

I'll also note that projects/clang400-import has started so at least the eventual 12.x releases will likely be based on clang 4.0.x . I've no clue if stable/11 might pick up such at some point.

markmi commented 7 years ago

A note about the context for this report:

Summary: TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc are still based on the gcc unwind library in FreeBSD 12 (so far) and in 11. My reports are about that context in the details presented. Avoid comparing dwarfdump output from llvm's libunwind unless you know what differences to expect.

FreeBSD 11 and 12 have a control for picking which unwind library is built in buildworld:

WITHOUT_LLVM_LIBUNWIND= (meaning the historical gcc library is used)

WITH_LLVM_LIBUNWIND=

In FreeBSD 11 only TARGET_ARCH=aarch64 uses WITH_LLVM_LIBUNWIND= by default. (In fact aarch64 requires it as I understand.)

In head (12) more TARGET_ARCH's have WITH_LLVM_LIBUNWIND= as the default at this stage. Other than aarch64 they allow picking which way to build because they had prior gcc based library support.

powerpc64 and powerpc still have WITHOUT_LLVM_LIBUNWIND= as the default in FreeBSD 12. I asked and the intent for 12 is to be able to use either unwind library for these. It is not clear to me if the default will survive to 12-CURRENT (which is a long ways off).

But FreeBSD 11 will not switch the default for powerpc64 or powerpc because of actively supporting gcc 4.2.1 based builds, with optional clang support possibly showing up via "Move From Current (12)" activity since clang 3.9.0 is scheduled to be MFC'd.

If one accidentally compares dwarfdump materials from FreeBSD's built with opposite selections for libunwind it can lead to confusions that might leave one expecting some defect(s) for unexplained differences.

Everything that I've reported here for dwarf C++ exception handling information and its use is for the gcc unwind library context.

markmi commented 7 years ago

Is this is a FreeBSD-specific problem; or does this cause problems on Linux as well?

I set up a Ubuntu Mate 16.10 powerpc64 environment and used dwarfdump on its /lib64/libgcc_s.so.1 . (Of course it was built with gcc compilers, not clang.) It shows things like. . .

The 4 scratch registers are explicitly listed for the appropriate exception handling routines that allow use of them, for example:

    83 DW_CFA_offset r3 -320  (40 * -8)
    85 DW_CFA_offset r4 -312  (39 * -8)
    87 DW_CFA_offset r5 -304  (38 * -8)
    89 DW_CFA_offset r6 -296  (37 * -8)

(No individual DW_CFA_restore for any of those 4.)

It also shows r70 use.

The full sequence for one of the relevant routines is:

fde section offset 1544 0x00000608 cie offset for fde: 1548 0x0000060c 0 DW_CFA_advance_loc 8 (2 4) 1 DW_CFA_def_cfa_offset 5648 4 DW_CFA_register r65 = r0 7 DW_CFA_advance_loc 168 (42 4) 8 DW_CFA_offset_extended_sf r65 16 (-2 -8) 11 DW_CFA_offset r46 -144 (18 -8) 13 DW_CFA_offset r47 -136 (17 -8) 15 DW_CFA_offset r48 -128 (16 -8) 17 DW_CFA_offset r49 -120 (15 -8) 19 DW_CFA_offset r50 -112 (14 -8) 21 DW_CFA_offset r51 -104 (13 -8) 23 DW_CFA_offset r52 -96 (12 -8) 25 DW_CFA_offset r53 -88 (11 -8) 27 DW_CFA_offset r54 -80 (10 -8) 29 DW_CFA_offset r55 -72 (9 -8) 31 DW_CFA_offset r56 -64 (8 -8) 33 DW_CFA_offset r57 -56 (7 -8) 35 DW_CFA_offset r58 -48 (6 -8) 37 DW_CFA_offset r59 -40 (5 -8) 39 DW_CFA_offset r60 -32 (4 -8) 41 DW_CFA_offset r61 -24 (3 -8) 43 DW_CFA_offset r62 -16 (2 -8) 45 DW_CFA_offset r63 -8 (1 -8) 47 DW_CFA_offset r14 -288 (36 -8) 49 DW_CFA_offset r15 -280 (35 -8) 51 DW_CFA_offset r16 -272 (34 -8) 53 DW_CFA_offset r17 -264 (33 -8) 55 DW_CFA_offset r18 -256 (32 -8) 57 DW_CFA_offset r19 -248 (31 -8) 59 DW_CFA_offset r20 -240 (30 -8) 61 DW_CFA_offset r21 -232 (29 -8) 63 DW_CFA_offset r22 -224 (28 -8) 65 DW_CFA_offset r23 -216 (27 -8) 67 DW_CFA_offset r24 -208 (26 -8) 69 DW_CFA_offset r25 -200 (25 -8) 71 DW_CFA_offset r26 -192 (24 -8) 73 DW_CFA_offset r27 -184 (23 -8) 75 DW_CFA_offset r28 -176 (22 -8) 77 DW_CFA_offset r29 -168 (21 -8) 79 DW_CFA_offset r30 -160 (20 -8) 81 DW_CFA_offset r31 -152 (19 -8) 83 DW_CFA_offset r3 -320 (40 -8) 85 DW_CFA_offset r4 -312 (39 -8) 87 DW_CFA_offset r5 -304 (38 -8) 89 DW_CFA_offset r6 -296 (37 -8) 91 DW_CFA_advance_loc 28 (7 4) 92 DW_CFA_offset_extended_sf r2 40 (-5 -8) 95 DW_CFA_offset_extended_sf r70 8 (-1 -8) 98 DW_CFA_advance_loc 236 (59 4) 99 DW_CFA_remember_state 100 DW_CFA_restore_extended r65 102 DW_CFA_advance_loc 28 (7 4) 103 DW_CFA_restore_extended r70 105 DW_CFA_advance_loc 112 (28 4) 106 DW_CFA_restore r63 107 DW_CFA_restore r62 108 DW_CFA_restore r61 109 DW_CFA_restore r60 110 DW_CFA_restore r59 111 DW_CFA_restore r58 112 DW_CFA_restore r57 113 DW_CFA_restore r56 114 DW_CFA_restore r55 115 DW_CFA_restore r54 116 DW_CFA_restore r53 117 DW_CFA_restore r52 118 DW_CFA_restore r51 119 DW_CFA_restore r50 120 DW_CFA_restore r49 121 DW_CFA_restore r48 122 DW_CFA_restore r47 123 DW_CFA_restore r46 124 DW_CFA_restore r31 125 DW_CFA_restore r30 126 DW_CFA_restore r29 127 DW_CFA_restore r28 128 DW_CFA_restore r27 129 DW_CFA_restore r26 130 DW_CFA_restore r25 131 DW_CFA_restore r24 132 DW_CFA_restore r23 133 DW_CFA_restore r22 134 DW_CFA_restore r21 135 DW_CFA_restore r20 136 DW_CFA_restore r19 137 DW_CFA_restore r18 138 DW_CFA_restore r17 139 DW_CFA_restore r16 140 DW_CFA_restore r15 141 DW_CFA_restore r14 142 DW_CFA_def_cfa_offset 0 144 DW_CFA_advance_loc 12 (3 4) 145 DW_CFA_restore_state 146 DW_CFA_nop 147 DW_CFA_nop 148 DW_CFA_nop 149 DW_CFA_nop 150 DW_CFA_nop

(I'm not aware of any linux that is clang built for powerpc64. I am aware of one for amd64 these days: https://sourceforge.net/projects/openmandriva/files/release/3.0/ .)

markmi commented 7 years ago

I include a short, simple program showing a C++ exception failure, not necessarily for a scratch register reference during the handling.

include

int main(void) { try { throw std::exception(); } catch (std::exception& e) {} return 0; }

Which gets:

Program terminated with signal SIGABRT, Aborted.

​0 0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7

(gdb) bt

​0 0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7

​1 0x00000000502f8818 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52

​2 0x00000000502f8748 in abort () at /usr/src/lib/libc/stdlib/abort.c:65

​3 0x00000000501c9cbc in _Unwind_GetGR (context=, index=65) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:180

​4 uw_update_context_1 (context=, fs=) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1353

​5 0x00000000501c78b0 in uw_init_context_1 (context=0xffffffffffffd1e0, outer_cfa=0xffffffffffffd940, outer_ra=0x50179ea0 <__cxa_throw(void, std::type_info, void ()(void))+248>)

at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1442

​6 0x00000000501c7418 in _Unwind_RaiseException (exc=) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:92

​7 0x0000000050179ea0 in throw_exception (ex=) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:774

​8 __cxa_throw (thrown_exception=, tinfo=, dest=) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:801

​9 0x0000000010000cf0 in .main ()

(gdb) up 3

​3 0x00000000501c9cbc in _Unwind_GetGR (context=, index=65) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:180

180 gcc_assert (size == sizeof(_Unwind_Word)); (gdb) list 175 / This will segfault if the register hasn't been saved. / 176 if (size == sizeof(_Unwind_Ptr)) 177 return (_Unwind_Ptr ) ptr; 178 else 179 { 180 gcc_assert (size == sizeof(_Unwind_Word)); 181 return (_Unwind_Word ) ptr; 182 } 183 } 184

markmi commented 7 years ago

Context for this note: powerpc64 example from a minor variation of FreeBSD's head -r309179 and its clang 3.9.0 . (Before I'd used powerpc but for now powerpc64 is easier to deal with.) This clang has the powerpc and powerpc64 patches that have been explicitly reported as having been made.

Later below is from a dwarfdump of the _Unwind_RaiseException code. One thing it still shows is a lack of records for the 4 registers r3, r4, r5, r6 that are used as the scratch registers from the ABI. Because of this _Unwind_SetGR and _Unwind_GetGR SIGSEGV when the scratch registers are what is being accessed in the exception handling code.

By contrast there now are r70 references that might be for the value from mfcr so some points of 26844 may have been addressed.

[Modern FreeBSD also handles more registers from modern powerpc64 variants now. The below seems to have additional material for that --but I've not matched things up 1-to-1 or the like.]

fde section offset 696 0x000002b8 cie offset for fde: 700 0x000002bc 0 DW_CFA_advance_loc 24 (6 4) 1 DW_CFA_def_cfa_offset 5840 4 DW_CFA_offset r31 -152 (19 -8) 6 DW_CFA_offset_extended_sf r65 16 (-2 -8) 9 DW_CFA_advance_loc 4 (1 4) 10 DW_CFA_def_cfa_register r31 12 DW_CFA_offset r14 -288 (36 -8) 14 DW_CFA_offset r15 -280 (35 -8) 16 DW_CFA_offset r16 -272 (34 -8) 18 DW_CFA_offset r17 -264 (33 -8) 20 DW_CFA_offset r18 -256 (32 -8) 22 DW_CFA_offset r19 -248 (31 -8) 24 DW_CFA_offset r20 -240 (30 -8) 26 DW_CFA_offset r21 -232 (29 -8) 28 DW_CFA_offset r22 -224 (28 -8) 30 DW_CFA_offset r23 -216 (27 -8) 32 DW_CFA_offset r24 -208 (26 -8) 34 DW_CFA_offset r25 -200 (25 -8) 36 DW_CFA_offset r26 -192 (24 -8) 38 DW_CFA_offset r27 -184 (23 -8) 40 DW_CFA_offset r28 -176 (22 -8) 42 DW_CFA_offset r29 -168 (21 -8) 44 DW_CFA_offset r30 -160 (20 -8) 46 DW_CFA_offset r46 -144 (18 -8) 48 DW_CFA_offset r47 -136 (17 -8) 50 DW_CFA_offset r48 -128 (16 -8) 52 DW_CFA_offset r49 -120 (15 -8) 54 DW_CFA_offset r50 -112 (14 -8) 56 DW_CFA_offset r51 -104 (13 -8) 58 DW_CFA_offset r52 -96 (12 -8) 60 DW_CFA_offset r53 -88 (11 -8) 62 DW_CFA_offset r54 -80 (10 -8) 64 DW_CFA_offset r55 -72 (9 -8) 66 DW_CFA_offset r56 -64 (8 -8) 68 DW_CFA_offset r57 -56 (7 -8) 70 DW_CFA_offset r58 -48 (6 -8) 72 DW_CFA_offset r59 -40 (5 -8) 74 DW_CFA_offset r60 -32 (4 -8) 76 DW_CFA_offset r61 -24 (3 -8) 78 DW_CFA_offset r62 -16 (2 -8) 80 DW_CFA_offset r63 -8 (1 -8) 82 DW_CFA_offset_extended_sf r70 8 (-1 -8) 85 DW_CFA_offset_extended_sf r70 8 (-1 -8) 88 DW_CFA_offset_extended_sf r70 8 (-1 -8) 91 DW_CFA_offset_extended r97 -496 (62 -8) 94 DW_CFA_offset_extended r98 -480 (60 -8) 97 DW_CFA_offset_extended r99 -464 (58 -8) 100 DW_CFA_offset_extended r100 -448 (56 -8) 103 DW_CFA_offset_extended r101 -432 (54 -8) 106 DW_CFA_offset_extended r102 -416 (52 -8) 109 DW_CFA_offset_extended r103 -400 (50 -8) 112 DW_CFA_offset_extended r104 -384 (48 -8) 115 DW_CFA_offset_extended r105 -368 (46 -8) 118 DW_CFA_offset_extended r106 -352 (44 -8) 121 DW_CFA_offset_extended r107 -336 (42 -8) 124 DW_CFA_offset_extended r108 -320 (40 -8)

markmi commented 8 years ago

PPCFrameLowering::determineCalleeSaves has no code for the Itanium-like C++ Exception ABI's 4 scratch registers [r3, r4, r5, r6] for __builtin_unwind_init() contexts.

Apparently for __builtin_unwind_init() contexts (and any others?)

// For 32-bit SVR4, allocate the nonvolatile CR spill slot iff the // function uses CR 2, 3, or 4. if (!isPPC64 && !isDarwinABI && (SavedRegs.test(PPC::CR2) || SavedRegs.test(PPC::CR3) || SavedRegs.test(PPC::CR4))) { int FrameIdx = MFI->CreateFixedObject((uint64_t)4, (int64_t)-4, true); FI->setCRSpillFrameIndex(FrameIdx); }

in PPCFrameLowering::determineCalleeSaves [in Target/PowerPC/PPCFrameLowering.cpp] is not sufficient for the .eh_frame information to record the save/restore of (fields of) the CR.

This leaves the C++ Exception handing as not preserving these CR fields.

(Technically volatile-status need not be preserved but non-volatile-status needs to be preserved: In other words the whole CR could be restored and still comply with the SVR4 ABI.)

Side note:

PPCFrameLowering::determineCalleeSaves may be too late for the "double Frame Pointer Register store" problem.

markmi commented 8 years ago

[The following notes were written for a different bug report but part of the content overlaps with this one too. I include it all for context but the last 4 paragraphs or so are what directly apply here.]

For __builtin_unwind_init() consequences (desired vs. actual) and TargetFrameLowering::determineCalleeSaves behavior for at least powerpc and powerpc64. . .

When something like a Frame Pointer is already in use in how the call standard is implemented the code will already save the register and update it independent of the "SavedRegs.set(Reg)" status for the register.

And, in fact, "SavedRegs.set(Reg)" on the Frame Pointer Register that gets the save behavior after Frame Register update will replace the value that should later be restored with the updated value that should not be what is later restored. Such would not be a redundant store but instead an incorrect store.

That is what is actually happening for powerpc and powerpc64. (Any others?)

So for a call standard such as for powerpc64 where the Frame Pointer Register (such as r31) is saved and updated before the (other) callee saved registers are saved the Frame Pointer Register must not also be one of the "SavedRegs.set(Reg)" registers.

This is true even when "MF.getMMI().callsUnwindInit()" [__builtin_unwind_init() usage] indicates that overall all the potential callee-saved registers should be saved and restored: The Frame Pointer already is saved and restored and should not be "double saves/double restored".

The same is true for the plain powerpc code generation technique that is currently in use (but that violates the SVR4 call/return ABI properties by being more AIX-like that requires a red-zone on the low address side of the stack). It is likely true for correct SVR4 ABI stack pointer handling/timing as well.

For all I know there would be other machine architectures with the issue as well.

Another "MF.getMMI().callsUnwindInit()" [__builtin_unwind_init() usage] related issue seems to be that the .eh_frame information needs to span things like Fields CR2, CR3, and CR4 of the condition register (CR) for powerpc being non-volatile in SVR4's ABI: The FDE information does not seem to be recording everything that is non-volatile for this ABI (and possibly others). (Likely the whole CR is saved and the restore code would be the part to deal with selective restore if the other fields of CR are to actually guarantee volatile status. Volatile status does not need to be guaranteed from what I can tell.)

Yet another issue related to "MF.getMMI().callsUnwindInit()" [__builtin_unwind_init() usage] for when the Itanium C++ Exception ABI is in use is that there are 4 scratch registers in addition to the normal callee-saves/restored registers. As stands for powerpc and powerpc64 no place to save the scratch register values has been set up so that they can be restored for the landing pad code --and the .eh_frame information makes no reference to the scratch registers at all.

gcc generates save/restore code for r3, r4, r5, and r6 for the 4 scratch registers for powerpc and powerpc64 --and explicitly records the information in the FDE in the .eh_frame information. FreeBSD's system libgcc_s depends on this, as an example. It gets a SEGV from trying to use address zero restore r3.

It would appear to me that TargetFrameLowering::determineCalleeSaves should deal with the exception ABI's scratch registers (when there are such) but that is does not explicitly do so for any machine architecture. (Is all the logic elsewhere?)

markmi commented 8 years ago

For the clang 3.8.0 mfcr/mtcr related save/restore code generation for _Unwind_RaiseException it turns out that --even though the .eh_frame information does not list it at all-- the actual code has cr saved and parts restored:

Dump of assembler code for function _Unwind_RaiseException: 0x41b2ab80 <+0>: mflr r0 0x41b2ab84 <+4>: stw r31,-148(r1) 0x41b2ab88 <+8>: stw r30,-152(r1) 0x41b2ab8c <+12>: stw r0,4(r1) 0x41b2ab90 <+16>: stwu r1,-2992(r1) 0x41b2ab94 <+20>: mr r31,r1 0x41b2ab98 <+24>: mfcr r12 . . . 0x41b2ac2c <+172>: stw r12,2772(r31) . . . 0x41b2ad84 <+516>: lwz r12,2772(r31) . . . 0x41b2add8 <+600>: mtcrf 32,r12 . . . 0x41b2ade8 <+616>: mtcrf 16,r12 . . . 0x41b2adf8 <+632>: mtcrf 8,r12 . . .

Unfortunately the C++ Exception handling via interpreting .eh_frame will not take advantage of the "2772(r31)" in any way because of the lack of such .eh_frame information.

(Of course the main point of the submittal was the lack of the ABI's scratch registers.)

markmi commented 8 years ago

Is this is a FreeBSD-specific problem; or does this cause problems on Linux as well?

I do not have a Linux environment to investigate with at this time, much less one based on clang 3.8.0. (Nor am I familiar with that kind of context at this point.) I also do not have prior history to know about older clang's. (clang is just now getting close to being able to do a FreeBSD "buildworld" that generally works.)

My understanding is that all the processor specific variants of the C++ Itanium ABI have the 4-scratch-register related requirements (and that the cr-in-call-standard usage is SVR4 ABI material). So I expect that any environment based on the C++ Itanium ABI would need to produce the scratch register related .eh_frame material (and the matching code).

I do not know just what the compilation keys off of to decide to generate more saves/restores than _Unwind_RaiseException actually has register usage for. But clang did set up the full save/restore for r14-r31 and the floating point registers despite _Unwind_RaiseException not using all of them. So something got triggered somehow to save/restore more than normal.

As the .eh_frame material (and related code) comes out of clang 3.8.0 I do not see how this would be "FreeBSD only" unless some patch/misconfiguration was involved that made clang 3.8.0 a variant. But I do not know what to look for to check on such issues on the FreeBSD side. With pointers to what to look for I might be able to check or get others to check.

hfinkel commented 8 years ago

Is this is a FreeBSD-specific problem; or does this cause problems on Linux as well?