Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

__builtin_eh_return is not fully implemented for PPC nor PPC64: no DW_CFA_offset output for scratch registers( r3-r6) [still true of 8.0.0] #26843

Closed Quuxplusone closed 4 years ago

Quuxplusone commented 8 years ago
Bugzilla Link PR26844
Status RESOLVED INVALID
Importance P normal
Reported by Mark Millard (marklmi26-fbsd@yahoo.com)
Reported on 2016-03-04 15:36:29 -0800
Last modified on 2020-01-25 18:34:05 -0800
Version 8.0
Hardware Other FreeBSD
CC chmeeedalf@gmail.com, dgregor@apple.com, emaste@freebsd.org, hfinkel@anl.gov, kit.barton@gmail.com, llvm-bugs@lists.llvm.org, nemanja.i.ibm@gmail.com, rdivacky@freebsd.org, ulrich.weigand@de.ibm.com, willdtz@gmail.com, wschmidt@linux.vnet.ibm.com
Fixed by commit(s)
Attachments
Blocks PR25780
Blocked by
See also
[This is for FreeBSD's projects/clang380-import-r296011 at the time of
submittal.]

In the normal C++ ABI exception handling uses 4 scratch registers: "These
scratch registers are reserved for passing arguments between the personality
routine and the landing pads". This is for "an unwind context representing a
handler frame, for which the personality routine will return
_URC_INSTALL_CONTEXT" (_Unwind_SetGR).

Prior to executing in the landing pad the context record is used to restore
various registers "to their state in the frame before the call that threw that
exception": callee-saved (not altered by the personality routine) and scratch.

For TARGET_ARCH=powerpc (or powerpc64) FreeBSD's system libgcc_s
_Unwind_RaiseException gets differing behavior for this between gcc/g++ and
clang/clang++ for the scratch registers (and more):

A) gcc/g++ uses the 4 registers r3, r4, r5, r6 as the scratch registers and
sets up to save/restore them, doing so being reported on by the .eh_frame
information for _Unwind_RaiseException. It also sets up to save/restore R14-r31
and floating point registers as r46-r63. It also sets up to save/restore "r70"
(holding the value from mfcr). (The powerpc ABI uses a bit in the cr for
floating point usage information in the call standard.)

B) clang/clang++ does not set up to save/restore any scratch registers or a
value from mfcr. It does set up to save/restore r14-r31 and floating point
registers as r46-r63.

The result is that if things get to the point of the following FreeBSD code as
part of handling the exception then the first _Unwind_SetGR gets a SEGV.
__builtin_eh_return_data_regno (0) returns 3 to identify r3 as the register
context.

678   /* For targets with pointers smaller than the word size, we must extend
the
679      pointer, and this extension is target dependent.  */
680   _Unwind_SetGR (context, __builtin_eh_return_data_regno (0),
681          __builtin_extend_pointer (ue_header));
682   _Unwind_SetGR (context, __builtin_eh_return_data_regno (1),
683          handler_switch_value);
684   _Unwind_SetIP (context, landing_pad);

As evidence of the difference for scratch register handling (and "r70"/cr
handling). . .

gcc/g++ dwarfdump -v -v -F output extraction for _Unwind_RaiseException:

fde section offset 1104 0x00000450 cie offset for fde: 1108 0x00000454
        0 DW_CFA_advance_loc 8  (8 * 1)
        1 DW_CFA_def_cfa_offset 3024
        4 DW_CFA_advance_loc1 156
        6 DW_CFA_offset r4 -232  (58 * -4)
        8 DW_CFA_offset r3 -236  (59 * -4)
       10 DW_CFA_offset r28 -160  (40 * -4)
       12 DW_CFA_offset r27 -164  (41 * -4)
       14 DW_CFA_offset r26 -168  (42 * -4)
       16 DW_CFA_offset r25 -172  (43 * -4)
       18 DW_CFA_offset r24 -176  (44 * -4)
       20 DW_CFA_offset r23 -180  (45 * -4)
       22 DW_CFA_offset r22 -184  (46 * -4)
       24 DW_CFA_offset r21 -188  (47 * -4)
       26 DW_CFA_offset r20 -192  (48 * -4)
       28 DW_CFA_offset r19 -196  (49 * -4)
       30 DW_CFA_offset r18 -200  (50 * -4)
       32 DW_CFA_offset r17 -204  (51 * -4)
       34 DW_CFA_offset r16 -208  (52 * -4)
       36 DW_CFA_offset r15 -212  (53 * -4)
       38 DW_CFA_offset r14 -216  (54 * -4)
       40 DW_CFA_offset r63 -8  (2 * -4)
       42 DW_CFA_offset r62 -16  (4 * -4)
       44 DW_CFA_offset r61 -24  (6 * -4)
       46 DW_CFA_offset r60 -32  (8 * -4)
       48 DW_CFA_offset r59 -40  (10 * -4)
       50 DW_CFA_offset r58 -48  (12 * -4)
       52 DW_CFA_offset r57 -56  (14 * -4)
       54 DW_CFA_offset r56 -64  (16 * -4)
       56 DW_CFA_offset r55 -72  (18 * -4)
       58 DW_CFA_offset r54 -80  (20 * -4)
       60 DW_CFA_offset r53 -88  (22 * -4)
       62 DW_CFA_offset r52 -96  (24 * -4)
       64 DW_CFA_offset r51 -104  (26 * -4)
       66 DW_CFA_offset r50 -112  (28 * -4)
       68 DW_CFA_offset r49 -120  (30 * -4)
       70 DW_CFA_offset r48 -128  (32 * -4)
       72 DW_CFA_offset r47 -136  (34 * -4)
       74 DW_CFA_offset r46 -144  (36 * -4)
       76 DW_CFA_register r70 = r12
       79 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
       82 DW_CFA_advance_loc 32  (32 * 1)
       83 DW_CFA_offset r5 -228  (57 * -4)
       85 DW_CFA_offset r31 -148  (37 * -4)
       87 DW_CFA_offset r30 -152  (38 * -4)
       89 DW_CFA_offset r29 -156  (39 * -4)
       91 DW_CFA_offset_extended r70 -220  (55 * -4)
       94 DW_CFA_offset r6 -224  (56 * -4)
       96 DW_CFA_nop
       97 DW_CFA_nop
       98 DW_CFA_nop

clang/clang++ 3.8.0 dwarfdump -v -v -F output extraction for
_Unwind_RaiseException:

fde section offset 692 0x000002b4 cie offset for fde: 696 0x000002b8
        0 DW_CFA_advance_loc 20  (5 * 4)
        1 DW_CFA_def_cfa_offset 2992
        4 DW_CFA_offset r31 -148  (37 * -4)
        6 DW_CFA_offset r30 -152  (38 * -4)
        8 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
       11 DW_CFA_advance_loc 4  (1 * 4)
       12 DW_CFA_def_cfa_register r31
       14 DW_CFA_offset r14 -216  (54 * -4)
       16 DW_CFA_offset r15 -212  (53 * -4)
       18 DW_CFA_offset r16 -208  (52 * -4)
       20 DW_CFA_offset r17 -204  (51 * -4)
       22 DW_CFA_offset r18 -200  (50 * -4)
       24 DW_CFA_offset r19 -196  (49 * -4)
       26 DW_CFA_offset r20 -192  (48 * -4)
       28 DW_CFA_offset r21 -188  (47 * -4)
       30 DW_CFA_offset r22 -184  (46 * -4)
       32 DW_CFA_offset r23 -180  (45 * -4)
       34 DW_CFA_offset r24 -176  (44 * -4)
       36 DW_CFA_offset r25 -172  (43 * -4)
       38 DW_CFA_offset r26 -168  (42 * -4)
       40 DW_CFA_offset r27 -164  (41 * -4)
       42 DW_CFA_offset r28 -160  (40 * -4)
       44 DW_CFA_offset r29 -156  (39 * -4)
       46 DW_CFA_offset r30 -152  (38 * -4)
       48 DW_CFA_offset r31 -148  (37 * -4)
       50 DW_CFA_offset r46 -144  (36 * -4)
       52 DW_CFA_offset r47 -136  (34 * -4)
       54 DW_CFA_offset r48 -128  (32 * -4)
       56 DW_CFA_offset r49 -120  (30 * -4)
       58 DW_CFA_offset r50 -112  (28 * -4)
       60 DW_CFA_offset r51 -104  (26 * -4)
       62 DW_CFA_offset r52 -96  (24 * -4)
       64 DW_CFA_offset r53 -88  (22 * -4)
       66 DW_CFA_offset r54 -80  (20 * -4)
       68 DW_CFA_offset r55 -72  (18 * -4)
       70 DW_CFA_offset r56 -64  (16 * -4)
       72 DW_CFA_offset r57 -56  (14 * -4)
       74 DW_CFA_offset r58 -48  (12 * -4)
       76 DW_CFA_offset r59 -40  (10 * -4)
       78 DW_CFA_offset r60 -32  (8 * -4)
       80 DW_CFA_offset r61 -24  (6 * -4)
       82 DW_CFA_offset r62 -16  (4 * -4)
       84 DW_CFA_offset r63 -8  (2 * -4)
       86 DW_CFA_nop
Quuxplusone commented 8 years ago

Is this is a FreeBSD-specific problem; or does this cause problems on Linux as well?

Quuxplusone commented 8 years ago
(In reply to comment #1)

> Is this is a FreeBSD-specific problem; or does this cause problems on Linux
> as well?

I do not have a Linux environment to investigate with at this time, much less
one based on clang 3.8.0. (Nor am I familiar with that kind of context at this
point.) I also do not have prior history to know about older clang's. (clang is
just now getting close to being able to do a FreeBSD "buildworld" that
generally works.)

My understanding is that all the processor specific variants of the C++ Itanium
ABI have the 4-scratch-register related requirements (and that the cr-in-call-
standard usage is SVR4 ABI material). So I expect that any environment based on
the C++ Itanium ABI would need to produce the scratch register related
.eh_frame material (and the matching code).

I do not know just what the compilation keys off of to decide to generate more
saves/restores than _Unwind_RaiseException actually has register usage for. But
clang did set up the full save/restore for r14-r31 and the floating point
registers despite _Unwind_RaiseException not using all of them. So something
got triggered somehow to save/restore more than normal.

As the .eh_frame material (and related code) comes out of clang 3.8.0 I do not
see how this would be "FreeBSD only" unless some patch/misconfiguration was
involved that made clang 3.8.0 a variant. But I do not know what to look for to
check on such issues on the FreeBSD side. With pointers to what to look for I
might be able to check or get others to check.
Quuxplusone commented 8 years ago
For the clang 3.8.0 mfcr/mtcr related save/restore code generation for
_Unwind_RaiseException it turns out that --even though the .eh_frame
information does not list it at all-- the actual code has cr saved and parts
restored:

Dump of assembler code for function _Unwind_RaiseException:
   0x41b2ab80 <+0>: mflr    r0
   0x41b2ab84 <+4>: stw     r31,-148(r1)
   0x41b2ab88 <+8>: stw     r30,-152(r1)
   0x41b2ab8c <+12>:    stw     r0,4(r1)
   0x41b2ab90 <+16>:    stwu    r1,-2992(r1)
   0x41b2ab94 <+20>:    mr      r31,r1
   0x41b2ab98 <+24>:    mfcr    r12
. . .
   0x41b2ac2c <+172>:   stw     r12,2772(r31)
. . .
   0x41b2ad84 <+516>:   lwz     r12,2772(r31)
. . .
   0x41b2add8 <+600>:   mtcrf   32,r12
. . .
   0x41b2ade8 <+616>:   mtcrf   16,r12
. . .
   0x41b2adf8 <+632>:   mtcrf   8,r12
. . .

Unfortunately the C++ Exception handling via interpreting .eh_frame will not
take advantage of the "2772(r31)" in any way because of the lack of such
.eh_frame information.

(Of course the main point of the submittal was the lack of the ABI's scratch
registers.)
Quuxplusone commented 8 years ago
[The following notes were written for a different bug report but part of the
content overlaps with this one too. I include it all for context but the last 4
paragraphs or so are what directly apply here.]

For __builtin_unwind_init() consequences (desired vs. actual) and
TargetFrameLowering::determineCalleeSaves behavior for at least powerpc and
powerpc64. . .

When something like a Frame Pointer is already in use in how the call standard
is implemented the code will already save the register and update it
independent of the "SavedRegs.set(Reg)" status for the register.

And, in fact, "SavedRegs.set(Reg)" on the Frame Pointer Register that gets the
save behavior after Frame Register update will replace the value that should
later be restored with the updated value that should not be what is later
restored. Such would not be a redundant store but instead an incorrect store.

That is what is actually happening for powerpc and powerpc64. (Any others?)

So for a call standard such as for powerpc64 where the Frame Pointer Register
(such as r31) is saved and updated before the (other) callee saved registers
are saved the Frame Pointer Register must not also be one of the
"SavedRegs.set(Reg)" registers.

This is true even when "MF.getMMI().callsUnwindInit()" [__builtin_unwind_init()
usage] indicates that overall all the potential callee-saved registers should
be saved and restored: The Frame Pointer already is saved and restored and
should not be "double saves/double restored".

The same is true for the plain powerpc code generation technique that is
currently in use (but that violates the SVR4 call/return ABI properties by
being more AIX-like that requires a red-zone on the low address side of the
stack). It is likely true for correct SVR4 ABI stack pointer handling/timing as
well.

For all I know there would be other machine architectures with the issue as
well.

Another "MF.getMMI().callsUnwindInit()" [__builtin_unwind_init() usage] related
issue seems to be that the .eh_frame information needs to span things like
Fields CR2, CR3, and CR4 of the condition register (CR) for powerpc being non-
volatile in SVR4's ABI: The FDE information does not seem to be recording
everything that is non-volatile for this ABI (and possibly others). (Likely the
whole CR is saved and the restore code would be the part to deal with selective
restore if the other fields of CR are to actually guarantee volatile status.
Volatile status does not need to be guaranteed from what I can tell.)

Yet another issue related to "MF.getMMI().callsUnwindInit()"
[__builtin_unwind_init() usage] for when the Itanium C++ Exception ABI is in
use is that there are 4 scratch registers in addition to the normal callee-
saves/restored registers. As stands for powerpc and powerpc64 no place to save
the scratch register values has been set up so that they can be restored for
the landing pad code --and the .eh_frame information makes no reference to the
scratch registers at all.

gcc generates save/restore code for r3, r4, r5, and r6 for the 4 scratch
registers for powerpc and powerpc64 --and explicitly records the information in
the FDE in the .eh_frame information. FreeBSD's system libgcc_s depends on
this, as an example. It gets a SEGV from trying to use address zero restore r3.

It would appear to me that TargetFrameLowering::determineCalleeSaves should
deal with the exception ABI's scratch registers (when there are such) but that
is does not explicitly do so for any machine architecture. (Is all the logic
elsewhere?)
Quuxplusone commented 8 years ago
PPCFrameLowering::determineCalleeSaves has no code for the Itanium-like C++
Exception ABI's 4 scratch registers [r3, r4, r5, r6] for
__builtin_unwind_init() contexts.

Apparently for __builtin_unwind_init() contexts (and any others?)

  // For 32-bit SVR4, allocate the nonvolatile CR spill slot iff the
  // function uses CR 2, 3, or 4.
  if (!isPPC64 && !isDarwinABI &&
      (SavedRegs.test(PPC::CR2) ||
       SavedRegs.test(PPC::CR3) ||
       SavedRegs.test(PPC::CR4))) {
    int FrameIdx = MFI->CreateFixedObject((uint64_t)4, (int64_t)-4, true);
    FI->setCRSpillFrameIndex(FrameIdx);
  }

in PPCFrameLowering::determineCalleeSaves [in
Target/PowerPC/PPCFrameLowering.cpp] is not sufficient for the .eh_frame
information to record the save/restore of (fields of) the CR.

This leaves the C++ Exception handing as not preserving these CR fields.

(Technically volatile-status need not be preserved but non-volatile-status
needs to be preserved: In other words the whole CR could be restored and still
comply with the SVR4 ABI.)

Side note:

PPCFrameLowering::determineCalleeSaves may be too late for the "double Frame
Pointer Register store" problem.
Quuxplusone commented 8 years ago
(In reply to comment #1)

Context for this note: powerpc64 example from a minor variation of
FreeBSD's head -r309179 and its clang 3.9.0 . (Before I'd used
powerpc but for now powerpc64 is easier to deal with.) This
clang has the powerpc and powerpc64 patches that have been
explicitly reported as having been made.

Later below is from a dwarfdump of the _Unwind_RaiseException code.
One thing it still shows is a lack of records for the 4 registers
r3, r4, r5, r6 that are used as the scratch registers from the ABI.
Because of this _Unwind_SetGR and _Unwind_GetGR SIGSEGV when the
scratch registers are what is being accessed in the exception
handling code.

By contrast there now are r70 references that might be for the
value from mfcr so some points of 26844 may have been addressed.

[Modern FreeBSD also handles more registers from modern powerpc64
variants now. The below seems to have additional material for
that --but I've not matched things up 1-to-1 or the like.]

 fde section offset 696 0x000002b8 cie offset for fde: 700 0x000002bc
         0 DW_CFA_advance_loc 24  (6 * 4)
         1 DW_CFA_def_cfa_offset 5840
         4 DW_CFA_offset r31 -152  (19 * -8)
         6 DW_CFA_offset_extended_sf r65 16  (-2 * -8)
         9 DW_CFA_advance_loc 4  (1 * 4)
        10 DW_CFA_def_cfa_register r31
        12 DW_CFA_offset r14 -288  (36 * -8)
        14 DW_CFA_offset r15 -280  (35 * -8)
        16 DW_CFA_offset r16 -272  (34 * -8)
        18 DW_CFA_offset r17 -264  (33 * -8)
        20 DW_CFA_offset r18 -256  (32 * -8)
        22 DW_CFA_offset r19 -248  (31 * -8)
        24 DW_CFA_offset r20 -240  (30 * -8)
        26 DW_CFA_offset r21 -232  (29 * -8)
        28 DW_CFA_offset r22 -224  (28 * -8)
        30 DW_CFA_offset r23 -216  (27 * -8)
        32 DW_CFA_offset r24 -208  (26 * -8)
        34 DW_CFA_offset r25 -200  (25 * -8)
        36 DW_CFA_offset r26 -192  (24 * -8)
        38 DW_CFA_offset r27 -184  (23 * -8)
        40 DW_CFA_offset r28 -176  (22 * -8)
        42 DW_CFA_offset r29 -168  (21 * -8)
        44 DW_CFA_offset r30 -160  (20 * -8)
        46 DW_CFA_offset r46 -144  (18 * -8)
        48 DW_CFA_offset r47 -136  (17 * -8)
        50 DW_CFA_offset r48 -128  (16 * -8)
        52 DW_CFA_offset r49 -120  (15 * -8)
        54 DW_CFA_offset r50 -112  (14 * -8)
        56 DW_CFA_offset r51 -104  (13 * -8)
        58 DW_CFA_offset r52 -96  (12 * -8)
        60 DW_CFA_offset r53 -88  (11 * -8)
        62 DW_CFA_offset r54 -80  (10 * -8)
        64 DW_CFA_offset r55 -72  (9 * -8)
        66 DW_CFA_offset r56 -64  (8 * -8)
        68 DW_CFA_offset r57 -56  (7 * -8)
        70 DW_CFA_offset r58 -48  (6 * -8)
        72 DW_CFA_offset r59 -40  (5 * -8)
        74 DW_CFA_offset r60 -32  (4 * -8)
        76 DW_CFA_offset r61 -24  (3 * -8)
        78 DW_CFA_offset r62 -16  (2 * -8)
        80 DW_CFA_offset r63 -8  (1 * -8)
        82 DW_CFA_offset_extended_sf r70 8  (-1 * -8)
        85 DW_CFA_offset_extended_sf r70 8  (-1 * -8)
        88 DW_CFA_offset_extended_sf r70 8  (-1 * -8)
        91 DW_CFA_offset_extended r97 -496  (62 * -8)
        94 DW_CFA_offset_extended r98 -480  (60 * -8)
        97 DW_CFA_offset_extended r99 -464  (58 * -8)
        100 DW_CFA_offset_extended r100 -448  (56 * -8)
        103 DW_CFA_offset_extended r101 -432  (54 * -8)
        106 DW_CFA_offset_extended r102 -416  (52 * -8)
        109 DW_CFA_offset_extended r103 -400  (50 * -8)
        112 DW_CFA_offset_extended r104 -384  (48 * -8)
        115 DW_CFA_offset_extended r105 -368  (46 * -8)
        118 DW_CFA_offset_extended r106 -352  (44 * -8)
        121 DW_CFA_offset_extended r107 -336  (42 * -8)
        124 DW_CFA_offset_extended r108 -320  (40 * -8)
Quuxplusone commented 8 years ago
(In reply to comment #6)

I include a short, simple program showing a C++ exception
failure, not necessarily for a scratch register reference
during the handling.

#include <exception>

int main(void)
{
    try { throw std::exception(); }
    catch (std::exception& e) {}
    return 0;
}

Which gets:

Program terminated with signal SIGABRT, Aborted.
#0  0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7
(gdb) bt
#0  0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7
#1  0x00000000502f8818 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x00000000502f8748 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x00000000501c9cbc in _Unwind_GetGR (context=<optimized out>, index=65) at
/usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:180
#4  uw_update_context_1 (context=<optimized out>, fs=<optimized out>) at
/usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1353
#5  0x00000000501c78b0 in uw_init_context_1 (context=0xffffffffffffd1e0,
outer_cfa=0xffffffffffffd940, outer_ra=0x50179ea0 <__cxa_throw(void*,
std::type_info*, void (*)(void*))+248>)
    at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1442
#6  0x00000000501c7418 in _Unwind_RaiseException (exc=<optimized out>) at
/usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:92
#7  0x0000000050179ea0 in throw_exception (ex=<optimized out>) at
/usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:774
#8  __cxa_throw (thrown_exception=<optimized out>, tinfo=<optimized out>,
dest=<optimized out>) at
/usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:801
#9  0x0000000010000cf0 in .main ()
(gdb) up 3
#3  0x00000000501c9cbc in _Unwind_GetGR (context=<optimized out>, index=65) at
/usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:180
180       gcc_assert (size == sizeof(_Unwind_Word));
(gdb) list
175   /* This will segfault if the register hasn't been saved.  */
176   if (size == sizeof(_Unwind_Ptr))
177     return * (_Unwind_Ptr *) ptr;
178   else
179     {
180       gcc_assert (size == sizeof(_Unwind_Word));
181       return * (_Unwind_Word *) ptr;
182     }
183 }
184
Quuxplusone commented 8 years ago
(In reply to comment #1)
> Is this is a FreeBSD-specific problem; or does this cause problems on Linux
> as well?

I set up a Ubuntu Mate 16.10 powerpc64 environment and used dwarfdump on
its /lib64/libgcc_s.so.1 . (Of course it was built with gcc compilers,
not clang.) It shows things like. . .

The 4 scratch registers are explicitly listed for the appropriate
exception handling routines that allow use of them, for example:

        83 DW_CFA_offset r3 -320  (40 * -8)
        85 DW_CFA_offset r4 -312  (39 * -8)
        87 DW_CFA_offset r5 -304  (38 * -8)
        89 DW_CFA_offset r6 -296  (37 * -8)

(No individual DW_CFA_restore for any of those 4.)

It also shows r70 use.

The full sequence for one of the relevant routines is:

 fde section offset 1544 0x00000608 cie offset for fde: 1548 0x0000060c
         0 DW_CFA_advance_loc 8  (2 * 4)
         1 DW_CFA_def_cfa_offset 5648
         4 DW_CFA_register r65 = r0
         7 DW_CFA_advance_loc 168  (42 * 4)
         8 DW_CFA_offset_extended_sf r65 16  (-2 * -8)
        11 DW_CFA_offset r46 -144  (18 * -8)
        13 DW_CFA_offset r47 -136  (17 * -8)
        15 DW_CFA_offset r48 -128  (16 * -8)
        17 DW_CFA_offset r49 -120  (15 * -8)
        19 DW_CFA_offset r50 -112  (14 * -8)
        21 DW_CFA_offset r51 -104  (13 * -8)
        23 DW_CFA_offset r52 -96  (12 * -8)
        25 DW_CFA_offset r53 -88  (11 * -8)
        27 DW_CFA_offset r54 -80  (10 * -8)
        29 DW_CFA_offset r55 -72  (9 * -8)
        31 DW_CFA_offset r56 -64  (8 * -8)
        33 DW_CFA_offset r57 -56  (7 * -8)
        35 DW_CFA_offset r58 -48  (6 * -8)
        37 DW_CFA_offset r59 -40  (5 * -8)
        39 DW_CFA_offset r60 -32  (4 * -8)
        41 DW_CFA_offset r61 -24  (3 * -8)
        43 DW_CFA_offset r62 -16  (2 * -8)
        45 DW_CFA_offset r63 -8  (1 * -8)
        47 DW_CFA_offset r14 -288  (36 * -8)
        49 DW_CFA_offset r15 -280  (35 * -8)
        51 DW_CFA_offset r16 -272  (34 * -8)
        53 DW_CFA_offset r17 -264  (33 * -8)
        55 DW_CFA_offset r18 -256  (32 * -8)
        57 DW_CFA_offset r19 -248  (31 * -8)
        59 DW_CFA_offset r20 -240  (30 * -8)
        61 DW_CFA_offset r21 -232  (29 * -8)
        63 DW_CFA_offset r22 -224  (28 * -8)
        65 DW_CFA_offset r23 -216  (27 * -8)
        67 DW_CFA_offset r24 -208  (26 * -8)
        69 DW_CFA_offset r25 -200  (25 * -8)
        71 DW_CFA_offset r26 -192  (24 * -8)
        73 DW_CFA_offset r27 -184  (23 * -8)
        75 DW_CFA_offset r28 -176  (22 * -8)
        77 DW_CFA_offset r29 -168  (21 * -8)
        79 DW_CFA_offset r30 -160  (20 * -8)
        81 DW_CFA_offset r31 -152  (19 * -8)
        83 DW_CFA_offset r3 -320  (40 * -8)
        85 DW_CFA_offset r4 -312  (39 * -8)
        87 DW_CFA_offset r5 -304  (38 * -8)
        89 DW_CFA_offset r6 -296  (37 * -8)
        91 DW_CFA_advance_loc 28  (7 * 4)
        92 DW_CFA_offset_extended_sf r2 40  (-5 * -8)
        95 DW_CFA_offset_extended_sf r70 8  (-1 * -8)
        98 DW_CFA_advance_loc 236  (59 * 4)
        99 DW_CFA_remember_state
        100 DW_CFA_restore_extended r65
        102 DW_CFA_advance_loc 28  (7 * 4)
        103 DW_CFA_restore_extended r70
        105 DW_CFA_advance_loc 112  (28 * 4)
        106 DW_CFA_restore r63
        107 DW_CFA_restore r62
        108 DW_CFA_restore r61
        109 DW_CFA_restore r60
        110 DW_CFA_restore r59
        111 DW_CFA_restore r58
        112 DW_CFA_restore r57
        113 DW_CFA_restore r56
        114 DW_CFA_restore r55
        115 DW_CFA_restore r54
        116 DW_CFA_restore r53
        117 DW_CFA_restore r52
        118 DW_CFA_restore r51
        119 DW_CFA_restore r50
        120 DW_CFA_restore r49
        121 DW_CFA_restore r48
        122 DW_CFA_restore r47
        123 DW_CFA_restore r46
        124 DW_CFA_restore r31
        125 DW_CFA_restore r30
        126 DW_CFA_restore r29
        127 DW_CFA_restore r28
        128 DW_CFA_restore r27
        129 DW_CFA_restore r26
        130 DW_CFA_restore r25
        131 DW_CFA_restore r24
        132 DW_CFA_restore r23
        133 DW_CFA_restore r22
        134 DW_CFA_restore r21
        135 DW_CFA_restore r20
        136 DW_CFA_restore r19
        137 DW_CFA_restore r18
        138 DW_CFA_restore r17
        139 DW_CFA_restore r16
        140 DW_CFA_restore r15
        141 DW_CFA_restore r14
        142 DW_CFA_def_cfa_offset 0
        144 DW_CFA_advance_loc 12  (3 * 4)
        145 DW_CFA_restore_state
        146 DW_CFA_nop
        147 DW_CFA_nop
        148 DW_CFA_nop
        149 DW_CFA_nop
        150 DW_CFA_nop

(I'm not aware of any linux that is clang built for powerpc64. I am aware
of one for amd64 these days:
https://sourceforge.net/projects/openmandriva/files/release/3.0/ .)
Quuxplusone commented 8 years ago
A note about the context for this report:

Summary: TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc are still based on
the gcc unwind library in FreeBSD 12 (so far) and in 11. My reports are
about that context in the details presented. Avoid comparing dwarfdump
output from llvm's libunwind unless you know what differences to expect.

FreeBSD 11 and 12 have a control for picking which unwind library is built
in buildworld:

WITHOUT_LLVM_LIBUNWIND=
(meaning the historical gcc library is used)

WITH_LLVM_LIBUNWIND=

In FreeBSD 11 only TARGET_ARCH=aarch64 uses WITH_LLVM_LIBUNWIND= by
default. (In fact aarch64 requires it as I understand.)

In head (12) more TARGET_ARCH's have WITH_LLVM_LIBUNWIND= as the
default at this stage. Other than aarch64 they allow picking which
way to build because they had prior gcc based library support.

powerpc64 and powerpc still have WITHOUT_LLVM_LIBUNWIND= as the
default in FreeBSD 12. I asked and the intent for 12 is to be able
to use either unwind library for these. It is not clear to me if
the default will survive to 12-CURRENT (which is a long ways off).

But FreeBSD 11 will not switch the default for powerpc64 or powerpc
because of actively supporting gcc 4.2.1 based builds, with optional
clang support possibly showing up via "Move From Current (12)"
activity since clang 3.9.0 is scheduled to be MFC'd.

If one accidentally compares dwarfdump materials from FreeBSD's built
with opposite selections for libunwind it can lead to confusions that
might leave one expecting some defect(s) for unexplained differences.

Everything that I've reported here for dwarf C++ exception handling
information and its use is for the gcc unwind library context.
Quuxplusone commented 7 years ago
FreeBSD status update. . .

stable/11 from FreeBSD has for a time had as its system
compiler its variant of 3.9.1 . So the FreeBSD 11.1
release will have the powerpc64 and powerpc fixes that
have been applied plus any others that they merge in
before then. And so on for 11.x (1<x).

(Unfortunately powerpc still has stack handling problems
relative to r30 usage [26519] and floating point code and
C++ exception handling does not work for powerpc or
powerpc64.)

I'd guess that head (12) and stable/11 would continue to
eventually pick up powerpc and powerpc64 fixes that fit
in clang 3.9.1 easily.

I'll also note that projects/clang400-import has started
so at least the eventual 12.x releases will likely be based
on clang 4.0.x . I've no clue if stable/11 might pick up
such at some point.
Quuxplusone commented 7 years ago
(In reply to Mark Millard from comment #10)
> FreeBSD status update. . .

FYI: FreeBSD has merged clang/llvm 4.0 into stable/11 (-r316423)
in addition to head (12) some time ago. release/11.1.0 will be
clang/llvm 4.0 based.

TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc still does not
handle thrown exceptions in this clang version.

TARGET_ARCH=powerpc still has the R30 vs. tack handling vs.
floating point code problem: restored R30 and then generates
floating point cleanup code that expects R30 to not have
changed yet.
Quuxplusone commented 7 years ago
(In reply to Mark Millard from comment #7)
> (In reply to comment #6)

[Context: FreeBSD head -r317820 system clang
with a couple of patches I've been given to
test for folks.]

Admittedly the following is based in part on
a couple of patches that I'd been given to
test for folks but I figure the following
change of status details is likely important
relative to various specifics that I wrote
originally:

For:

#include <exception>

int main(void)
{
    try { throw std::exception(); }
    catch (std::exception& e) {}
    return 0;
}

I now get a different backtrace when it fails. . .

Program received signal SIGSEGV, Segmentation fault.
0x41b355d0 in _Unwind_SetGR (context=<value optimized out>, index=<value
optimized out>, val=1105281072) at unwind-dw2-fde.h:162
162 {
Current language:  auto; currently minimal
(gdb) bt
#0  0x41b355d0 in _Unwind_SetGR (context=<value optimized out>, index=<value
optimized out>, val=1105281072) at unwind-dw2-fde.h:162
#1  0x4192e370 in __gxx_personality_v0 (version=<value optimized out>,
actions=<value optimized out>, exceptionObject=0x41e14030, context=0xffffd590)
at /usr/src/contrib/libcxxrt/exception.cc:1203
#2  0x41b36234 in _Unwind_RaiseException_Phase2 (exc=<value optimized out>,
context=<value optimized out>) at unwind.inc:66
#3  0x41b35e10 in _Unwind_RaiseException (exc=0xffffd590) at unwind.inc:135
#4  0x4192d870 in __cxa_throw (thrown_exception=<value optimized out>,
tinfo=<value optimized out>, dest=<value optimized out>) at
/usr/src/contrib/libcxxrt/exception.cc:774
#5  0x01800954 in main () at exception_test.cpp:5
Quuxplusone commented 7 years ago
An update based on looking at what
FreeBSD's clang 5 is doing and
not doing compared to FreeBSD's
devel/powerpc64-gcc port.

It is still broken and thrown C++
exceptions attempt to dereference
a NULL pointer and fail.

Overall the following pair are supposed to
control the behavior for the save/restores
relative to special C++ exception handling
routines in FreeBSD library code (such
as its libgcc_s.so.1 ):

__builtin_unwind_init()
__builtin_eh_return(offset, handler)

The presence of the 2nd one is supposed
to include causing the scratch register
handling for the ABI involved as I
understand. This is not happening.
That is what leads to the lack of
information that causes the NULL
that ends up being used as I understand.

For powerpc64 (and powerpc) the
scratch registers are r2-r6 as
visible in what powerpc64-gcc
generates. The first attempted
_Unwind_SetGR during a throw is
for r3 and it fails.

I found the following:

/usr/src/contrib/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp:  // Functions
which call __builtin_unwind_init get all their registers saved.

More completely:

# grep -ir __builtin_unwind_init /usr/src/contrib/* | more
/usr/src/contrib/gcc/config/ia64/unwind-ia64.c:    __builtin_unwind_init();
\
/usr/src/contrib/gcc/unwind-dw2.c:      __builtin_unwind_init ();
\
/usr/src/contrib/llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp:  // Functions
which call __builtin_unwind_init get all their registers saved.
/usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp:  case
Builtin::BI__builtin_unwind_init: {
/usr/src/contrib/llvm/tools/clang/include/clang/Basic/Builtins.def:BUILTIN(__builtin_unwind_init,
"v", "")
/usr/src/contrib/llvm/include/llvm/IR/Intrinsics.td:// __builtin_unwind_init is
an undocumented GCC intrinsic that causes all
/usr/src/contrib/llvm/include/llvm/IR/Intrinsics.td:
GCCBuiltin<"__builtin_unwind_init">;

and quoting more of that last file:

// __builtin_unwind_init is an undocumented GCC intrinsic that causes all
// callee-saved registers to be saved and restored (regardless of whether they
// are used) in the calling function. It is used by libgcc_eh.
def int_eh_unwind_init: Intrinsic<[]>,
                       GCCBuiltin<"__builtin_unwind_init">;

That comment is incomplete by not mentioning the scratch
registers defined by the exception handling ABI being
saved and restored in at least some contexts [r2-r6
for powerpc64 and powerpc] --matching the odd behavior.

void TargetFrameLowering::determineCalleeSaves(MachineFunction &MF,
                                              BitVector &SavedRegs,
                                              RegScavenger *RS) const {
. . .
 // Functions which call __builtin_unwind_init get all their registers saved.
 bool CallsUnwindInit = MF.callsUnwindInit();
 const MachineRegisterInfo &MRI = MF.getRegInfo();
 for (unsigned i = 0; CSRegs[i]; ++i) {
   unsigned Reg = CSRegs[i];
   if (CallsUnwindInit || MRI.isPhysRegModified(Reg))
     SavedRegs.set(Reg);
 }
}

There is no powerpc64 or powerpc equivalent
to the gcc code that I found for that I found
for __builtin_eh_return. But there is mips code
that does something special (not necessarily to
what powerpc64 or powerpc should do):

/usr/src/contrib/llvm/lib/Target/Mips/MipsISelLowering.cpp

with:

// An EH_RETURN is the result of lowering llvm.eh.return which in turn is
// generated from __builtin_eh_return (offset, handler)
// The effect of this is to adjust the stack pointer by "offset"
// and then branch to "handler".
SDValue MipsTargetLowering::lowerEH_RETURN(SDValue Op, SelectionDAG &DAG)
                                                                    const {
 MachineFunction &MF = DAG.getMachineFunction();
 MipsFunctionInfo *MipsFI = MF.getInfo<MipsFunctionInfo>();

 MipsFI->setCallsEhReturn();
 SDValue Chain     = Op.getOperand(0);
 SDValue Offset    = Op.getOperand(1);
 SDValue Handler   = Op.getOperand(2);
 SDLoc DL(Op);
 EVT Ty = ABI.IsN64() ? MVT::i64 : MVT::i32;

 // Store stack offset in V1, store jump target in V0. Glue CopyToReg and
 // EH_RETURN nodes, so that instructions are emitted back-to-back.
 unsigned OffsetReg = ABI.IsN64() ? Mips::V1_64 : Mips::V1;
 unsigned AddrReg = ABI.IsN64() ? Mips::V0_64 : Mips::V0;
 Chain = DAG.getCopyToReg(Chain, DL, OffsetReg, Offset, SDValue());
 Chain = DAG.getCopyToReg(Chain, DL, AddrReg, Handler, Chain.getValue(1));
 return DAG.getNode(MipsISD::EH_RETURN, DL, MVT::Other, Chain,
                    DAG.getRegister(OffsetReg, Ty),
                    DAG.getRegister(AddrReg, getPointerTy(MF.getDataLayout())),
                    Chain.getValue(1));
}

Hexagon also has code, as does X86.

But nothing special for powerpc64 or powerpc.
As I understand what I've seen there should
be code tied to __builtin_eh_return use that
causes r2-r6 save/restore as well.
Quuxplusone commented 7 years ago
The following short, simple program is intended
to be inspected with dwarfdump and objddump
and the like, not to be run. It allows basic
inspecting for compatibility between the likes
of devel/powerpc64-gcc
(/usr/local/bin/powerpc64-unknown-freebsd12.0-g++)
based compiles and clang++ based compiles
for DW_CFA_<?> material for exception handling.

-O2 notes are about powerpc64-gcc's handling.

Being self contained does limit what it can show
for r2 handling in the DW_CFA_<?> sequences. It
does show r3-r6 scratch register handling. More
than this test case is needed overall.

# more normal_and_special_save_restore.cpp
void normal_save_restore()
{
}
void with_unwind_init_save_restore()
{
   __builtin_unwind_init();
}
void with_eh_return_save_restore()
{
    long offset = 0;
    void *handler = 0;

   __builtin_eh_return(offset,handler);
}
void with_unwind_init_and_eh_return_save_restore()
{
    long offset = 0;
    void *handler = 0;

   __builtin_unwind_init();
   __builtin_eh_return(offset,handler);
}

void use_normal_save_restore()
{
   normal_save_restore(); // -O2 inlined the behavior
}
void use_with_unwind_init_save_restore()
{
   with_unwind_init_save_restore(); // -O2 inlined the behavior
}
void use_with_eh_return_save_restore()
{
   with_eh_return_save_restore();
}
void use_with_unwind_init_and_eh_return_save_restore()
{
   with_unwind_init_and_eh_return_save_restore();
}

volatile int value;

int main(void)
{
   if (0==value) use_normal_save_restore();                         // -O2 inlined the behavior
   if (1==value) use_with_unwind_init_save_restore();               // -O2 inlined the behavior
   if (2==value) use_with_eh_return_save_restore();                 // -O2 actually calls with_eh_return_save_restore()
   if (3==value) use_with_unwind_init_and_eh_return_save_restore(); // -O2 actually calls with_unwind_init_and_eh_return_save_restore()
   return 0;
}

# /usr/local/bin/powerpc64-unknown-freebsd12.0-g++ -std=c++11 -g -O2 -c -o
normal_and_special_save_restore.gcc6 normal_and_special_save_restore.cpp

Note: dwarfdump -v -v -F normal_and_special_save_restore.gcc6
shows some storage for which no explicit store or
access was generated (or needed). This was for the
scratch registers.

For example: with_eh_return_save_restore lists:

        0 DW_CFA_advance_loc 16  (4 * 4)
        1 DW_CFA_offset r3 -32  (4 * -8)
        3 DW_CFA_offset r4 -24  (3 * -8)
        5 DW_CFA_offset r5 -16  (2 * -8)
        7 DW_CFA_offset r6 -8  (1 * -8)
        9 DW_CFA_advance_loc 8  (2 * 4)
       10 DW_CFA_offset_extended_sf r2 40  (-5 * -8)
       13 DW_CFA_nop
       14 DW_CFA_nop

but there are no explicit saves or restores
of r3-r6 in the code but space is reserved
on the stack for them.

Having arguments to the routines would cause
the save/restores to be in the code.

# clang++ -std=c++11 -g -O2 -o normal_and_special_save_restore.clang
normal_and_special_save_restore.cpp

dwarfdump -v -v -F normal_and_special_save_restore.clang
does not work: empty output. Trying just:

dwarfdump normal_and_special_save_restore.clang

shows .debug_frame information and more but
not .eh_frame output: it appears that clang
omitted it. This was also true with
-B /usr/local/powerpc64-freebsd/bin/ being
in use.
Quuxplusone commented 7 years ago

Summary - __builtin_eh_return is not implemented at all in LLVM for neither PPC nor PPC64.

I have a WIP patch at http://vlakno.cz/~rdivacky/ppc64.eh_return.patch if anyone wants to take a look and finish it.

Quuxplusone commented 5 years ago
I tried FreeBSD's clang from head -r344896 (so: 8.0.0.rc3) and
there still is nothing analogous to:

        83 DW_CFA_offset r3 -320  (40 * -8)
        85 DW_CFA_offset r4 -312  (39 * -8)
        87 DW_CFA_offset r5 -304  (38 * -8)
        89 DW_CFA_offset r6 -296  (37 * -8)

for the scratch registers Ir3-r6) in routines that use
__builtin_eh_return .

As such, this leads to segmentation faults when the
CFA material is interpreted and tries to put the scratch
registers to use.

(I had dwarfdump report for the output of a different
compiler to make the list above.)
Quuxplusone commented 5 years ago
To avoid confusion: my ".rc3" typos were doubly wrong:

1) the notation really has no "c"
2) the .r3 notation is from the devel/llvm80 port

but I was speaking of the system compiler (installed as
cc and c++). For -r344896 it reports:

# cc -v
FreeBSD clang version 8.0.0 (branches/release_80 355313) (based on LLVM 8.0.0)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /usr/bin

Notes:

Absent the scratch register DW_CFA_offset information,
I still do not build FreeBSD itself via clang. This is
because I want thrown exceptions to work.

I do build ports and such under FreeBSD via system-clang
when the port defaults to the system compiler. For some
ports I enable using the system compiler, basically other
compilers.

FreeBSD for powerpc64 is still officially not clang based.
My context is experimental.
Quuxplusone commented 4 years ago
FreeBSD head (13) is now llvm/clang 9.0.1 based and looking at
what it produces for powerpc64 shows that:

g++9 has r3-r6 scratch register information in the
with_unwind_init_save_restore and
with_unwind_init_and_eh_return_save_restore example
code, even when based on FreeBSD headers and libraries:

g++9 -std=c++17 -std=c++17 -Wno-psabi -nostdinc -nostdinc++ -
I/usr/include/c++/v1 -I/usr/include -nodefaultlibs -lc++ -lcxxrt -lthr -lm -lc -
lgcc_s -o normal_and_special_save_restore.gcc
normal_and_special_save_restore.cpp

leads to, for example, normal_and_special_save_restore.gcc
having:

<    5><0x10000c6c:0x10000d04><><cie offset 0x00000114::cie index     0><fde
offset 0x00000110 length: 0x00000030>
       <eh aug data len 0x0>
        0x10000c6c: <off cfa=00(r1) >
        0x10000c70: <off cfa=112(r1) >
        0x10000c84: <off cfa=112(r1) > <off r3=-40(cfa) > <off r4=-32(cfa) > <off r5=-24(cfa) > <off r6=-16(cfa) > <off r31=-8(cfa) >
        0x10000ca0: <off cfa=112(r31) > <off r2=24(cfa) > <off r3=-40(cfa) > <off r4=-32(cfa) > <off r5=-24(cfa) > <off r6=-16(cfa) > <off r31=-8(cfa) > <off r70=-64(cfa) > <off r71=-56(cfa) > <off r7
2=-48(cfa) >
        0x10000cc0: <off cfa=00(r1) > <off r2=24(cfa) > <off r3=-40(cfa) > <off r4=-32(cfa) > <off r5=-24(cfa) > <off r6=-16(cfa) > <off r31=-8(cfa) > <off r70=-64(cfa) > <off r71=-56(cfa) > <off r72=
-48(cfa) >
. . .

C++ (clang++) does not have any r3, r4, r5, or r6 (scratch
register) information in its dwarf information for the
example.

However, with the use of llvm's libunwind instead of the old
library code in head for powerpc64, the access(es) that expected
to find scratch register descriptions do not happen and there
is no SIGSEGV.

So, if there is some sort of problem here now, such as a clang
vs. gcc ABI mismatch that prevents using one of the compilers,
it is a different problem than was originally being reported
as far as FeeBSD is concerned. (The original technical detail
is still true but the library context changed.)

So a form of "overcome by events". As far as I can tell INVALID
is the best-fit for the status, so that is what I'm selecting.