Clang 4.0.0's "Target: powerpc-unknown-freebsd11.0" code generation is violating the SVR4 ABI (SEGV can result)

Quuxplusone commented 8 years ago


Bugzilla Link	PR26519
Status	RESOLVED FIXED
Importance	P normal
Reported by	Mark Millard (marklmi26-fbsd@yahoo.com)
Reported on	2016-02-07 14:35:42 -0800
Last modified on	2017-11-12 11:42:57 -0800
Version	4.0
Hardware	Other FreeBSD
CC	brad@comstyle.com, chmeeedalf@gmail.com, emaste@freebsd.org, kparzysz@quicinc.com, llvm-bugs@lists.llvm.org, rdivacky@freebsd.org
Fixed by commit(s)
Attachments	`perl_numeric_powerpc.txt.zip` (65626 bytes, application/zip) `0001-PPC-Properly-update-register-save-area-offsets-release-40.patch` (2159 bytes, text/plain)
Blocks	PR25780
Blocked by
See also

Comparing clang 3.8.0 (via FreeBSD's projects/clang380-import svn) generated
code for TARGET_ARCH=powerpc (32-bit) to gcc 4.2.1 generated code. . .

clang 3.8.0 based Str_Match preamble (from make):

0x181a4a8 <Str_Match>:  mflr    r0
0x181a4ac <Str_Match+4>:    stw     r31,-4(r1) # Clang's frame pointer (r31)
                                                   # saved before stack pointer changed.
0x181a4b0 <Str_Match+8>:    stw     r0,4(r1)   # lr saved before stack pointer
                                                   # changed.
0x181a4b4 <Str_Match+12>:   stwu    r1,-32(r1) # Stack pointer finally saved and
                                                   # changed.
0x181a4b8 <Str_Match+16>:   mr      r31,r1     # r31 is the frame pointer under
                                                   # clang.
0x181a4bc <Str_Match+20>:   stw     r30,24(r31)

gcc 4.2.1 based Str_Match preamble:

0x1819cb8 <Str_Match>:  mflr    r0
0x1819cbc <Str_Match+4>:    stwu    r1,-32(r1) # Stack pointer saved and changed
                                                   # first.
0x1819cc0 <Str_Match+8>:    stw     r31,28(r1) # r31 saved after stack pointer
                                                   # changed.
0x1819cc4 <Str_Match+12>:   mr      r31,r3     # gcc 4.2.1 does not reserve
                                                   # r31 for use as a frame pointer.
0x1819cc8 <Str_Match+16>:   stw     r30,24(r1)
0x1819ccc <Str_Match+20>:   stw     r0,36(r1)  # lr saved after stack pointer
                                                   # changed.

Picking a different example for postamble code, showing just clang 3.8.0's code:

0x1801b8c <Buf_AddBytes+104>:   lwz     r30,24(r31)
0x1801b90 <Buf_AddBytes+108>:   lwz     r29,20(r31)
0x1801b94 <Buf_AddBytes+112>:   lwz     r28,16(r31)
0x1801b98 <Buf_AddBytes+116>:   lwz     r27,12(r31)
0x1801b9c <Buf_AddBytes+120>:   lwz     r26,8(r31)
0x1801ba0 <Buf_AddBytes+124>:   addi    r1,r1,32   # Stack pointer adjusted first
0x1801ba4 <Buf_AddBytes+128>:   lwz     r0,4(r1)
0x1801ba8 <Buf_AddBytes+132>:   lwz     r31,-4(r1) # Then Frame Pointer load happens
                                                   # "outside" the new stack range.
0x1801bac <Buf_AddBytes+136>:   mtlr    r0
0x1801bb0 <Buf_AddBytes+140>:   blr

In other words: clang 3.8.0's generated 32-bit powerpc code assumes that there
is a safe scratch area below the stack pointer ("below" by memory address),
similar to the 224 byte "red zone" area that 32-bit AIX powerpc and 32-bit
Darwin powerpc use.

I'm told by Nathan Whitehorn that "the 32-bit ELF ABI does not require any such
red zone", so clang 3.8.0 is violating the ABI that it is supposed to follow.

I do not have specific document or section references (or web links) to list
for the ABI details at this time. I'm just reporting what I'm told by FreeBSD
folks.

I used "make" code as the example above because something like "make -j 6
buildworld" uses signal delivery extensively (SIGCHLD), and such a build gets
a SEGV in a make process within the first few minutes (on a "Quad core G5
PowerMac" using a FreeBSD powerpc 32-bit installation). The signal delivery
sometimes replaces the value at "-4(r1)" in the above code before it is loaded
back into r31 (the frame pointer register in the powerpc code that clang 3.8.0
currently generates). The FreeBSD signal delivery for 32-bit powerpc does
not have/use a "red zone" on the smaller-address side of the stack.
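
The race described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not FreeBSD's signal code or LLVM's output: the addresses, the saved value, and every name in it (run_epilogue, deliver_signal, CLANG_380_ORDER, SAFE_ORDER) are hypothetical, and it assumes that signal delivery may overwrite any word below r1 when there is no red zone.

```python
# Toy model, not real FreeBSD or LLVM code. Assumption: with no red zone,
# signal delivery may overwrite any word below the stack pointer r1.
FRAME = 32            # frame size seen in the Buf_AddBytes listing (addi r1,r1,32)
CALLER_SP = 0x1000    # hypothetical value of r1 before the callee's prologue
SAVED_R31 = 0x1234    # hypothetical caller value of r31
GARBAGE = 0xDEADBEEF  # stands in for whatever the signal frame writes

CLANG_380_ORDER = ["adjust_sp", "load_r31"]  # addi r1,r1,32 then lwz r31,-4(r1)
SAFE_ORDER = ["load_r31", "adjust_sp"]       # restore r31 while the frame is still claimed


def deliver_signal(mem, r1):
    """Clobber every modeled word strictly below the stack pointer."""
    for addr in mem:
        if addr < r1:
            mem[addr] = GARBAGE


def run_epilogue(order, signal_before_step):
    """Run the epilogue steps in `order`, delivering one signal just before
    step index `signal_before_step`. Return the value restored into r31."""
    mem = {addr: 0 for addr in range(CALLER_SP - FRAME, CALLER_SP + 8, 4)}
    slot = CALLER_SP - 4  # where the prologue saved r31
    mem[slot] = SAVED_R31
    regs = {"r1": CALLER_SP - FRAME, "r31": None}
    for i, step in enumerate(order):
        if i == signal_before_step:
            deliver_signal(mem, regs["r1"])
        if step == "adjust_sp":
            regs["r1"] += FRAME
        elif step == "load_r31":
            regs["r31"] = mem[slot]
    return regs["r31"]
```

In this model run_epilogue(CLANG_380_ORDER, 1) returns GARBAGE, because the signal arrives after r1 has moved above the save slot but before r31 is reloaded. Every interruption point of SAFE_ORDER returns SAVED_R31.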

Quuxplusone commented 8 years ago

FYI, additional ABI issue possibility:

On FreeBSD 11.0-CURRENT (well, projects/clang380-import) when I try
"clang++ -std=c++11 -dM -E just_main.cpp" I get. . .

#define __BIGGEST_ALIGNMENT__ 8
. . .
#define __NATURAL_ALIGNMENT__ 1

on both TARGET_ARCH=powerpc and TARGET_ARCH=powerpc64 contexts.

But for the special port devel/powerpc64-gcc used for modern cross compiles of
FreeBSD ( /usr/local/bin/powerpc64-portbld-freebsd11.0-g++ -std=c++11 -dM -E
just_main.cpp ) I get:

#define __BIGGEST_ALIGNMENT__ 16

(Natural is not referenced. There is no such special port for 32-bit powerpc to
see its output.)

So there may be an ABI alignment mismatch someplace as well unless the ABI has
some optional-status alignment rules.

Without an ABI reference document I'm unsure which __BIGGEST_ALIGNMENT__ would
be correct for the FreeBSD powerpc ABI (if either one is). (Similarly for
powerpc64.)

Side notes:

powerpc64-portbld-freebsd11.0-g++ also reports:

#define __CMODEL_MEDIUM__ 1
. . .
#define __cpp_rvalue_reference 200610

But clang++ 3.8.0 reports nothing analogous to the first and:

#define __cpp_rvalue_references 200610

(gcc and clang have spelling differences here.)
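
Comparisons like the one above can be automated rather than done by eye. The following is a minimal sketch, assuming each compiler's "-dM -E" output has already been captured as text; parse_defines and diff_defines are hypothetical helper names, not part of any compiler tooling.

```python
def parse_defines(dump):
    """Map macro name -> replacement text for each '#define' line in a
    '-dM -E' dump. Function-like macros keep their parameter list in the key."""
    macros = {}
    for line in dump.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
    return macros


def diff_defines(left, right):
    """Compare two dumps. Returns (changed, only_left, only_right), where
    changed maps a macro name to its (left value, right value) pair."""
    a, b = parse_defines(left), parse_defines(right)
    changed = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
    only_left = sorted(a.keys() - b.keys())
    only_right = sorted(b.keys() - a.keys())
    return changed, only_left, only_right
```

Fed the lines quoted in this comment, it would report __BIGGEST_ALIGNMENT__ as changed (8 versus 16) and list the two rvalue-reference spellings as each present on one side only.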

Quuxplusone commented 8 years ago

"FreeBSD's" Justin Hibbits added a note in the FreeBSD bug entry (206990) that
I repeat here:

There is no provision in the ABI for a redzone in 32-bit powerpc.  LLVM is
broken for 32-bit PowerPC regarding this, and there are comments in the source
code to this regard, to the effect:

(PPCFrameLowering.cpp):
    // FIXME: On PPC32 SVR4, we must not spill before claiming the stackframe.

If a signal interrupts the thread at the precise wrong time (when creating the
stack frame, but before adjusting %r1), Bad Things will happen.
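
One rough way to spot such code in a disassembly is to look for plain loads and stores at a negative displacement from r1. The sketch below assumes GDB-style lines like those in the original report and only recognizes stw/lwz with an r1-relative operand. It does not track accesses made through r31 or any other register. The name below_sp_accesses is hypothetical.

```python
import re

# Matches e.g. "stw     r31,-4(r1)" or "lwz r31,-4(r1)". "stwu" is deliberately
# not matched: it stores and updates r1 in one instruction, which is how the
# frame is claimed.
ACCESS = re.compile(r"\b(stw|lwz)\s+(r\d+),(-?\d+)\(r1\)")


def below_sp_accesses(listing):
    """Return (mnemonic, register, offset) for each plain load/store that
    addresses memory below the current stack pointer r1."""
    hits = []
    for line in listing.splitlines():
        code = line.split("#", 1)[0]  # drop trailing comments
        match = ACCESS.search(code)
        if match and int(match.group(3)) < 0:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits
```

On the clang 3.8.0 Str_Match prologue from the report this flags the "stw r31,-4(r1)" store, while the gcc 4.2.1 prologue yields no hits.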

Quuxplusone commented 8 years ago

It appears that I originally forgot to set the Component for my submittal. Looking around, I estimate that LLVM Codegen is the closest category: PPCFrameLowering.cpp is not under tools/clang/ at all, so C++ would seem to be overly specific.

Quuxplusone commented 8 years ago

(In reply to comment #1)

Just a note on __cpp_rvalue_reference vs. __cpp_rvalue_references :

> Side notes:
>
> powerpc64-portbld-freebsd11.0-g++ also reports:
>
> #define __CMODEL_MEDIUM__ 1
> . . .
> #define __cpp_rvalue_reference 200610
>
> But clang++ 3.8.0 reports nothing analogous to the first and:
>
> #define __cpp_rvalue_references 200610
>
> (gcc and clang have spelling differences here.)

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00175.html says:

> as PR71214 points out gcc uses a wrong feature test macro for C++11
> rvalue references: __cpp_rvalue_reference instead of the correct
> __cpp_rvalue_references.

So this specific point is a gcc problem, but libraries that support what gcc
has done historically will now have 2 names to test.

Quuxplusone commented 8 years ago

Patch for review: https://reviews.llvm.org/D24093

Quuxplusone commented 8 years ago

Committed in r280705.

Quuxplusone commented 8 years ago

(In reply to comment #6)
> Committed in r280705.

Thanks Krzysztof.

Dimitry Andric (dim at FreeBSD.org) has written:

> I merged the upstream fix to projects/clang390-import:
>
> https://svnweb.freebsd.org/changeset/base/305686

So FreeBSD head (current) for 12 will be adopting your changes.

As for my activity:

I'll not have access to powerpc64s/powerpcs for a few weeks yet.

Quuxplusone commented 8 years ago

(In reply to comment #6)
> Committed in r280705.

Did the commit also fix the stack pointer adjustment timing for the post-amble
code?

The wording that I see in the review and commit talks about the "claim" side of
things, which I interpret to be for the pre-amble side of things. If I
interpret the code right that is the side fixed. (I'm not clang/llvm code
literate so I could easily be wrong.)

My original submittal also noted the timing problem existed on the post-amble
side in 3.8.0's code generation:

0x1801b8c <Buf_AddBytes+104>:   lwz     r30,24(r31)
0x1801b90 <Buf_AddBytes+108>:   lwz     r29,20(r31)
0x1801b94 <Buf_AddBytes+112>:   lwz     r28,16(r31)
0x1801b98 <Buf_AddBytes+116>:   lwz     r27,12(r31)
0x1801b9c <Buf_AddBytes+120>:   lwz     r26,8(r31)
0x1801ba0 <Buf_AddBytes+124>:   addi    r1,r1,32   # Stack pointer adjusted first
0x1801ba4 <Buf_AddBytes+128>:   lwz     r0,4(r1)
0x1801ba8 <Buf_AddBytes+132>:   lwz     r31,-4(r1) # Then Frame Pointer load happens
                                                   # "outside" the new stack range.
0x1801bac <Buf_AddBytes+136>:   mtlr    r0
0x1801bb0 <Buf_AddBytes+140>:   blr

If such code can still be generated, there is a time frame needing a red-zone
to protect stack contents.

Hopefully I'm just wrong and this was fixed too.

Quuxplusone commented 8 years ago

The post-amble has not been fixed.

Quuxplusone commented 8 years ago

The epilogue part of the fix: https://reviews.llvm.org/D24466

Hopefully there is nothing else missing.

Quuxplusone commented 8 years ago

Committed in r282174.