markmi commented 8 years ago


Bugzilla Link	26519
Resolution	FIXED
Resolved on	Nov 12, 2017 11:42
Version	4.0
OS	FreeBSD
Blocks	llvm/llvm-project#26154
CC	@chmeeedalf,@emaste,@kparzysz-quic

Extended Description

Comparing clang 3.8.0 (via FreeBSD's projects/clang380-import svn) generated code for TARGET_ARCH=powerpc (32-bit) to gcc 4.2.1 generated code. . .

clang 3.8.0 based Str_Match preamble (from make):

0x181a4a8 : mflr r0 0x181a4ac <Str_Match+4>: stw r31,-4(r1) # Clang's frame pointer (r31)

saved before stack pointer changed.

0x181a4b0 <Str_Match+8>: stw r0,4(r1) # lr saved before stack pointer changed. 0x181a4b4 <Str_Match+12>: stwu r1,-32(r1) # Stack pointer finally saved and

changed.

0x181a4b8 <Str_Match+16>: mr r31,r1 # r31 is the frame pointer under clang. 0x181a4bc <Str_Match+20>: stw r30,24(r31)

gcc 4.2.1 based Str_Match preamble:

0x1819cb8 : mflr r0 0x1819cbc <Str_Match+4>: stwu r1,-32(r1) # Stack pointer saved and changed first. 0x1819cc0 <Str_Match+8>: stw r31,28(r1) # r31 saved after stack pointer changed. 0x1819cc4 <Str_Match+12>: mr r31,r3 # gcc 4.2.1 does not reserve

r31 for use as a frame pointer.

0x1819cc8 <Str_Match+16>: stw r30,24(r1) 0x1819ccc <Str_Match+20>: stw r0,36(r1) # lr saved after stack pointer changed.

Picking a different example for postamble code, showing just clang 3.8.0's code:

0x1801b8c <Buf_AddBytes+104>: lwz r30,24(r31) 0x1801b90 <Buf_AddBytes+108>: lwz r29,20(r31) 0x1801b94 <Buf_AddBytes+112>: lwz r28,16(r31) 0x1801b98 <Buf_AddBytes+116>: lwz r27,12(r31) 0x1801b9c <Buf_AddBytes+120>: lwz r26,8(r31) 0x1801ba0 <Buf_AddBytes+124>: addi r1,r1,32 # Stack pointer adjusted first 0x1801ba4 <Buf_AddBytes+128>: lwz r0,4(r1) 0x1801ba8 <Buf_AddBytes+132>: lwz r31,-4(r1) # Then Frame Pointer load happens

"outside" the new stack range.

0x1801bac <Buf_AddBytes+136>: mtlr r0 0x1801bb0 <Buf_AddBytes+140>: blr

In other words: clang 3.8.0's generated 32-bit powerpc code is based on there being a safe scratch area below the stack ("below" by memory address). So similar to the 224 byte "red zone" area that 32-bit AIX powerpc and 32-bit Darwin powerpc use.

I'm told by Nathan Whithorn that "the 32-bit ELF ABI does not require any such red zone" and so that clang 3.8.0 is violating the ABI that is supposed to be involved.

I do not have specific document or section references (or web links) to list for the ABI details at this time. I'm just reporting what I'm told by FreeBSD folks.

I used "make" code as the example above because something like "make -j 6 buildworld" uses signal delivery extensively (SIGCHLD) and such a build its gets a SEGV in a make process within the 1st few minutes (on a "Quad core G5 PowerMac" using a FreeBSD powerpc 32-bit installation). The signal delivery is sometimes replacing the value at "-4(r1)" in the above code before it is loaded back into r31 (the clang 3.8.0 framepointer register for powerpc as it is currently generating code). The FreeBSD signal delivery for 32-bit powerpc does not have/use a "red zone" on the smaller-address side of the stack.

kparzysz-quic commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#33070

kparzysz-quic commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#32930

emaste commented 6 years ago

Closing per feedback from Mark.

markmi commented 6 years ago

There are a couple of different issues discussed in this PR so I'm not sure if they are all resolved at this point or not. Mark?

Overall I expect closing 226519 as fixed again is appropriate.

Supporting details:

With multiple problems at the same time making notes indicating what is working vs. what is not can be a mess: when one thing is fixed things can still fail overall.

The particular type of issue 26519 directly applies to is just code generation and its stack handling properties, not other things tied to C++ exceptions working (even though I note failures in thrown C++ exceptions).

I've not found any more evidence of the code generation mishandling the stack or ordering the code in a way that violates the ABI. My testing has been with FreeBSD's clang 5 in head for some time now.

However, at this point I've not investigated the /sbin/init failure that was noted at all to know if code generation was involved. If code generation for stack operations was involved then this could be re-opened if closed. (As has happened before for incomplete stack handling fixes.)

The kernel fails to build for powerpc via clang 5 so I do not get as far as testing /sbin/init based on clang 5 code generation.

emaste commented 6 years ago

There are a couple of different issues discussed in this PR so I'm not sure if they are all resolved at this point or not. Mark?

kparzysz-quic commented 7 years ago

The fix has been committed in master in r303257.

I opened llvm/llvm-bugzilla-archive#33070 to request merging it into 4.0.1.

kparzysz-quic commented 7 years ago

So it looks good so far for what the patch was intended to fix (32-bit powerpc context).

That is great news. Thank you for doing all this work.

markmi commented 7 years ago

I posted a patch for master in https://reviews.llvm.org/D33017, but it doesn't apply cleanly to release_40.

I'm attaching the version for 4.0. Could you try if it works?

I've done a buildworld based on the patch, installed it, booted it, and: built and installed /usr/ports/lang/perl5.24 with it. That includes the internal miniperl build that previous created a miniperl that crashed --and the use of the new miniperl in the later stages of building perl: no crashes and things seem to be operating.

So it looks good so far for what the patch was intended to fix (32-bit powerpc context).

markmi commented 7 years ago

I posted a patch for master in https://reviews.llvm.org/D33017, but it doesn't apply cleanly to release_40.

I'm attaching the version for 4.0. Could you try if it works?

I'll see about updating to such a clang and retrying the port builds with it, for example.

I'll note that Roman Divacky has me trying for clang 4:

svnlite diff /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp

Index: /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp

--- /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (revision 317820) +++ /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (working copy) @@ -1187,7 +1187,7 @@ // For SVR4, don't emit a move for the CR spill slot if we haven't // spilled CRs. if (isSVR4ABI && (PPC::CR2 <= Reg && Reg <= PPC::CR4)

&& !MustSaveCR)
&& (!MustSaveCR && isPPC64)) continue;

// For 64-bit SVR4 when we have spilled CRs, the spill location

for having the CFI information written out for condition code register use for powerpc (later code not shown). He is looking into some of the issues for C++ handling thrown exceptions.

kparzysz-quic commented 7 years ago

Patch for release_40

kparzysz-quic commented 7 years ago

I posted a patch for master in https://reviews.llvm.org/D33017, but it doesn't apply cleanly to release_40.

I'm attaching the version for 4.0. Could you try if it works?

markmi commented 7 years ago

Created attachment 18417 [details] .zip of preprocessed numeric.c from perl5 for comments 24 & 25

The Perl_cast_iv code ends up looking like:

. . . typedef long long IV; typedef unsigned long long UV; . . . typedef double NV; . . .

IV Perl_cast_iv(NV f) { if (f < (2.0 (1 + (((UV)((IV) ((~(UV)0) >> 1))) >> 1)))) return f < (-((IV) ((~(UV)0) >> 1)) - ((3 & -1) == 3)) ? (-((IV) ((~(UV)0) >> 1)) - ((3 & -1) == 3)) : (IV) f; if (f < (4.0 (1 + (((~(UV)0)) >> 2)))) {

return (IV)(UV) f;

} return f > 0 ? (IV)(~(UV)0) : 0 ; }

markmi commented 7 years ago

.zip of preprocessed numeric.c from perl5 for comments 24 & 25 The file in the .zip was produced with:

cd /usr/obj/portswork/usr/ports/lang/perl5.24/work/perl-5.24.1

cpp -c -DPERL_CORE -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_FORTIFY_SOURCE=2 -O2 -pipe -g -fno-strict-aliasing -Wall -Werror=declaration-after-statement -Wextra -Wc++-compat -Wwrite-strings -Wthread-safety -DPIC -fPIC numeric.c > ~/perl_numeric_powerpc.txt

The original compile had cc instead of cpp and did not have the I/O redirection: I edited the line shown in the log file to produce the above line. Luckily clang's cpp ignores irrelevant compiler options, making a correctly matching cpp command easy to produce.

You likely want to rename perl_numeric_powerpc.txt to compile it.

kparzysz-quic commented 7 years ago

Could you attach a preprocessed file?

markmi commented 7 years ago

Committed a fix in r302183.

Bad news: Another code generation error, this time demonstrated in compiling part of perl5. . . (I tried to build a port that indirectly tried build perl5 but perl5's miniperl crashes.)

/usr/obj/portswork/usr/ports/lang/perl5.24/work/perl-5.24.1/numeric.c

has Perl_cast_iv(NV f) for which clang double stores two different things to one address [24(r1)]. Below the => lines are the double store, the second destroying the r30 value that was saved in the first:

Dump of assembler code for function Perl_cast_iv: 0x0196a114 <+0>: mflr r0 0x0196a118 <+4>: stw r0,4(r1) 0x0196a11c <+8>: stwu r1,-32(r1) 0x0196a120 <+12>: stw r31,28(r1) => 0x0196a124 <+16>: stw r30,24(r1) 0x0196a128 <+20>: mr r31,r1 0x0196a12c <+24>: mfcr r12 => 0x0196a130 <+28>: stw r12,24(r31)

Note: r31 == r1 for that second "=>" line.

The return code sequence has a similar problem: two loads from the same address.

Note: r31 == r1 here too.

=> 0x0196a1bc <+168>: lwz r12,24(r31) 0x0196a1c0 <+172>: lwz r0,36(r1) 0x0196a1c4 <+176>: lwz r31,28(r1) => 0x0196a1c8 <+180>: lwz r30,24(r1) 0x0196a1cc <+184>: mtcrf 32,r12 0x0196a1d0 <+188>: addi r1,r1,32 0x0196a1d4 <+192>: mtlr r0 0x0196a1d8 <+196>: blr

The Perl_cast_iv source code looks like:

IV Perl_cast_iv(NV f) { if (f < IV_MAX_P1) return f < IV_MIN ? IV_MIN : (IV) f; if (f < UV_MAX_P1) {

if CASTFLAGS & 2

/* For future flexibility allowing for sizeof(UV) >= sizeof(IV)  */
if (f < UV_MAX_P1_HALF)
  return (IV)(UV) f;
f -= UV_MAX_P1_HALF;
return (IV)(((UV) f) | (1 + (UV_MAX >> 1)));

else

return (IV)(UV) f;

endif

} return f > 0 ? (IV)UV_MAX : 0 / NaN /; }

markmi commented 7 years ago

Committed a fix in r302183.

buildworld buildkernel makes extensive use of signals and its failure is how I discovered the original stack handling problems. I used to have to patch in so-called "red zone" handling to avoid the issue.

No more: a running a kernel that was built without a "red zone" and running a world based on clang now allows buildworld buildkernel to complete just fine.

Not needing a "red zone" is a great improvement.

markmi commented 7 years ago

Committed a fix in r302183.

Thanks. FreeBSD's head merged it in as -r317810 and I've built and booted -r317820 based on:

A) buildworld built via the updated system-clang 4 B) buildkernel built via gcc 4.2.1

It appears for (A) that even C++ exceptions work to some extent (possibly fully) and various previously broken commands now work (no stack problem).

I got testing clang vs. gcc 4.2.1 confused: only the gcc 4.2.1 buildworld ends up with a c++ compiler where handling thrown C++ exceptions works. When clang++ is used handling thrown C++ exceptions fails in the program that results, like before.

(powerpc64 has the same sort of C++ exception handling issues.)

But the stack handling fix is still a great improvement for powerpc: thanks again.

markmi commented 7 years ago

Committed a fix in r302183.

Thanks. FreeBSD's head merged it in as -r317810 and I've built and booted -r317820 based on:

A) buildworld built via the updated system-clang 4 B) buildkernel built via gcc 4.2.1

It appears for (A) that even C++ exceptions work to some extent (possibly fully) and various previously broken commands now work (no stack problem).

Thanks much. Big improvement.

stable/11 will probably merge the change in a few days.

Booting a kernel from a system-clang 4 based buildkernel fails with "exec /sbin/init: error 13". This is not new. But it does get to the point of attempting /sbin/init .

(Note: powerpc64 could use the 2 line patch listed in llvm bugzilla 31716's Comment 1 to get to a somewhat similar state of usability on FreeBSD. As stands I use a personal patch for the issue but that is just me.)

kparzysz-quic commented 7 years ago

Opened llvm/llvm-bugzilla-archive#32930 to request merging this into 4.0.1.

kparzysz-quic commented 7 years ago

Committed a fix in r302183.

kparzysz-quic commented 7 years ago

I take it back. I just noticed that it's set to GOT at the entry to the function, not before it.

kparzysz-quic commented 7 years ago

Here's what I tried:

--- float-r30.c --- double __floatdidf(long long int a) { static const double twop52 = 4503599627370496.0; static const double twop32 = 4294967296.0;

union { long long int x; double d; } low = { .d = twop52 };

const double high = (int)(a >> 32) * twop32;
low.x |= a & 0x00000000ffffffffL;

const double result = (high - twop52) + low.d;
return result;

}

clang -target powerpc -S -O2 float-r30.c -fpic

Output:

--- float-r30.s --- .text .file "float-r30.c" .section .rodata.cst4,"aM",@progbits,4 .p2align 2 .LCPI0_0: .long 1501560836 # float 4.50360177E+15 .LCPI0_1: .long 1333788672 # float 4.2949673E+9 .LCPI0_2: .long 3649044480 # float -4.50359963E+15 .text .globl floatdidf .p2align 2 .type floatdidf,@function floatdidf: # @floatdidf .Lfunc_begin0:

BB#0: # %entry

    mflr 0
    stw 0, 4(1)
    stwu 1, -32(1)
    stw 31, 28(1)
    stw 30, 24(1)
    bl _GLOBAL_OFFSET_TABLE_@local-4
    mr 31, 1
    lis 5, 17200
    xoris 3, 3, 32768
    mflr 30
    stw 5, 8(31)
    stw 3, 12(31)
    lwz 3, .LCPI0_1@GOT(30)
    lwz 6, .LCPI0_0@GOT(30)
    lfd 1, 8(31)
    stw 4, 20(31)
    stw 5, 16(31)
    lfd 13, 16(31)
    lwz 0, 36(1)
    lwz 31, 28(1)
    lwz 30, 24(1)
    lfs 2, 0(3)
    lwz 3, .LCPI0_2@GOT(30)
    lfs 0, 0(6)
    lfs 12, 0(3)
    fsub 0, 1, 0
    fmul 0, 0, 2
    fadd 0, 0, 12
    fadd 1, 0, 13
    addi 1, 1, 32
    mtlr 0
    blr

.Lfunc_end0: .size __floatdidf, .Lfunc_end0-.Lfunc_begin0

    .ident  "clang version 5.0.0 (http://llvm.org/git/clang.git fcb52251e4b32f8f4ab0a4d434c3f9c5a5ffe008) (http://llvm.org/git/llvm.git 624e695cd90a7b570df26159f225f79fc8a86976)"
    .section        ".note.GNU-stack","",@progbits

The r30 is used as a PIC base pointer, and it is restored before it's used. This is actually correct.

markmi commented 7 years ago

FYI: FreeBSD has merged clang/llvm 4.0 into stable/11 (-r316423) in addition to head (12) some time ago. release/11.1.0 will be clang/llvm 4.0 based.

TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc still does not handle thrown exceptions in this clang version.

TARGET_ARCH=powerpc still has the R30 vs. tack handling vs. floating point code problem: restored R30 and then generates floating point cleanup code that expects R30 to not have changed yet.

(That last is why this was reopened, although at the time the clang was an earlier version.)

markmi commented 7 years ago

My interpretation of this is that the "lwz r30,24(r1)" is simply out of sequence and should be later in the sequence:

0x0181b01c <floatdidf+80>: lwz r30,24(r1) <<<=== not here 0x0181b020 <floatdidf+84>: lfs f2,0(r3) 0x0181b024 <floatdidf+88>: lwz r3,-12(r30) 0x0181b028 <floatdidf+92>: lfs f0,0(r6) 0x0181b02c <floatdidf+96>: lfs f12,0(r3) 0x0181b030 <floatdidf+100>: fsub f0,f1,f0 0x0181b034 <floatdidf+104>: fmul f0,f0,f2 0x0181b038 <floatdidf+108>: fadd f0,f0,f12 0x0181b03c <floatdidf+112>: fadd f1,f13,f0 lwz r30,24(r1) <<<<<<<<<<=============== instead here 0x0181b040 <floatdidf+116>: addi r1,r1,32 0x0181b044 <floatdidf+120>: mtlr r0 0x0181b048 <floatdidf+124>: blr

(addresses not adjusted for the change in the above)

So that r30 would be restored between the main activity being finished and r1 being adjusted (popping the stack frame). Of course any place after the last local r30 use [after -12(r30)] up to where I show it would also work.

I reported this issue as a comment on 26519 because I'm guessing that the ordering problem is tied to trying to trying to avoid the stack-handling ABI violations for FreeBSD: another incompleteness in the coverage of the stack handling. If it is not that then a separate submittal would likely be appropriate.

Since I'm allowed to I'm going to change the status to reopened so that it does not look to have been fully fixed. Let me know if this is inappropriate since I would not be the one fixing it.

markmi commented 7 years ago

For reference this is now in FreeBSD as https://svnweb.freebsd.org/changeset/base/309147

Unfortunately there are problems handling use of r30, at least when floating point is involved.

The used code from the compliler_rt __floatdidf seems to be just wrong in part of its use of r30. The below initially uses the "df -m" example failure and its code.

At first is saves r30 on the stack as part of the function preamble:

Dump of assembler code for function floatdidf: 0x018029b8 <floatdidf+0>: mflr r0 0x018029bc <floatdidf+4>: stw r0,4(r1) 0x018029c0 <floatdidf+8>: stwu r1,-32(r1) 0x018029c4 <floatdidf+12>: stw r31,28(r1) 0x018029c8 <floatdidf+16>: stw r30,24(r1)

Later it uses r30 for other things.

Then it restores it from the stack:

0x01802a08 <__floatdidf+80>: lwz r30,24(r1)

What it does with r30 next effectively makes r30 a parameter to the routine and not just saved/restored:

0x01802a10 <__floatdidf+88>: lwz r3,-8(r30)

In the above the r30 from the caller's context is being used as the base for accessing memory and putting that memory content in r3.

What it does next with r3 is one of the points where SIGSEGV is happening: For "df -m" r3 ends up being 0 and the following fails.

0x01802a18 <__floatdidf+96>: lfs f12,0(r3)

In the fsck_ufs example it is the same sort of thing but r30 happens to be initially 0 so it fails at an earlier stage.

Again there is the preamble that saves r30:

0x0181afcc <floatdidf+0>: mflr r0 0x0181afd0 <floatdidf+4>: stw r0,4(r1) 0x0181afd4 <floatdidf+8>: stwu r1,-32(r1) 0x0181afd8 <floatdidf+12>: stw r31,28(r1) 0x0181afdc <__floatdidf+16>: stw r30,24(r1)

Later it uses r30 for other things.

Then it restores it from the stack:

0x0181b01c <__floatdidf+80>: lwz r30,24(r1)

What it does with r30 next effectively makes r30 a parameter to the routine and not just saved/restored:

0x0181b024 <__floatdidf+88>: lwz r3,-12(r30)

Again: In the above the r30 from the caller's context is being used as the base for accessing memory and putting that memory content in r3.

But this time it turns out that r30 is 0 and the above also fails.

If the code had gotten that far it would have done the same thing with r3 that "df -m" did in its failure:

0x0181b02c <__floatdidf+96>: lfs f12,0(r3)

For reference compiler-rt/lib/builtins/floatdidf.c has:

COMPILER_RT_ABI double __floatdidf(di_int a) { static const double twop52 = 4503599627370496.0; // 0x1.0p52 static const double twop32 = 4294967296.0; // 0x1.0p32

union { int64_t x; double d; } low = { .d = twop52 };

const double high = (int32_t)(a >> 32) * twop32;
low.x |= a & INT64_C(0x00000000ffffffff);

const double result = (high - twop52) + low.d;
return result;

}

and the matching assembler code for the fsck_ufs example is (from gdb):

0x0181afcc <floatdidf+0>: mflr r0 0x0181afd0 <floatdidf+4>: stw r0,4(r1) 0x0181afd4 <floatdidf+8>: stwu r1,-32(r1) 0x0181afd8 <floatdidf+12>: stw r31,28(r1) 0x0181afdc <floatdidf+16>: stw r30,24(r1) 0x0181afe0 <floatdidf+20>: bl 0x182e96c <.got+20> 0x0181afe4 <floatdidf+24>: mr r31,r1 0x0181afe8 <floatdidf+28>: xoris r3,r3,32768 0x0181afec <floatdidf+32>: lis r5,17200 0x0181aff0 <floatdidf+36>: mflr r30 0x0181aff4 <floatdidf+40>: stw r3,12(r31) 0x0181aff8 <floatdidf+44>: stw r5,8(r31) 0x0181affc <floatdidf+48>: lwz r3,-16(r30) 0x0181b000 <floatdidf+52>: lwz r6,-20(r30) 0x0181b004 <floatdidf+56>: lfd f1,8(r31) 0x0181b008 <floatdidf+60>: stw r4,20(r31) 0x0181b00c <floatdidf+64>: stw r5,16(r31) 0x0181b010 <floatdidf+68>: lfd f13,16(r31) 0x0181b014 <floatdidf+72>: lwz r0,36(r1) 0x0181b018 <floatdidf+76>: lwz r31,28(r1) 0x0181b01c <floatdidf+80>: lwz r30,24(r1) 0x0181b020 <floatdidf+84>: lfs f2,0(r3) 0x0181b024 <floatdidf+88>: lwz r3,-12(r30) 0x0181b028 <floatdidf+92>: lfs f0,0(r6) 0x0181b02c <floatdidf+96>: lfs f12,0(r3) 0x0181b030 <floatdidf+100>: fsub f0,f1,f0 0x0181b034 <floatdidf+104>: fmul f0,f0,f2 0x0181b038 <floatdidf+108>: fadd f0,f0,f12 0x0181b03c <floatdidf+112>: fadd f1,f13,f0 0x0181b040 <floatdidf+116>: addi r1,r1,32 0x0181b044 <floatdidf+120>: mtlr r0 0x0181b048 <floatdidf+124>: blr

emaste commented 7 years ago

For reference this is now in FreeBSD as https://svnweb.freebsd.org/changeset/base/309147

markmi commented 7 years ago

Committed in r282174.

projects/clang390-import has been merged to head for FreeBSD --and is now scheduled for an MFC from there into stable/11 in a month. It does not yet have this fix.

But there is a chance for later 11.x-RELEASE's to support powerpc64 and/or powerpc via clang, not just 12.0-RELEASE. (I'm not implying in a month specifically, just that stable/11 might progress as llvm does as long as the fixes backport into FreeBSD's 3.9.0 reasonably --and that 11.x would pick up such changes from stable/11 .)

I'm still hoping that llvm 26519's and 26970's fixes will be applied to clang 3.9.0 in FreeBSD's head soon. That would enable some significant testing. I could try buildworld for powerpc64 and I could remove my "red zone" signal delivery hack for powerpc to see if it is no longer needed [lack of ABI violations].

[I want to be able to have good runs/results of the kyua tests before I try to deal with testing buildkernel. This requires signal handling and C++ exception handling to be working well even if the kernel does not require all of that.]

kparzysz-quic commented 8 years ago

Committed in r282174.

kparzysz-quic commented 8 years ago

The epilogue part of the fix: https://reviews.llvm.org/D24466

Hopefully there is nothing else missing.

kparzysz-quic commented 8 years ago

The post-amble has not been fixed.

markmi commented 8 years ago

Committed in r280705.

Did the commit also fix the stack pointer adjustment timing for the post-amble code?

The wording that I see in the review and commit talks about the "claim" side of things, which I interpret to be for the pre-amble side of things. If I interpret the code right that is the side fixed. (I'm not clang/llvm code literate so I could easily be wrong.)

My original submittal also noted the timing problem existed on the post-amble side in 3.8.0's code generation:

0x1801b8c <Buf_AddBytes+104>: lwz r30,24(r31) 0x1801b90 <Buf_AddBytes+108>: lwz r29,20(r31) 0x1801b94 <Buf_AddBytes+112>: lwz r28,16(r31) 0x1801b98 <Buf_AddBytes+116>: lwz r27,12(r31) 0x1801b9c <Buf_AddBytes+120>: lwz r26,8(r31) 0x1801ba0 <Buf_AddBytes+124>: addi r1,r1,32 # Stack pointer adjusted first 0x1801ba4 <Buf_AddBytes+128>: lwz r0,4(r1) 0x1801ba8 <Buf_AddBytes+132>: lwz r31,-4(r1) # Then Frame Pointer load happens

"outside" the new stack range.

0x1801bac <Buf_AddBytes+136>: mtlr r0 0x1801bb0 <Buf_AddBytes+140>: blr

If such code can still be generated there is is a time frame needing a red-zone to protect stack contents.

Hopefully I'm just wrong and this was fixed too.

markmi commented 8 years ago

Committed in r280705.

Thanks Krzysztof.

Dimitry Andric (dim at FreeBSD.org) has written:

I merged the upstream fix to projects/clang390-import:

https://svnweb.freebsd.org/changeset/base/305686

So FreeBSD head (current) for 12 will be adopting your changes.

As for my activity:

I'll not have access to powerpc64s/powerpcs for a few weeks yet.

kparzysz-quic commented 8 years ago

Committed in r280705.

kparzysz-quic commented 8 years ago

Patch for review: https://reviews.llvm.org/D24093

markmi commented 8 years ago

Just a note on cpp_rvalue_reference vs. cpp_rvalue_references :

Side notes:

powerpc64-portbld-freebsd11.0-g++ also reports:

define __CMODEL_MEDIUM__ 1

. . .

define __cpp_rvalue_reference 200610

But clang++ 3.8.0 reports nothing analogous to the first and:

define __cpp_rvalue_references 200610

(gcc and clang have spelling differences here.)

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00175.html says:

as #71214 points out gcc uses a wrong feature test macro for C++11 rvalue references: cpp_rvalue_reference instead of the correct cpp_rvalue_references.

So this specific point is a gcc problem but libraries targeting supporting what gcc has done historically will now have 2 names to test the value of.

markmi commented 8 years ago

It appears that I originally forgot to set the Component for my submittal. Looking around I estimate that LLVM Codegen is the closest category: PPCFrameLowering.cpp is not under tools/clang/ at all so C++ would seem to be over specific.

markmi commented 8 years ago

"FreeBSD's" Justin Hibbits added a note in the FreeBSD bug entry (206990) that I repeat here:

There is no provision in the ABI for a redzone in 32-bit powerpc. LLVM is broken for 32-bit PowerPC regarding this, and there are comments in the source code to this regard, to the effect:

(PPCFrameLowering.cpp): // FIXME: On PPC32 SVR4, we must not spill before claiming the stackframe.

If a signal interrupts the thread at the precise wrong time (when creating the stack frame, but before adjusting %r1), Bad Things will happen.

markmi commented 8 years ago

FYI, additional ABI issue possibility:

On FreeBSD 11.0-CURRENT (well, projects/clang380-import) when I try "clang++ -std=c++11 -dM -E just_main.cpp" I get. . .

define __BIGGEST_ALIGNMENT__ 8

. . .

define __NATURAL_ALIGNMENT__ 1

on both TARGET_ARCH=powerpc and TARGET_ARCH=powerpc64 contexts.

But for the special port devel/powerpc64-gcc used for modern cross compiles of FreeBSD ( /usr/local/bin/powerpc64-portbld-freebsd11.0-g++ -std=c++11 -dM -E just_main.cpp ) I get:

define __BIGGEST_ALIGNMENT__ 16

(Natural is not referenced. There is no such special port for 32-bit powerpc to see its output.)

So there may be an ABI alignment mismatch someplace as well unless the ABI has some optional-status alignment rules.

Without a ABI reference document I'm unsure which __BIGGEST_ALIGNMENT__ would be correct for the FreeBSD powerpc ABI (if either one is). (Similarly for powerpc64.)

Side notes:

powerpc64-portbld-freebsd11.0-g++ also reports:

define __CMODEL_MEDIUM__ 1

. . .

define __cpp_rvalue_reference 200610

But clang++ 3.8.0 reports nothing analogous to the first and:

define __cpp_rvalue_references 200610

(gcc and clang have spelling differences here.)

llvm / llvm-project

Clang 4.0.0's "Target: powerpc-unknown-freebsd11.0" code generation is violating the SVR4 ABI (SEGV can result) #26893

Extended Description

saved before stack pointer changed.

changed.

r31 for use as a frame pointer.

"outside" the new stack range.

svnlite diff /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp

Index: /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp

if CASTFLAGS & 2

else

endif

}

BB#0: # %entry

"outside" the new stack range.

define __CMODEL_MEDIUM__ 1

define __cpp_rvalue_reference 200610

define __cpp_rvalue_references 200610

define __BIGGEST_ALIGNMENT__ 8

define __NATURAL_ALIGNMENT__ 1

define __BIGGEST_ALIGNMENT__ 16

define __CMODEL_MEDIUM__ 1

define __cpp_rvalue_reference 200610

define __cpp_rvalue_references 200610