llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.54k stars 11.8k forks source link

Clang miscompiles libgcc's unwind-dw2.c #8913

Open EdSchouten opened 13 years ago

EdSchouten commented 13 years ago
Bugzilla Link 8541
Version 5.0
OS FreeBSD
CC @asl,@davidchisnall,@DimitryAndric,@dschuff,@emaste,@markmi,@stoklund,@yuanfang-chen

Extended Description

FreeBSD HEAD is currently shipped with this version of clang:

$ clang --version FreeBSD clang version 2.8 (tags/RELEASE_28 115870) 20101007

In this thread, it is reported that OpenOffice crashes when libgcc_s.so.1 is built with this version of Clang:

http://lists.freebsd.org/pipermail/freebsd-current/2010-November/020843.html

It turns out that this is due to some kind of miscompilation of unwind-dw2.c. This is our version of unwind-dw2.c:

http://svn.freebsd.org/viewvc/base/head/contrib/gcc/unwind-dw2.c?view=markup

Although I realize this bug report may be too incomplete to be useful, but I'd rather stick it here in Bugzilla to prevent it from getting lost.

markmi commented 4 years ago

See the comments in llvm bugzilla 26844

See llvm bugzilla 26844's updated comments for a clang 5 based status and some related information.

[Note: FreeBSD head (13) now uses clang 9.0.1 and lld for powerpc64 and clang for 32-bit powerpc. This involves switching to llvm's libunwind.]

Bug 26844 has been closed as invalid for a libunwind context, which FreeBSD is switching to. (It is not trying to address if clang++ should support other unwind libraries.)

However, it is noted that the technical detail for the older unwind handling context still applies: clang does not write out cfi information describing the scratch registers r3-r6 on, say, powerpc64, but the older unwind library depends on such being there and SIGSEGV happens when the descriptions are not there and an exception is thrown.

Basically, clang++ parses, but ignores, the builtin that leads g++ to write out the scratch register descriptions, at least for powerpc64 and 32-bit powerpc. See bug 26844 for details. The lack of handling of the built-in was figured out fairly late in the comment sequence.

markmi commented 7 years ago

See llvm bugzilla 26844's updated comments for a clang 5 based status and some related information.

markmi commented 8 years ago

[I was interrupted but now I'm back and can complete the comments.]

For powerpc/powerpc64: beyond Bug 26856 there is also bug 26844: a report that the Itanium C++ exception ABI's 4 scratch registers are not set up (nor is the condition register save/restore) for _Unwind_RaiseException.

The scratch register issue leads to SEGV in:

678 / For targets with pointers smaller than the word size, we must extend the 679 pointer, and this extension is target dependent. / 680 _Unwind_SetGR (context, builtin_eh_return_data_regno (0), 681 builtin_extend_pointer (ue_header)); 682 _Unwind_SetGR (context, __builtin_eh_return_data_regno (1), 683 handler_switch_value); 684 _Unwind_SetIP (context, landing_pad);

I will not list the dwarfdump -v -v -F output extraction for the FreeBSD clang 3.8.0 based _Unwind_RaiseException here. It is in the original report.

There is also a report about __builtin_dwarf_cfa () identifying a different frame boundary than gcc/g++ does. But this gets outside the range of the prolog/epilog code that 8541 ended up focused on.

markmi commented 8 years ago

Bug 26856 "clang 3.8.0/powerpc/powerpc64's _Unwind_RaiseException code generation has messed up r31 (frame pointer) save/restore code (SEGV's can result)" is a report of problems in this area of what and how the save/restores work but for powerpc/powerpc64.

Using powerpc as an example of the "double store of different values" problem for r31 (the frame pointer):

Dump of assembler code for function _Unwind_RaiseException: 0x41b2ab80 <+0>: mflr r0 0x41b2ab84 <+4>: stw r31,-148(r1) 0x41b2ab88 <+8>: stw r30,-152(r1) 0x41b2ab8c <+12>: stw r0,4(r1) 0x41b2ab90 <+16>: stwu r1,-2992(r1) 0x41b2ab94 <+20>: mr r31,r1 . . . 0x41b2abe0 <+96>: stw r31,2844(r31) (which replaces the earlier save of the old Frame pointer R31 value with a copy of r1's current value. Note the offset relationships with the r1 adjustment: -2992+2844=-148) . . . 0x41b2add0 <+592>: lwz r31,2844(r31) (This restores the r1 value that resulted from the "stwu r1,-2992(r1)" into R31.) . . . 0x41b2ae30 <+688>: lwz r31,-148(r1) (This restores the r1 value that resulted from the "stwu r1,-2992(r1)" into R31.) . . .

The wrong r31 value is present when _Unwind_RaiseException returns.

But before that while _Unwind_RaiseException is active the C++ exception handling infrastructure has been given bad r31 information for around _Unwind_RaiseException's frame.

1ba3d143-a64b-4671-82b2-0b31cfb91709 commented 11 years ago

but it seems like in functions that may return via builtin_eh_return, some of the registers (e.g. eax/edx) must be restored in epilogues that actually do return via builtin_eh_return, but not in epilogues with normal returns.

That's exactly right.

dschuff commented 11 years ago

Check out the discussion in this recent thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174424.html

The remaining bit is more subtle and my understanding on it isn't quite 100% but it seems like in functions that may return via builtin_eh_return, some of the registers (e.g. eax/edx) must be restored in epilogues that actually do return via builtin_eh_return, but not in epilogues with normal returns.

DimitryAndric commented 11 years ago

Ah, I see what still goes wrong: the case of a normal return. For example, _Unwind_ForcedUnwind() ends like this:

[...] code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context); if (code != _URC_INSTALL_CONTEXT) return code;

uw_install_context (&this_context, &cur_context); }

where uw_install_context is a macro that calls __builtin_eh_return(). The resulting assembly after r183984 is:

    callq   _Unwind_ForcedUnwind_Phase2
    cmpl    $7, %eax
    je      .LBB17_2

BB#1: # %if.then

    addq    $488, %rsp              # imm = 0x1E8
    popq    %rax
    popq    %rdx
    popq    %rbx
    popq    %r12
    popq    %r13
    popq    %r14
    popq    %r15
    popq    %rbp
    ret

.LBB17_2: # %do.body1 leaq -296(%rbp), %rdi leaq -544(%rbp), %rsi callq uw_install_context_1 movq -392(%rbp), %rcx movq %rcx, 8(%rbp,%rax) leaq 8(%rbp,%rax), %rcx addq $488, %rsp # imm = 0x1E8 popq %rax popq %rdx popq %rbx popq %r12 popq %r13 popq %r14 popq %r15 popq %rbp movq %rcx, %rsp ret # eh_return, addr: %rcx

E.g. in the non-EH return case, the return value of _Unwind_ForcedUnwind_Phase2() is destroyed because %rax is popped.

In contrast, gcc produces:

[...] call _Unwind_ForcedUnwind_Phase2 cmpl $7, %eax je .L612 movq -40(%rbp), %rbx movq -32(%rbp), %r12 movq -24(%rbp), %r13 movq -16(%rbp), %r14 movq -8(%rbp), %r15 leave ret .L612: movq -584(%rbp), %rsi movq -576(%rbp), %rdi call uw_install_context_1 movq %rax, %rcx movq -392(%rbp), %rax movq %rax, 8(%rbp,%rcx) leaq 8(%rbp,%rcx), %rcx movq -56(%rbp), %rax movq -48(%rbp), %rdx movq -40(%rbp), %rbx movq -32(%rbp), %r12 movq -24(%rbp), %r13 movq -16(%rbp), %r14 movq -8(%rbp), %r15 movq 0(%rbp), %rbp movq %rcx, %rsp ret

E.g. it only pops rbx, and r12-r15 in the normal return case.

DimitryAndric commented 11 years ago

Ok, as far as I can determine, r183984 seems to work very nicely. For example, before applying this rev, libgcc's Unwind_Resume() is output on amd64 as:

_Unwind_Resume: # @​_Unwind_Resume .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset %rbp, -16 movq %rsp, %rbp .cfi_def_cfa_register %rbp pushq %r14 pushq %rbx pushq %rdx pushq %rax subq $480, %rsp # imm = 0x1E0 .cfi_offset %rax, -48 .cfi_offset %rdx, -40 .cfi_offset %rbx, -32 .cfi_offset %r14, -24 [...] addq $480, %rsp # imm = 0x1E0 popq %rax popq %rdx popq %rbx popq %r14 popq %rbp movq %rcx, %rsp ret # eh_return, addr: %rcx

after applying r183984:

_Unwind_Resume: # @​_Unwind_Resume .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset %rbp, -16 movq %rsp, %rbp .cfi_def_cfa_register %rbp pushq %r15 pushq %r14 pushq %r13 pushq %r12 pushq %rbx pushq %rdx pushq %rax subq $488, %rsp # imm = 0x1E8 .cfi_offset %rax, -72 .cfi_offset %rdx, -64 .cfi_offset %rbx, -56 .cfi_offset %r12, -48 .cfi_offset %r13, -40 .cfi_offset %r14, -32 .cfi_offset %r15, -24 [...]

    addq    $488, %rsp              # imm = 0x1E8
    popq    %rax
    popq    %rdx
    popq    %rbx
    popq    %r12
    popq    %r13
    popq    %r14
    popq    %r15
    popq    %rbp
    movq    %rcx, %rsp
    ret                             # eh_return, addr: %rcx

Similar result on i386 (where I compiled with -march=corei7, so less regular registers were used), original:

_Unwind_Resume: # @​_Unwind_Resume .cfi_startproc pushl %ebp .cfi_def_cfa_offset 8 .cfi_offset %ebp, -8 movl %esp, %ebp .cfi_def_cfa_register %ebp pushl %ebx pushl %edx pushl %eax andl $-16, %esp subl $288, %esp # imm = 0x120 .cfi_offset %eax, -20 .cfi_offset %edx, -16 .cfi_offset %ebx, -12 [...] leal -12(%ebp), %esp popl %eax popl %edx popl %ebx popl %ebp movl %ecx, %esp ret # eh_return, addr: %ecx

and after:

_Unwind_Resume: # @​_Unwind_Resume .cfi_startproc pushl %ebp .cfi_def_cfa_offset 8 .cfi_offset %ebp, -8 movl %esp, %ebp .cfi_def_cfa_register %ebp pushl %ebx pushl %edi pushl %esi pushl %edx pushl %eax andl $-16, %esp subl $288, %esp # imm = 0x120 .cfi_offset %eax, -28 .cfi_offset %edx, -24 .cfi_offset %esi, -20 .cfi_offset %edi, -16 .cfi_offset %ebx, -12 [...]

    leal    -20(%ebp), %esp
    popl    %eax
    popl    %edx
    popl    %esi
    popl    %edi
    popl    %ebx
    popl    %ebp
    movl    %ecx, %esp
    ret                             # eh_return, addr: %ecx

Derek, since you noted in your r183984 commit message "Goes part of the way toward fixing llvm/llvm-bugzilla-archive#8541 ", what other parts are now still needed?

DimitryAndric commented 11 years ago

If builtin_eh_return() is just supposed to initiate the return, and builtin_unwind_init() sets up the saving and restoring, as mentioned in comment 20, then this commit should be enough. I will try it out ASAP and report back the results!

dschuff commented 11 years ago

r183984 fixes PEI::calculateCalleeSavedRegisters to save/restore all registers in functions which call builtin_unwind_init(). I haven't looked at builtin_eh_return() though.

llvmbot commented 11 years ago

Bug llvm/llvm-bugzilla-archive#16298 has been marked as a duplicate of this bug.

llvmbot commented 11 years ago

If I understand this correctly, a function containing a builtin_eh_return intrinsic must save and restore all callee- saved registers, not just the ones that were clobbered in the function.

It's actually builtin_unwind_init() that, in GCC, is supposed to cause the function to save and restore all callee-saved registers. builtin_eh_return() by itself doesn't do that.

$ cat test.c void foo() { __builtin_unwind_init(); } $ gcc test.c -S -o - ... popq %rbx .cfi_offset 3, -56 .cfi_offset 12, -48 .cfi_offset 13, -40 .cfi_offset 14, -32 .cfi_offset 15, -24 popq %r12 popq %r13 popq %r14 popq %r15 popq %rbp .cfi_def_cfa 7, 8 ret ...

I just reported the same miscompilation of libgcc_eh in bug 16298.

asl commented 11 years ago

It sounds like PEI::calculateCalleeSavedRegisters should check MMI.callsEHReturn, and skip the F.getRegInfo().isPhysRegUsed(Reg) check in those functions. Yes, something like this, but there are a bit more details. I'm working on this now.

1ba3d143-a64b-4671-82b2-0b31cfb91709 commented 11 years ago

We should already be treating the eh_return intrinsics as return instructions in the backend, actually that is what caused the crash in pr14750.

If I understand this correctly, a function containing a builtin_eh_return intrinsic must save and restore all callee- saved registers, not just the ones that were clobbered in the function. It must also restore eax and edx, something Anton has already taken care of.

It sounds like PEI::calculateCalleeSavedRegisters should check MMI.callsEHReturn, and skip the F.getRegInfo().isPhysRegUsed(Reg) check in those functions.

asl commented 11 years ago

Wow, thanks for nice reduction, David! Then you definitely should blame me since I added this builtin at time of 2.1... Yes, probably we need to handle return-like builtins separately. This is already done for tail call returns, so should be handled for eh_return as well. I will take a look.

davidchisnall commented 11 years ago

On further investigation, it appears that the problem is that __builtin_eh_return is incorrectly compiled (not surprising, as it's a completely undocumented GCC builtin). This is expanded to the LLVM intrinsic llvm.eh.return, which does not do quite the right thing.

The issue appears to be that the call to this builtin is really a return, and so all clobbered callee-save registers must be restored before the call.

We have worked around this by adding this before the call:

asm volatile(" " : : : "r15", "r14", "r13", "r12", "rbx", "rdx", "rax");

This forces clang to restore these registers before the call, but it's an ugly hack. The correct solution would be for the back end to model @​llvm.eh.return as a return and insert the correct function epilog.

Speculation:

This builtin is used after a call to the function that installs the context, however the function that installs he context is clobbering the callee-save registers. As such, we must restore the callee-save registers even if we don't clobber them in any function where this is called.

GCC has some special-case logic that saves and restores all callee-save registers in the prolog and restores them in the epilog in any function where this builtin is used.

davidchisnall commented 11 years ago

This has been reported as still broken in LLVM 3.2. dim spent some time bisecting and the commit that made it break in between 3.1 and 3.2 was a fairly small change to one of the loop passes and doesn't look like it had anything to do with the actual problem, which makes me suspect that it's something (possibly in the unwind tables?) that is exposed by a particular code layout. Some other optimisation change likely hid it some time between 2.8 and 3.1. The other suggestion is that clang incorrectly implements one of the many undocumented builtins that this file uses, in such a way that does not break all of the time.

More information in this thread:

http://lists.freebsd.org/pipermail/freebsd-current/2012-December/038766.html

llvmbot commented 13 years ago

Reported as working with recent clang.

llvmbot commented 13 years ago

Just to inform that after upgrade FreeBSD (that has imported a new clang) the problem is gone, everything is working fine now. This bug can be closed.

llvmbot commented 13 years ago

can you provide the .i produced by clang and the "clang -cc1" command line? Also, can you try both with and without the integrated assembler?

llvmbot commented 13 years ago

Given the pain, I guess it is better to analyse the problem directly and not try to use llvm-gcc.

llvmbot commented 13 years ago

Now I fugured out we have a explicit --disable-shared inside llvm-gcc4 freebsd port when $ARCH is amd64, i changed it to --enable-shared to see what happens discover llvm-gcc doesn't build on freebsd amd64 with this param. I got a similar error:

mv 'libgcc/./tmp-libgcc.map' libgcc/./libgcc.map /home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/./gcc/xgcc -B/home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/./gcc/ -B/usr/local/x86_64-portbld-freebsd9.0/bin/ -B/usr/local/x86_64-portbld-freebsd9.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd9.0/include -isystem /usr/local/x86_64-portbld-freebsd9.0/sys-include -O2 -O2 -O2 -pipe -fPIC -fno-strict-aliasing -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -pthread -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -DGCC_FLOAT_NOT_NEEDED -shared -nodefaultlibs -Wl,--soname=libgcc_s.so.1 -Wl,--version-script=libgcc/./libgcc.map -o ./libgcc_s.so.1.tmp libgcc/./_muldi3_s.o libgcc/./_negdi2_s.o libgcc/./_lshrdi3_s.o libgcc/./_ashldi3_s.o libgcc/./_ashrdi3_s.o libgcc/./_cmpdi2_s.o libgcc/./_ucmpdi2_s.o libgcc/./_clear_cache_s.o libgcc/./_enable_execute_stack_s.o libgcc/./_trampoline_s.o libgcc/./main_s.o libgcc/./_absvsi2_s.o libgcc/./_absvdi2_s.o libgcc/./_addvsi3_s.o libgcc/./_addvdi3_s.o libgcc/./_subvsi3_s.o libgcc/./_subvdi3_s.o libgcc/./_mulvsi3_s.o libgcc/./_mulvdi3_s.o libgcc/./_negvsi2_s.o libgcc/./_negvdi2_s.o libgcc/./_ctors_s.o libgcc/./_ffssi2_s.o libgcc/./_ffsdi2_s.o libgcc/./_clz_s.o libgcc/./_clzsi2_s.o libgcc/./_clzdi2_s.o libgcc/./_ctzsi2_s.o libgcc/./_ctzdi2_s.o libgcc/./_popcount_tab_s.o libgcc/./_popcountsi2_s.o libgcc/./_popcountdi2_s.o libgcc/./_paritysi2_s.o libgcc/./_paritydi2_s.o libgcc/./_powisf2_s.o libgcc/./_powidf2_s.o libgcc/./_powixf2_s.o libgcc/./_powitf2_s.o libgcc/./_mulsc3_s.o libgcc/./_muldc3_s.o libgcc/./_mulxc3_s.o libgcc/./_multc3_s.o libgcc/./_divsc3_s.o libgcc/./_divdc3_s.o libgcc/./_divxc3_s.o libgcc/./_divtc3_s.o libgcc/./_bswapsi2_s.o libgcc/./_bswapdi2_s.o libgcc/./_fixunssfsi_s.o libgcc/./_fixunsdfsi_s.o libgcc/./_fixunsxfsi_s.o libgcc/./_fixsfdi_s.o libgcc/./_fixunssfdi_s.o libgcc/./_floatdisf_s.o libgcc/./_floatundisf_s.o libgcc/./_fixdfdi_s.o libgcc/./_fixunsdfdi_s.o libgcc/./_floatdidf_s.o libgcc/./_floatundidf_s.o libgcc/./_fixxfdi_s.o libgcc/./_fixunsxfdi_s.o libgcc/./_floatdixf_s.o libgcc/./_floatundixf_s.o libgcc/./_fixtfdi_s.o libgcc/./_fixunstfdi_s.o libgcc/./_floatditf_s.o libgcc/./_floatunditf_s.o libgcc/./_divdi3_s.o libgcc/./_moddi3_s.o libgcc/./_udivdi3_s.o libgcc/./_umoddi3_s.o libgcc/./_udiv_w_sdiv_s.o libgcc/./_udivmoddi4_s.o libgcc/./unwind-dw2_s.o libgcc/./unwind-dw2-fde_s.o libgcc/./unwind-sjlj_s.o libgcc/./gthr-gnat_s.o libgcc/./unwind-c_s.o -lc && rm -f ./libgcc_s.so && if [ -f ./libgcc_s.so.1 ]; then mv -f ./libgcc_s.so.1 ./libgcc_s.so.1.backup; else true; fi && mv ./libgcc_s.so.1.tmp ./libgcc_s.so.1 && ln -s libgcc_s.so.1 ./libgcc_s.so /usr/local/x86_64-portbld-freebsd9.0/bin/ld: /home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/./gcc/crtbeginS.o: relocation R_X86_64_32 against __cxa_finalize' can not be used when making a shared object; recompile with -fPIC /home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/./gcc/crtbeginS.o: could not read symbols: Bad value collect2: ld returned 1 exit status gmake[4]: *** [libgcc_s.so] Error 1 gmake[4]: Leaving directory/home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/gcc' gmake[3]: [libgcc.a] Error 2 gmake[3]: Leaving directory `/home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj/gcc' gmake[2]: [all-stage1-gcc] Error 2 gmake[2]: Leaving directory /home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj' gmake[1]: *** [stage1-bubble] Error 2 gmake[1]: Leaving directory/home/garga/prs/llvm-gcc4/work/llvm-gcc-4.2-2.8.source/obj' gmake: [all] Error 2 Error code 1

llvmbot commented 13 years ago

Maybe you need to configure with --enable-shared

llvmbot commented 13 years ago

I've added -fPIC to CFLAGS, GXXFLAGS, CPPFLAGS and rebuilt llvm-gcc but it still generate same crtbeginS.o and i got the same error trying to build libgcc. No progress here.

llvmbot commented 13 years ago

Ops, now i read carefully the message, i'm rebuilding llvm-gcc with -fPIC and will test it again. Sorry for the noise.

llvmbot commented 13 years ago

llvm-gcc built faster than I imagine, but i cannot build libgcc with it, i got this error:

building shared library libgcc_s.so.1 /usr/bin/ld: /usr/local/lib/llvm-gcc-2.8/gcc/x86_64-portbld-freebsd9.0/4.2.1/crtbeginS.o: relocation R_X86_64_32 can not be used when making a shared object; recompile with -fPIC /usr/local/lib/llvm-gcc-2.8/gcc/x86_64-portbld-freebsd9.0/4.2.1/crtbeginS.o: could not read symbols: Bad value

llvmbot commented 13 years ago

I'm now building llvm-gcc4, it'll take some time to build, i'll test and post results here tomorrow.

llvmbot commented 13 years ago

It's just a quick way of checking if the trouble is coming from the front-end (clang) or the backend (LLVM).

EdSchouten commented 13 years ago

We hardly use llvm-gcc at FreeBSD, but I'll see if I can persuade Renato to rebuild it.

llvmbot commented 13 years ago

Does llvm-gcc also miscompile it?

EdSchouten commented 13 years ago

It has been reported that it also breaks at -O0, thus likely ruling out optimization issues.

EdSchouten commented 13 years ago

assigned to @asl