llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.02k stars 11.57k forks source link

[clang] Codegen for noexcept scopes impairs program crash analysis #53062

Open serhiihuralniksc opened 2 years ago

serhiihuralniksc commented 2 years ago

Actual behavior Clang/LLVM emits such code for noexcept-labeled functions that makes it impossible to locate a throw statement that leads to program termination due to an attempt to propagate an exception through noexcept frame.

This issue is architecture-independent and related to platforms that follow Itanium EHABI (observed for Linux and Android).

At least GCC does not have this problem.

Details Below is a sample program:

// noexcept.cc 

void a() { throw "boom"; }
void b() { a(); }
void c() { b(); }
void d() { c(); }
void e() { d(); }
void f() { e(); }
void g() { f(); }
void h() noexcept { g(); }

int main() {
    h();
}

This obviously will terminate once exception reaches h().

Here is a crash backtrace for clang-compiled Linux amd64 binary (I'm using lldb but this is unwinder-independent):

$ clang++ -O0 noexcept.cc 
$ lldb a.out -o run -o "thread backtrace"
(lldb) target create "a.out"
Current executable set to '/home/shuralnik/a.out' (x86_64).
(lldb) run
terminate called after throwing an instance of 'char const*'
Process 24213 stopped
* thread #1, name = 'a.out', stop reason = signal SIGABRT
    frame #0: 0x00007ffff70e1fb7 libc.so.6`__GI_raise(sig=2) at raise.c:51

Process 24213 launched: '/home/shuralnik/a.out' (x86_64)
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff70e1fb7 libc.so.6`__GI_raise(sig=2) at raise.c:51
    frame #1: 0x00007ffff70e3921 libc.so.6`__GI_abort at abort.c:79
    frame #2: 0x00007ffff7ad6957 libstdc++.so.6`___lldb_unnamed_symbol24$$libstdc++.so.6 + 89
    frame #3: 0x00007ffff7adcae6 libstdc++.so.6`___lldb_unnamed_symbol287$$libstdc++.so.6 + 6
    frame #4: 0x00007ffff7adcb21 libstdc++.so.6`std::terminate() + 17
    frame #5: 0x00000000004007ef a.out`__clang_call_terminate + 15
    frame #6: 0x00000000004007c3 a.out`h() + 35
    frame #7: 0x00000000004007d9 a.out`main + 9
    frame #8: 0x00007ffff70c4bf7 libc.so.6`__libc_start_main(main=(a.out`main), argc=1, argv=0x00007fffffffdd78, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffdd68) at libc-start.c:310
    frame #9: 0x000000000040063a a.out`_start + 42
(lldb) ^D

App has terminated but the backtrace is useless for any further analysis because the call stack was partially unwound and we were taken away by the cxx runtime from the place where the culprit throw was executed. It is easy to imagine how this affects debuggability of a post-mortem dumps/traces for some complex C++ code that exposes C interface with noexcepts at the boundary to conform to the base platform ABI.

The reason for this lost backtrace is how Clang implements noexcept specifier. An implementation boils down to an implicit try-catch that encloses the whole function code:

try {
    // ...
    // All code in my function that may potentially throw
    // ...
} catch (...) {
    __clang_call_terminate(current_exception);
}

There is an interesting consequence for such implementation. Exception propagation goes in the next way:

So all is done according to the spec:

Non-throwing functions are permitted to call potentially-throwing functions. Whenever an exception is thrown and the search for a handler encounters the outermost block of a non-throwing function, the function std::terminate or std::unexpected (until C++17) is called:

But it appears that this is not the only possible correct implementation and even more - it looks suboptimal in all aspects (binary size, cpu time, debuggability). This is what GCC produces for the same sample:

$ g++ -O0 noexcept.cc 
$ lldb a.out -o run -o "thread backtrace"
(lldb) target create "a.out"
Current executable set to '/home/shuralnik/a.out' (x86_64).
(lldb) run
terminate called after throwing an instance of 'char const*'
Process 24435 stopped
* thread #1, name = 'a.out', stop reason = signal SIGABRT
    frame #0: 0x00007ffff7697fb7 libc.so.6`__GI_raise(sig=2) at raise.c:51

Process 24435 launched: '/home/shuralnik/a.out' (x86_64)
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff7697fb7 libc.so.6`__GI_raise(sig=2) at raise.c:51
    frame #1: 0x00007ffff7699921 libc.so.6`__GI_abort at abort.c:79
    frame #2: 0x00007ffff7ad6957 libstdc++.so.6`___lldb_unnamed_symbol24$$libstdc++.so.6 + 89
    frame #3: 0x00007ffff7adcae6 libstdc++.so.6`___lldb_unnamed_symbol287$$libstdc++.so.6 + 6
    frame #4: 0x00007ffff7adbb49 libstdc++.so.6`___lldb_unnamed_symbol275$$libstdc++.so.6 + 57
    frame #5: 0x00007ffff7adc4b8 libstdc++.so.6`__gxx_personality_v0 + 680
    frame #6: 0x00007ffff70b3573 libgcc_s.so.1`___lldb_unnamed_symbol46$$libgcc_s.so.1 + 83
    frame #7: 0x00007ffff70b3ad1 libgcc_s.so.1`_Unwind_RaiseException + 689
    frame #8: 0x00007ffff7adcd47 libstdc++.so.6`__cxa_throw + 55
    frame #9: 0x0000555555554796 a.out`a() + 44
    frame #10: 0x000055555555479f a.out`b() + 9
    frame #11: 0x00005555555547ab a.out`c() + 9
    frame #12: 0x00005555555547b7 a.out`d() + 9
    frame #13: 0x00005555555547c3 a.out`e() + 9
    frame #14: 0x00005555555547cf a.out`f() + 9
    frame #15: 0x00005555555547db a.out`g() + 9
    frame #16: 0x00005555555547e7 a.out`h() + 9
    frame #17: 0x00005555555547f3 a.out`main + 9
    frame #18: 0x00007ffff767abf7 libc.so.6`__libc_start_main(main=(a.out`main), argc=1, argv=0x00007fffffffdd78, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffdd68) at libc-start.c:310
    frame #19: 0x000055555555468a a.out`_start + 42
(lldb) ^D

Backtrace is preserved and program was terminated w/o partial unwind. They implement noexcept by emitting an empty LSDA for such functions. This leads runtime to terminate at the 1st phase. I believe this behavior is mandated by the ABI. At least libstdc++ and libc++ do so.

Expected behavior Is it possible for Clang to emit such code that will cause runtime to terminate during the 1st phase? Either by leveraging the same approach that GCC does or by using any other corner case that by the ABI should fail handler search before unwind even starts.

Extra Below are assembly listings that demonstrate implementation details more clearly:

// exc.cc
extern int bar(int a);

int foo(int a) noexcept {
    return bar(a);
}

Clang output. Termination landing pad and associated LSDA entry are present.

# $ clang -O2 -S -o - exc.cc
    .text
    .file   "exc.cc"
    .globl  _Z3fooi                 # -- Begin function _Z3fooi
    .p2align    4, 0x90
    .type   _Z3fooi,@function
_Z3fooi:                                # @_Z3fooi
.Lfunc_begin0:
    .cfi_startproc
    .cfi_personality 3, __gxx_personality_v0
    .cfi_lsda 3, .Lexception0
# %bb.0:
    pushq   %rax
    .cfi_def_cfa_offset 16
.Ltmp0:
    callq   _Z3bari
.Ltmp1:
# %bb.1:
    popq    %rcx
    .cfi_def_cfa_offset 8
    retq
.LBB0_2:
    .cfi_def_cfa_offset 16
.Ltmp2:
    movq    %rax, %rdi
    callq   __clang_call_terminate
.Lfunc_end0:
    .size   _Z3fooi, .Lfunc_end0-_Z3fooi
    .cfi_endproc
    .section    .gcc_except_table,"a",@progbits
    .p2align    2
GCC_except_table0:
.Lexception0:
    .byte   255                     # @LPStart Encoding = omit
    .byte   3                       # @TType Encoding = udata4
    .uleb128 .Lttbase0-.Lttbaseref0
.Lttbaseref0:
    .byte   1                       # Call site Encoding = uleb128
    .uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0:
    .uleb128 .Ltmp0-.Lfunc_begin0   # >> Call Site 1 <<
    .uleb128 .Ltmp1-.Ltmp0          #   Call between .Ltmp0 and .Ltmp1
    .uleb128 .Ltmp2-.Lfunc_begin0   #     jumps to .Ltmp2
    .byte   1                       #   On action: 1
.Lcst_end0:
    .byte   1                       # >> Action Record 1 <<
                                        #   Catch TypeInfo 1
    .byte   0                       #   No further actions
    .p2align    2
                                        # >> Catch TypeInfos <<
    .long   0                       # TypeInfo 1
.Lttbase0:
    .p2align    2
                                        # -- End function
    .section    .text.__clang_call_terminate,"axG",@progbits,__clang_call_terminate,comdat
    .hidden __clang_call_terminate  # -- Begin function __clang_call_terminate
    .weak   __clang_call_terminate
    .p2align    4, 0x90
    .type   __clang_call_terminate,@function
__clang_call_terminate:                 # @__clang_call_terminate
# %bb.0:
    pushq   %rax
    callq   __cxa_begin_catch
    callq   _ZSt9terminatev
.Lfunc_end1:
    .size   __clang_call_terminate, .Lfunc_end1-__clang_call_terminate
                                        # -- End function
    .ident  "clang version 10.0.0-4ubuntu1~18.04.2 "
    .section    ".note.GNU-stack","",@progbits
    .addrsig
    .addrsig_sym __gxx_personality_v0

GCC output. No pads, only LSDA with an empty call-site table.

# $ g++ -O2 -S -o - exc.cc
    .file   "exc.cc"
    .text
    .p2align 4,,15
    .globl  _Z3fooi
    .type   _Z3fooi, @function
_Z3fooi:
.LFB0:
    .cfi_startproc
    .cfi_personality 0x9b,DW.ref.__gxx_personality_v0
    .cfi_lsda 0x1b,.LLSDA0
    jmp _Z3bari@PLT
    .cfi_endproc
.LFE0:
    .globl  __gxx_personality_v0
    .section    .gcc_except_table,"a",@progbits
.LLSDA0:
    .byte   0xff
    .byte   0xff
    .byte   0x1
    .uleb128 .LLSDACSE0-.LLSDACSB0
.LLSDACSB0:
.LLSDACSE0:
    .text
    .size   _Z3fooi, .-_Z3fooi
    .hidden DW.ref.__gxx_personality_v0
    .weak   DW.ref.__gxx_personality_v0
    .section    .data.rel.local.DW.ref.__gxx_personality_v0,"awG",@progbits,DW.ref.__gxx_personality_v0,comdat
    .align 8
    .type   DW.ref.__gxx_personality_v0, @object
    .size   DW.ref.__gxx_personality_v0, 8
DW.ref.__gxx_personality_v0:
    .quad   __gxx_personality_v0
    .ident  "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
    .section    .note.GNU-stack,"",@progbits
hiraditya commented 2 years ago

Seems like there was a fix posted recently by @jyknight Does https://reviews.llvm.org/D113620 fix the issue for you?

serhiihuralniksc commented 2 years ago

@hiraditya thanks for a reply. Not really. That change looks just like an optimization to speed up program termination by the cost of ignoring cleanups for noexcept frame. It doesn't change overall try-catch-all-terminate approach for noexcept frames.

jyknight commented 2 years ago

I'm actively working on a proposal/implementation towards this. I expect to post a proposal to llvm-dev real-soon-now.

I'd note that GCC also doesn't reliably have the desired behavior, see https://gcc.gnu.org/PR55918

serhiihuralniksc commented 2 years ago

@jyknight having this fixed would be just amazing!

Do you mind to post a link here once you start discussion in llvm-dev?

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-codegen

ldionne commented 1 year ago

I think this is the discussion: https://discourse.llvm.org/t/rfc-add-call-unwindabort-to-llvm-ir/62543

serhiihuralniksc commented 4 months ago

@jyknight Seems like the work on unwindabort may take some time. I came to a sort of temporary workaround that may be interesting given that there seems to be a real demand to get better stacktraces on attempt to throw through noexcept frames.

The idea here is that we use another channel to let runtime lib to know that throw must be failed. Now it is encoded not via LSDA but via personality routine, Namely by using a thin wrapper on top of normal C++ personality routine that alters returns code to such one that terminates unwinding if otherwise it would try to unwind through this frame:

_Unwind_Reason_Code __gxx_personality_v0(<args>);

_Unwind_Reason_Code __clang_noexcept_wrapper_for___gxx_personality_v0(<args>) {
    auto urc = __gxx_personality_v0(<forwarded_args>);
    if (urc == _URC_CONTINUE_UNWIND) // Maybe also need some `action` test to ensure it is phase1,
                                           // at phase 2 this should never be called at all
        return _URC_FATAL_PHASE1_ERROR; // Or _URC_END_OF_STACK?
    return urc;
}

Personality wrapper is intended to be used instead of normal personality for all noexcept functions:


void bar() noexcept { // define void @_Z3barv() #0 personality ptr @__clang_noexcept_wrapper_for___gxx_personality_v0 {
    // ...
}

Cons - this at least will prohibit inlining of other C++ functions in such noexcept ones (if inlined one does have associated personality routine which is now different from what noexcept uses). Likely there are other issues that I may miss, not sure.

Pros: binary size needed to implement this is minimal and is amortized immediately by using a single instance of personality wrapper, which in turn can be emitted by using the same model as __clang_call_terminate, i.e. hidden linkonce with comdat. There is no dependency on LSDA format, plus noexcept behavior is implemented independently from normal EH structures.

Obviously it only works for Itanium ABI platforms but still may be interesting.

Intentionally posted it here instead of https://discourse.llvm.org/t/rfc-add-call-unwindabort-to-llvm-ir/62543 as it is irrelevant to unwindabort work.

Any thoughts?

serhiihuralniksc commented 3 weeks ago

@ldionne, following our brief offline discussion earlier this year about my last comment (appreciate the opportunity) - we've figured out that an approach with a special personality routine that implements noexcept semantics may often be non applicable for mach-o binaries. The reason is that Apple' compact unwind info format (which appears to be a linker' default) allows to have no more that 4 different personality routines per dynamic object (unwind instructions use 2-bit field to index global personality array). We've hit this limit in one of our iOS applications when adding a noexcept shim function with custom personality:

ld: too many personality routines for compact unwind to encode for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

And it is very likely that many more complex apps (that are written in multiple languages) would hit it as well. Currently we can not afford switching off compact unwind info and decided to use standard cxx personality for our noexcept shim function that now has manually crafted LSDA with an empty callsite table that causes desired phase I error.

So that my suggestion about a new noexcept personality is fully applicable to ELFs only where unwind info formats are way less restrictive to the number of personalities.