llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.37k stars 12.14k forks source link

Inliner incorrectly combines cleanup and catch landing pads #14488

Open davidchisnall opened 12 years ago

davidchisnall commented 12 years ago
Bugzilla Link 14116
Version trunk
OS All
Blocks llvm/llvm-project#14265
CC @rjmccall

Extended Description

The LLVM inliner combines catch and cleanups into a single landing pad and then merges their tails. This is problematic, because some personality functions (e.g. GNU Objective-C) return different values in exception register 0 depending on whether the landing pad is a catch or a cleanup. Consider the following example:

define i32 @​x() uwtable { entry: %retval = alloca i32, align 4 %x = alloca i32, align 4 %exn.slot = alloca i8 %ehselector.slot = alloca i32 invoke void @​objc_exception_throw(i8 null) to label %invoke.cont unwind label %lpad

invoke.cont: ; preds = %entry unreachable

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup %1 = extractvalue { i8, i32 } %0, 0 store i8 %1, i8* %exn.slot %2 = extractvalue { i8, i32 } %0, 1 store i32 %2, i32 %ehselector.slot %3 = bitcast i32 %x to i8 invoke void @​foo(i8 %3) to label %invoke.cont1 unwind label %terminate.lpad

invoke.cont1: ; preds = %lpad br label %eh.resume

return: ; No predecessors! %4 = load i32* %retval ret i32 %4

eh.resume: ; preds = %invoke.cont1 %exn = load i8* %exn.slot %sel = load i32 %ehselector.slot %lpad.val = insertvalue { i8, i32 } undef, i8 %exn, 0 %lpad.val2 = insertvalue { i8, i32 } %lpad.val, i32 %sel, 1 resume { i8, i32 } %lpad.val2

terminate.lpad: ; preds = %lpad %5 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8* null call void @​abort() noreturn nounwind unreachable }

declare void @​foo(i8*)

declare void @​objc_exception_throw(i8*)

declare i32 @​__gnu_objc_personality_v0(...)

declare void @​abort()

define i32 @​y() uwtable { entry: %finally.exn = alloca i8 %finally.for-eh = alloca i1 %exn.slot = alloca i8 %ehselector.slot = alloca i32 %cleanup.dest.slot = alloca i32 store i1 false, i1* %finally.for-eh %call = invoke i32 @​x() to label %invoke.cont unwind label %lpad

invoke.cont: ; preds = %entry store i32 0, i32* %cleanup.dest.slot br label %cleanup

cleanup: ; preds = %invoke.cont, %finally.catchall %cleanup.dest.saved = load i32 %cleanup.dest.slot %0 = load i32 @​z, align 4 %inc = add nsw i32 %0, 1 store i32 %inc, i32 @​z, align 4 %finally.shouldthrow = load i1 %finally.for-eh br i1 %finally.shouldthrow, label %finally.rethrow, label %finally.cont

finally.rethrow: ; preds = %cleanup %1 = load i8* %finally.exn call void @​objc_exception_throw(i8 %1) unreachable

finally.cont: ; preds = %cleanup store i32 %cleanup.dest.saved, i32 %cleanup.dest.slot %cleanup.dest = load i32 %cleanup.dest.slot switch i32 %cleanup.dest, label %unreachable [ i32 0, label %cleanup.cont i32 2, label %unreachable ]

cleanup.cont: ; preds = %finally.cont ret i32 1

lpad: ; preds = %entry %2 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8 null %3 = extractvalue { i8, i32 } %2, 0 store i8* %3, i8* %exn.slot %4 = extractvalue { i8, i32 } %2, 1 store i32 %4, i32* %ehselector.slot br label %finally.catchall

finally.catchall: ; preds = %lpad %exn = load i8* %exn.slot store i8 %exn, i8* %finally.exn store i1 true, i1 %finally.for-eh store i32 2, i32* %cleanup.dest.slot br label %cleanup

unreachable: ; preds = %finally.cont, %finally.cont unreachable }

The inner function, x(), has a cleanup. The outer has a catch. The cleanup rethrows the exception with a resume instruction. The outer function, y(), has a catchall. This rethrows the exception with objc_exception_throw().

When the inliner has run, the following version of y() is produced:

define i32 @​y() uwtable optsize { entry: %x.i = alloca i32, align 4 %0 = bitcast i32 %x.i to i8 call void @​llvm.lifetime.start(i64 -1, i8 %0) invoke void @​objc_exception_throw(i8 null) to label %invoke.cont.i unwind label %lpad.i

invoke.cont.i: ; preds = %entry unreachable

lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup catch i8 null invoke void @​foo(i8 %0) optsize to label %finally.rethrow unwind label %terminate.lpad.i

terminate.lpad.i: ; preds = %lpad.i %2 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8* null call void @​abort() noreturn nounwind unreachable

finally.rethrow: ; preds = %lpad.i %3 = extractvalue { i8, i32 } %1, 0 %4 = load i32 @​z, align 4, !tbaa !​0 %inc = add nsw i32 %4, 1 store i32 %inc, i32 @​z, align 4, !tbaa !​0 call void @​objc_exception_throw(i8 %3) unreachable }

x() has been inlined into the first basic block, so the invoke of the function throwing the exception (objc_exception_throw()) becomes the terminator for that block. So far, so good. The landing pad for this now has a cleanup and then a catchall, which is a bit redundant because either of these will always be matched. The call to foo() in the landing pad is the cleanup code from x(), which is correct. This then branches to finally.rethrow, however, which then calls objc_exception_throw(). This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup.

The resume instruction has been completely removed. The correct behaviour in this case would be simply to remove the cleanup - there is a catchall and so the cleanup is completely redundant. In the more general case, if we have multiple rethrow blocks with different mechanisms then we should be wrapping them in a branch. The finally.rethrow block in this case should look like something like this:

finally.rethrow: ; preds = %lpad.i %3 = extractvalue { i8, i32 } %1, 0 %4 = load i32 @​z, align 4, !tbaa !​0 %inc = add nsw i32 %4, 1 store i32 %inc, i32 @​z, align 4, !tbaa !​0 %5 = extractvalue { i8, i32 } %1, 1 switch i32 %5, label %unreachable [ i32 0, label %cleanup.cont i32 1, label %rethrow ]

rethrow: call void @​objc_exception_throw(i8* %3) unreachable

cleanup.cont: resume { i8*, i32 } %1

davidchisnall commented 3 years ago

mentioned in issue llvm/llvm-project#7559

rjmccall commented 12 years ago

It's a question of ordering. Personality functions will jump to the first matching selector. It is up to the compiler to ensure that the selectors are ordered from most specific to most general (unless they were ordered in a different way in the source language). This means that a catchall and a cleanup should always be ordered with the catchall first in the case where both are present. Doing otherwise would result in a significant performance hit (I speak here as someone who has implemented the personality functions for three languages, including C++) and be unacceptable.

You are basically claiming that GCC's LSDA layout requires cleanup actions to always be the outermost action for a landing pad. I am quite skeptical; among other things, this would make it impossible to share action-table entries between the landing pads in code like this: try { foo(); std::string s; // needs a cleanup bar(); } catch (E1 *e) {} At the very least, I'd like to see some evidence from the GCC documentation that this is intended.

I am also skeptical that it is a significant performance hit when interpreting the actions table to simply remember that you've seen a cleanup action and keep scanning. Among other things, interpreting another entry in the actions table is quite fast compared to everything else the personality is doing — e.g. finding the appropriate lpad in the first place.

Indeed - the ABI spec clearly says that the first match should be taken, because a more general match may follow, e.g. an NSString followed by an NSObject followed by an id catch in Objective-C.

If you're talking about the Itanium ABI's suggested implementation, (1) it's clearly talking about handlers and (2) it doesn't talk about GCC's LSDA layout.

This inlining places the cleanup before the catchall, and so results in the block being entered via the cleanup. The personality function then enters the first matching rule: the cleanup.

The IR has no concept of 'ordering' a cleanup w.r.t. handler clauses; it's just a flag on the landingpad instruction. That's completely fine.

Now, I could implement work-arounds for the LLVM behaviour in the personality functions that I maintain, but that wouldn't make us work with personality functions that don't implement this hack, and would make all LLVM-compatible personality functions slower than their LLVM-incompatible counterparts.

For what it's worth, what I said above about just flagging whether you saw a cleanup is exactly what GCC's C++ personality function does. I really don't think you're relying on specified behavior here.

davidchisnall commented 12 years ago

Hi John,

David, I'm worried about the line "This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup." This sounds like a bug in the personality function. The personality function is required to not merely decide whether to land, but to correctly decide what the landing pad's action should be; landing with a selector value of 0 is a statement that it's landing there only to run cleanups, which is obviously not the case.

It's a question of ordering. Personality functions will jump to the first matching selector. It is up to the compiler to ensure that the selectors are ordered from most specific to most general (unless they were ordered in a different way in the source language). This means that a catchall and a cleanup should always be ordered with the catchall first in the case where both are present. Doing otherwise would result in a significant performance hit (I speak here as someone who has implemented the personality functions for three languages, including C++) and be unacceptable. Indeed - the ABI spec clearly says that the first match should be taken, because a more general match may follow, e.g. an NSString followed by an NSObject followed by an id catch in Objective-C.

This inlining places the cleanup before the catchall, and so results in the block being entered via the cleanup. The personality function then enters the first matching rule: the cleanup.

Now, I could implement work-arounds for the LLVM behaviour in the personality functions that I maintain, but that wouldn't make us work with personality functions that don't implement this hack, and would make all LLVM-compatible personality functions slower than their LLVM-incompatible counterparts.

rjmccall commented 12 years ago

The later inlinings look like bugs, but the first inlining actually looks totally reasonable to me.

David, I'm worried about the line "This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup." This sounds like a bug in the personality function. The personality function is required to not merely decide whether to land, but to correctly decide what the landing pad's action should be; landing with a selector value of 0 is a statement that it's landing there only to run cleanups, which is obviously not the case.

Now, arguably the backend should make a point of not saying that there are cleanups when there's a catchall handler in effect, but that doesn't really fix the personality bug, because the personality would still be returning the wrong answer for a non-total catch.

llvmbot commented 12 years ago

Comprehensive testcase for what the inliner is supposed to do OK, here is a testcase ready to be dropped into the testsuite. It checks all the possible combinations of cleanup and/or catch on an invoke of a function which itself has an invoke with a cleanup and/or catch. The CHECK lines show how it is supposed to work. But in fact it's not working at all!

llvmbot commented 12 years ago

Something seems to be horribly broken in the inliner...

davidchisnall commented 12 years ago

IR that is incorrectly inlined

llvmbot commented 12 years ago

Can you please attach the complete (modified) IR.

davidchisnall commented 12 years ago

Modified like this:

define i32 @​main() nounwind uwtable { entry: %call = invoke i32 @​finally() to label %return unwind label %lpad

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8 null %selector = extractvalue { i8, i32 } %0, 1 %typeid = call i32 @​llvm.eh.typeid.for(i8* null) %is_catch_all = icmp eq i32 %selector, %typeid br i1 %is_catch_all, label %return, label %rethrow

return: ; preds = %entry, %lpad ret i32 0

rethrow: resume { i8*, i32 } %0 }

THe result is still that the optimiser removes the ret, and makes the catchall->stop into a cleanup->resume:

define i32 @​main() nounwind uwtable { entry: %0 = load i8* @​except, align 8, !tbaa !​1 invoke void @​objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i

invoke.cont.i: ; preds = %entry unreachable

lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @​cleanupRun, align 4, !tbaa !​0 resume { i8, i32 } %1 }

llvmbot commented 12 years ago

I meant something like this:

Original:

define i32 @​main() nounwind uwtable { entry: %call = invoke i32 @​finally() to label %return unwind label %lpad

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8* null br label %return

return: ; preds = %entry, %lpad ret i32 0 }

Modified:

define i32 @​main() nounwind uwtable { entry: %call = invoke i32 @​finally() to label %return unwind label %lpad

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8 null %selector = extractvalue { i8, i32 } %0, 1 %typeid = call i32 @​llvm.eh.typeid.for(i8* 0) %is_catch_all = icmp eq i32 %selector, %typeid br %is_catch_all, label %return, label %resume

resume: resume { i8*, i32 } %0

return: ; preds = %entry, %lpad ret i32 0 }

davidchisnall commented 12 years ago

I am not 100% sure what the correct behaviour should be. It's a catchall, so it should never be resuming even after inlining. Even if it is merged with other handlers, then the catchall code should run unconditionally. I tried modifying it like this:

define i32 @​main() nounwind uwtable { entry: %call = invoke i32 @​finally() to label %return unwind label %lpad

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8 null %1 = extractvalue { i8, i32 } %0, 1 %2 = icmp eq i32 0, %1 br i1 %2, label %return, label %rethrow

return: ; preds = %entry, %lpad ret i32 0

rethrow: resume { i8*, i32 } %0 }

This code is now wrong, because the catchall is a barrier beyond which no exceptions should propagate, however even with this modification, the same incorrect results occur during optimisation. It is transformed into the following, which unconditionally propagates exceptions:

define i32 @​main() nounwind uwtable { entry: %0 = load i8* @​except, align 8, !tbaa !​1 invoke void @​objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i

invoke.cont.i: ; preds = %entry unreachable

lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @​cleanupRun, align 4, !tbaa !​0 resume { i8, i32 } %1 }

llvmbot commented 12 years ago

"This should not be required for something that is a catchall - by definition, it catches every kind of exception and should not resume". You misunderstood what the docs are saying. The docs say that the catch-all handler, like every other handler, should check that the selector value was "catch-all" before running the catch-all code. It's not about stuff that happens after the catch all, it's about stuff that may happen before it after inlining, eg the cleanup running. If you modify the IR in this way, does it then work?

davidchisnall commented 12 years ago

This should not be required for something that is a catchall - by definition, it catches every kind of exception and should not resume. I have a slightly more worrying example of error here:

define i32 @​finally() noreturn uwtable { entry: %0 = load i8* @​except, align 8, !tbaa !​1 invoke void @​objc_exception_throw(i8 %0) noreturn to label %invoke.cont unwind label %lpad

invoke.cont: ; preds = %entry unreachable

lpad: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup tail call void @​runCleanup(i8 undef) resume { i8, i32 } %1 }

declare void @​objc_exception_throw(i8*)

declare i32 @​__gnu_objc_personality_v0(...)

define i32 @​main() nounwind uwtable { entry: %call = invoke i32 @​finally() to label %return unwind label %lpad

lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) catch i8* null br label %return

return: ; preds = %entry, %lpad ret i32 0 }

The inner function has a cleanup, the outer function a catchall. The correct behaviour for this program is to exit with a status 0. After optimisation, we get:

define i32 @​main() nounwind uwtable { entry: %0 = load i8* @​except, align 8, !tbaa !​1 invoke void @​objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i

invoke.cont.i: ; preds = %entry unreachable

lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @​__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @​cleanupRun, align 4, !tbaa !​0 resume { i8, i32 } %1 }

The cleanup landing pad has been moved to the outer function, and the catchall is gone completely. This program now aborts at run time because the unwinder finds only cleanups before falling off the end of the stack.

llvmbot commented 12 years ago

About the "there's a catch-all so the cleanup is redundant", check out RecognizePersonality and isCatchAll in InstructionCombining.cpp. They seem to expect objc_personality_v0 not gnu_objc_personality_v0.

As for the rest of the bug report, at a glance the IR is wrong since it doesn't conform to the http://llvm.org/docs/ExceptionHandling.html#restrictions, namely this bit:

"In order for inlining to behave correctly, landing pads must be prepared to handle selector results that they did not originally advertise. Suppose that a function catches exceptions of type A, and it’s inlined into a function that catches exceptions of type B. The inliner will update the landingpad instruction for the inlined landing pad to include the fact that B is also caught. If that landing pad assumes that it will only be entered to catch an A, it’s in for a rude awakening. Consequently, landing pads must test for the selector results they understand and then resume exception propagation with the resume instruction if none of the conditions match."