Open davidchisnall opened 12 years ago
mentioned in issue llvm/llvm-project#7559
It's a question of ordering. Personality functions will jump to the first matching selector. It is up to the compiler to ensure that the selectors are ordered from most specific to most general (unless they were ordered in a different way in the source language). This means that a catchall and a cleanup should always be ordered with the catchall first in the case where both are present. Doing otherwise would result in a significant performance hit (I speak here as someone who has implemented the personality functions for three languages, including C++) and be unacceptable.
You are basically claiming that GCC's LSDA layout requires cleanup actions to always be the outermost action for a landing pad. I am quite skeptical; among other things, this would make it impossible to share action-table entries between the landing pads in code like this: try { foo(); std::string s; // needs a cleanup bar(); } catch (E1 *e) {} At the very least, I'd like to see some evidence from the GCC documentation that this is intended.
I am also skeptical that it is a significant performance hit when interpreting the actions table to simply remember that you've seen a cleanup action and keep scanning. Among other things, interpreting another entry in the actions table is quite fast compared to everything else the personality is doing — e.g. finding the appropriate lpad in the first place.
Indeed - the ABI spec clearly says that the first match should be taken, because a more general match may follow, e.g. an NSString followed by an NSObject followed by an id catch in Objective-C.
If you're talking about the Itanium ABI's suggested implementation, (1) it's clearly talking about handlers and (2) it doesn't talk about GCC's LSDA layout.
This inlining places the cleanup before the catchall, and so results in the block being entered via the cleanup. The personality function then enters the first matching rule: the cleanup.
The IR has no concept of 'ordering' a cleanup w.r.t. handler clauses; it's just a flag on the landingpad instruction. That's completely fine.
Now, I could implement work-arounds for the LLVM behaviour in the personality functions that I maintain, but that wouldn't make us work with personality functions that don't implement this hack, and would make all LLVM-compatible personality functions slower than their LLVM-incompatible counterparts.
For what it's worth, what I said above about just flagging whether you saw a cleanup is exactly what GCC's C++ personality function does. I really don't think you're relying on specified behavior here.
Hi John,
David, I'm worried about the line "This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup." This sounds like a bug in the personality function. The personality function is required to not merely decide whether to land, but to correctly decide what the landing pad's action should be; landing with a selector value of 0 is a statement that it's landing there only to run cleanups, which is obviously not the case.
It's a question of ordering. Personality functions will jump to the first matching selector. It is up to the compiler to ensure that the selectors are ordered from most specific to most general (unless they were ordered in a different way in the source language). This means that a catchall and a cleanup should always be ordered with the catchall first in the case where both are present. Doing otherwise would result in a significant performance hit (I speak here as someone who has implemented the personality functions for three languages, including C++) and be unacceptable. Indeed - the ABI spec clearly says that the first match should be taken, because a more general match may follow, e.g. an NSString followed by an NSObject followed by an id catch in Objective-C.
This inlining places the cleanup before the catchall, and so results in the block being entered via the cleanup. The personality function then enters the first matching rule: the cleanup.
Now, I could implement work-arounds for the LLVM behaviour in the personality functions that I maintain, but that wouldn't make us work with personality functions that don't implement this hack, and would make all LLVM-compatible personality functions slower than their LLVM-incompatible counterparts.
The later inlinings look like bugs, but the first inlining actually looks totally reasonable to me.
David, I'm worried about the line "This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup." This sounds like a bug in the personality function. The personality function is required to not merely decide whether to land, but to correctly decide what the landing pad's action should be; landing with a selector value of 0 is a statement that it's landing there only to run cleanups, which is obviously not the case.
Now, arguably the backend should make a point of not saying that there are cleanups when there's a catchall handler in effect, but that doesn't really fix the personality bug, because the personality would still be returning the wrong answer for a non-total catch.
Comprehensive testcase for what the inliner is supposed to do OK, here is a testcase ready to be dropped into the testsuite. It checks all the possible combinations of cleanup and/or catch on an invoke of a function which itself has an invoke with a cleanup and/or catch. The CHECK lines show how it is supposed to work. But in fact it's not working at all!
Something seems to be horribly broken in the inliner...
Can you please attach the complete (modified) IR.
Modified like this:
define i32 @main() nounwind uwtable { entry: %call = invoke i32 @finally() to label %return unwind label %lpad
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8 null %selector = extractvalue { i8, i32 } %0, 1 %typeid = call i32 @llvm.eh.typeid.for(i8* null) %is_catch_all = icmp eq i32 %selector, %typeid br i1 %is_catch_all, label %return, label %rethrow
return: ; preds = %entry, %lpad ret i32 0
rethrow: resume { i8*, i32 } %0 }
THe result is still that the optimiser removes the ret, and makes the catchall->stop into a cleanup->resume:
define i32 @main() nounwind uwtable { entry: %0 = load i8* @except, align 8, !tbaa !1 invoke void @objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i
invoke.cont.i: ; preds = %entry unreachable
lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @cleanupRun, align 4, !tbaa !0 resume { i8, i32 } %1 }
I meant something like this:
Original:
define i32 @main() nounwind uwtable { entry: %call = invoke i32 @finally() to label %return unwind label %lpad
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8* null br label %return
return: ; preds = %entry, %lpad ret i32 0 }
Modified:
define i32 @main() nounwind uwtable { entry: %call = invoke i32 @finally() to label %return unwind label %lpad
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8 null %selector = extractvalue { i8, i32 } %0, 1 %typeid = call i32 @llvm.eh.typeid.for(i8* 0) %is_catch_all = icmp eq i32 %selector, %typeid br %is_catch_all, label %return, label %resume
resume: resume { i8*, i32 } %0
return: ; preds = %entry, %lpad ret i32 0 }
I am not 100% sure what the correct behaviour should be. It's a catchall, so it should never be resuming even after inlining. Even if it is merged with other handlers, then the catchall code should run unconditionally. I tried modifying it like this:
define i32 @main() nounwind uwtable { entry: %call = invoke i32 @finally() to label %return unwind label %lpad
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8 null %1 = extractvalue { i8, i32 } %0, 1 %2 = icmp eq i32 0, %1 br i1 %2, label %return, label %rethrow
return: ; preds = %entry, %lpad ret i32 0
rethrow: resume { i8*, i32 } %0 }
This code is now wrong, because the catchall is a barrier beyond which no exceptions should propagate, however even with this modification, the same incorrect results occur during optimisation. It is transformed into the following, which unconditionally propagates exceptions:
define i32 @main() nounwind uwtable { entry: %0 = load i8* @except, align 8, !tbaa !1 invoke void @objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i
invoke.cont.i: ; preds = %entry unreachable
lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @cleanupRun, align 4, !tbaa !0 resume { i8, i32 } %1 }
"This should not be required for something that is a catchall - by definition, it catches every kind of exception and should not resume". You misunderstood what the docs are saying. The docs say that the catch-all handler, like every other handler, should check that the selector value was "catch-all" before running the catch-all code. It's not about stuff that happens after the catch all, it's about stuff that may happen before it after inlining, eg the cleanup running. If you modify the IR in this way, does it then work?
This should not be required for something that is a catchall - by definition, it catches every kind of exception and should not resume. I have a slightly more worrying example of error here:
define i32 @finally() noreturn uwtable { entry: %0 = load i8* @except, align 8, !tbaa !1 invoke void @objc_exception_throw(i8 %0) noreturn to label %invoke.cont unwind label %lpad
invoke.cont: ; preds = %entry unreachable
lpad: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup tail call void @runCleanup(i8 undef) resume { i8, i32 } %1 }
declare void @objc_exception_throw(i8*)
declare i32 @__gnu_objc_personality_v0(...)
define i32 @main() nounwind uwtable { entry: %call = invoke i32 @finally() to label %return unwind label %lpad
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8* null br label %return
return: ; preds = %entry, %lpad ret i32 0 }
The inner function has a cleanup, the outer function a catchall. The correct behaviour for this program is to exit with a status 0. After optimisation, we get:
define i32 @main() nounwind uwtable { entry: %0 = load i8* @except, align 8, !tbaa !1 invoke void @objc_exception_throw(i8 %0) noreturn to label %invoke.cont.i unwind label %lpad.i
invoke.cont.i: ; preds = %entry unreachable
lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup store i32 1, i32 @cleanupRun, align 4, !tbaa !0 resume { i8, i32 } %1 }
The cleanup landing pad has been moved to the outer function, and the catchall is gone completely. This program now aborts at run time because the unwinder finds only cleanups before falling off the end of the stack.
About the "there's a catch-all so the cleanup is redundant", check out RecognizePersonality and isCatchAll in InstructionCombining.cpp. They seem to expect objc_personality_v0 not gnu_objc_personality_v0.
As for the rest of the bug report, at a glance the IR is wrong since it doesn't conform to the http://llvm.org/docs/ExceptionHandling.html#restrictions, namely this bit:
"In order for inlining to behave correctly, landing pads must be prepared to handle selector results that they did not originally advertise. Suppose that a function catches exceptions of type A, and it’s inlined into a function that catches exceptions of type B. The inliner will update the landingpad instruction for the inlined landing pad to include the fact that B is also caught. If that landing pad assumes that it will only be entered to catch an A, it’s in for a rude awakening. Consequently, landing pads must test for the selector results they understand and then resume exception propagation with the resume instruction if none of the conditions match."
Extended Description
The LLVM inliner combines catch and cleanups into a single landing pad and then merges their tails. This is problematic, because some personality functions (e.g. GNU Objective-C) return different values in exception register 0 depending on whether the landing pad is a catch or a cleanup. Consider the following example:
define i32 @x() uwtable { entry: %retval = alloca i32, align 4 %x = alloca i32, align 4 %exn.slot = alloca i8 %ehselector.slot = alloca i32 invoke void @objc_exception_throw(i8 null) to label %invoke.cont unwind label %lpad
invoke.cont: ; preds = %entry unreachable
lpad: ; preds = %entry %0 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup %1 = extractvalue { i8, i32 } %0, 0 store i8 %1, i8* %exn.slot %2 = extractvalue { i8, i32 } %0, 1 store i32 %2, i32 %ehselector.slot %3 = bitcast i32 %x to i8 invoke void @foo(i8 %3) to label %invoke.cont1 unwind label %terminate.lpad
invoke.cont1: ; preds = %lpad br label %eh.resume
return: ; No predecessors! %4 = load i32* %retval ret i32 %4
eh.resume: ; preds = %invoke.cont1 %exn = load i8* %exn.slot %sel = load i32 %ehselector.slot %lpad.val = insertvalue { i8, i32 } undef, i8 %exn, 0 %lpad.val2 = insertvalue { i8, i32 } %lpad.val, i32 %sel, 1 resume { i8, i32 } %lpad.val2
terminate.lpad: ; preds = %lpad %5 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8* null call void @abort() noreturn nounwind unreachable }
declare void @foo(i8*)
declare void @objc_exception_throw(i8*)
declare i32 @__gnu_objc_personality_v0(...)
declare void @abort()
define i32 @y() uwtable { entry: %finally.exn = alloca i8 %finally.for-eh = alloca i1 %exn.slot = alloca i8 %ehselector.slot = alloca i32 %cleanup.dest.slot = alloca i32 store i1 false, i1* %finally.for-eh %call = invoke i32 @x() to label %invoke.cont unwind label %lpad
invoke.cont: ; preds = %entry store i32 0, i32* %cleanup.dest.slot br label %cleanup
cleanup: ; preds = %invoke.cont, %finally.catchall %cleanup.dest.saved = load i32 %cleanup.dest.slot %0 = load i32 @z, align 4 %inc = add nsw i32 %0, 1 store i32 %inc, i32 @z, align 4 %finally.shouldthrow = load i1 %finally.for-eh br i1 %finally.shouldthrow, label %finally.rethrow, label %finally.cont
finally.rethrow: ; preds = %cleanup %1 = load i8* %finally.exn call void @objc_exception_throw(i8 %1) unreachable
finally.cont: ; preds = %cleanup store i32 %cleanup.dest.saved, i32 %cleanup.dest.slot %cleanup.dest = load i32 %cleanup.dest.slot switch i32 %cleanup.dest, label %unreachable [ i32 0, label %cleanup.cont i32 2, label %unreachable ]
cleanup.cont: ; preds = %finally.cont ret i32 1
lpad: ; preds = %entry %2 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8 null %3 = extractvalue { i8, i32 } %2, 0 store i8* %3, i8* %exn.slot %4 = extractvalue { i8, i32 } %2, 1 store i32 %4, i32* %ehselector.slot br label %finally.catchall
finally.catchall: ; preds = %lpad %exn = load i8* %exn.slot store i8 %exn, i8* %finally.exn store i1 true, i1 %finally.for-eh store i32 2, i32* %cleanup.dest.slot br label %cleanup
unreachable: ; preds = %finally.cont, %finally.cont unreachable }
The inner function, x(), has a cleanup. The outer has a catch. The cleanup rethrows the exception with a resume instruction. The outer function, y(), has a catchall. This rethrows the exception with objc_exception_throw().
When the inliner has run, the following version of y() is produced:
define i32 @y() uwtable optsize { entry: %x.i = alloca i32, align 4 %0 = bitcast i32 %x.i to i8 call void @llvm.lifetime.start(i64 -1, i8 %0) invoke void @objc_exception_throw(i8 null) to label %invoke.cont.i unwind label %lpad.i
invoke.cont.i: ; preds = %entry unreachable
lpad.i: ; preds = %entry %1 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) cleanup catch i8 null invoke void @foo(i8 %0) optsize to label %finally.rethrow unwind label %terminate.lpad.i
terminate.lpad.i: ; preds = %lpad.i %2 = landingpad { i8, i32 } personality i8 bitcast (i32 (...) @__gnu_objc_personality_v0 to i8) catch i8* null call void @abort() noreturn nounwind unreachable
finally.rethrow: ; preds = %lpad.i %3 = extractvalue { i8, i32 } %1, 0 %4 = load i32 @z, align 4, !tbaa !0 %inc = add nsw i32 %4, 1 store i32 %inc, i32 @z, align 4, !tbaa !0 call void @objc_exception_throw(i8 %3) unreachable }
x() has been inlined into the first basic block, so the invoke of the function throwing the exception (objc_exception_throw()) becomes the terminator for that block. So far, so good. The landing pad for this now has a cleanup and then a catchall, which is a bit redundant because either of these will always be matched. The call to foo() in the landing pad is the cleanup code from x(), which is correct. This then branches to finally.rethrow, however, which then calls objc_exception_throw(). This call is only valid if we landed with the catchall, but because a cleanup will always be matched first we actually landed with the cleanup.
The resume instruction has been completely removed. The correct behaviour in this case would be simply to remove the cleanup - there is a catchall and so the cleanup is completely redundant. In the more general case, if we have multiple rethrow blocks with different mechanisms then we should be wrapping them in a branch. The finally.rethrow block in this case should look like something like this:
finally.rethrow: ; preds = %lpad.i %3 = extractvalue { i8, i32 } %1, 0 %4 = load i32 @z, align 4, !tbaa !0 %inc = add nsw i32 %4, 1 store i32 %inc, i32 @z, align 4, !tbaa !0 %5 = extractvalue { i8, i32 } %1, 1 switch i32 %5, label %unreachable [ i32 0, label %cleanup.cont i32 1, label %rethrow ]
rethrow: call void @objc_exception_throw(i8* %3) unreachable
cleanup.cont: resume { i8*, i32 } %1