LICM incorrectly hoists load because Instruction::mayThrow does not correctly handle invokes

Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

LICM incorrectly hoists load because Instruction::mayThrow does not correctly handle invokes #24184

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago


Bugzilla Link	PR24185
Status	REOPENED
Importance	P normal
Reported by	Sanjoy Das (sanjoy@playingwithpointers.com)
Reported on	2015-07-20 01:58:47 -0700
Last modified on	2015-07-22 13:06:35 -0700
Version	trunk
Hardware	PC All
CC	david.majnemer@gmail.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, rnk@google.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Running -licm on the following IR hoists the load of %m outside the loop. This is incorrect because it is possible that @f throws something that does not unwind to %right (because the exception's type does not match @_ZTIi). Such a throw will exit the frame. Thus, the load from %ptr is not executed on all paths through the loop and cannot be safely speculated.

As far as I can tell, the bug is in Instruction::mayThrow -- LLVM cannot certify that an invoke does not throw without analyzing all of its landingpads.

target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.10.0"

@_ZTIi = external constant i8*

declare i32 @__gxx_personality_v0(...)

declare i32 @f(i32) readnone

define i32 @g(i32 %ptr) personality i8 bitcast (i32 (...) @__gxx_personality_v0 to i8) { entry: br label %loop

loop: %acc = phi i32 [ 0, %entry ], [ %acc.next, %merge ] %x = invoke i32 @f(i32 %acc) to label %left unwind label %right

left: br label %merge

right: %v = landingpad { i8, i32 } catch i8 bitcast (i8* @_ZTIi to i8) br label %merge

merge: %vv = phi i32 [ %x, %left ], [ 0, %right] %m = load i32, i32* %ptr %acc.next = add i32 %m, %acc %c = icmp eq i32 %vv, 0 br i1 %c, label %loop, label %exit

exit: ret i32 %acc }

Quuxplusone commented 9 years ago

I think that all mayThrow ensures is that the specific instruction doesn't immediately unwind. That invoke transfers control to instructions which may throw isn't conceptually different from a br which does the same, no?

Quuxplusone commented 9 years ago

(In reply to comment #1)

I think that all mayThrow ensures is that the specific instruction doesn't immediately unwind. That invoke transfers control to instructions which may throw isn't conceptually different from a br which does the same, no?

I was under the impression that an invoke does not always unwind to its landingpad, but may "skip" frames if the catch clause does not match the type of the thrown exception. Given that that isn't the case, and invokes always unwind to their landingpad if an exception is thrown, this isn't a bug.

Quuxplusone commented 9 years ago

After reading http://llvm.org/docs/ExceptionHandling.html, I think this bug
merits re-opening.  The doc says:

 * catch <type> @ExcType

  * This clause means that the landingpad block should be entered if the exception being thrown is of type @ExcType or a subtype of @ExcType. For C++, @ExcType is a pointer to the std::type_info object (an RTTI object) representing the C++ exception type.
  * If @ExcType is null, any exception matches, so the landingpad should always be entered. This is used for C++ catch-all blocks (“catch (...)”).

 * When this clause is matched, the selector value will be equal to the value returned by “@llvm.eh.typeid.for(i8* @ExcType)”. This will always be a positive value.

The doc does not say that landing pads execute for every thrown exception.  I
could not find anything that says landingpads execute for any exception in the
Itanium EH ABI either.

Quuxplusone commented 9 years ago

http://llvm.org/docs/ExceptionHandling.html#restrictions:

"The inliner will update the landingpad instruction for the inlined landing pad to include the fact that B is also caught. If that landing pad assumes that it will only be entered to catch an A, it’s in for a rude awakening. Consequently, landing pads must test for the selector results they understand and then resume exception propagation with the resume instruction if none of the conditions match."

To me, this says the LP must lead to a resume.

Quuxplusone commented 9 years ago

(In reply to comment #4)
> http://llvm.org/docs/ExceptionHandling.html#restrictions:
>
> "The inliner will update the landingpad instruction for the inlined landing
> pad to include the fact that B is also caught. If that landing pad assumes
> that it will only be entered to catch an A, it’s in for a rude awakening.
> Consequently, landing pads must test for the selector results they
> understand and then resume exception propagation with the resume instruction
> if none of the conditions match."
>
> To me, this says the LP must lead to a resume.

I don't disagree with this, but I think this is subtly different from
the issue I'm trying to highlight.

You're saying "a landingpad may be executed for a superset of the
filters / catch clauses it is originally generated with".  That isn't
the same as saying "a landingpad is executed for all exceptions that
are thrown from the invoke it is bound to".  The latter can be false
while the former is true.

The issue I'm trying to highlight is: "is an invoke *guaranteed* to
branch to either its normal successor or its landingpad?"

For instance, gcc/libstdc++ has code like (I have no idea if that is
what clang ends up using):

          // Null catch type is a catch-all handler; we can catch foreign
          // exceptions with this.  Otherwise we must match types.
          if (! catch_type
          || (throw_type
              && get_adjusted_ptr (catch_type, throw_type,
                       &thrown_ptr)))
        {
          saw_handler = true;
          break;
        }

with

      if (saw_handler)
    {
      handler_switch_value = ar_filter;
      found_type = found_handler;
    }
      else
    found_type = (saw_cleanup ? found_cleanup : found_nothing);

and

   if (found_type == found_nothing)
     CONTINUE_UNWINDING; == return _URC_CONTINUE_UNWIND

Quuxplusone commented 9 years ago

There's a couple issues here:

If the address of the load isn't dereferenceable, the hoist is wrong because an invoke can longjmp, exit, throw an exception unrecognized by the current EH personality, or unwind via some other means outside of the EH system.
The readnone attribute is supposed to model GCC's 'pure', which means it should probably imply nounwind. Throwing an exception or calling longjmp definitely "writes memory" in LLVM.

It sounds like LICM doesn't actually want a 'mayThrow' predicate, it really wants to know, "if this instruction executes, will all post-dominating instructions execute, assuming no UB?"

The current use of mayThrow is probably incorrect even for CallInsts in C, where everything is nounwind. Any external function call can end the current thread or program.

While it is true that landingpads must check the selector values and cannot rely on the catch clauses to do EH dispatch, I think that is a red herring here.

Quuxplusone commented 9 years ago

(In reply to comment #5)
> You're saying "a landingpad may be executed for a superset of the
> filters / catch clauses it is originally generated with".  That isn't
> the same as saying "a landingpad is executed for all exceptions that
> are thrown from the invoke it is bound to".  The latter can be false
> while the former is true.
>
> The issue I'm trying to highlight is: "is an invoke *guaranteed* to
> branch to either its normal successor or its landingpad?"

I think this gets at the core of the issue.

I would say that, given an invoke, there are three possibilities:
1. It returns
2. It unwinds to the landingpad. This may happen for a superset of the
landingpad clauses that were listed.
3. It does not return (aborts, exits, longjmps, pthread_cancels)

Only 1 and 2 are allowed to re-enter the execution context of the function in
question, so most non-hoisting optimizations don't need to be concerned with
possibility 3.

Quuxplusone commented 9 years ago

(In reply to comment #6)
> There's a couple issues here:
>
> - If the address of the load isn't dereferenceable, the hoist is wrong
> because an invoke can longjmp, exit, throw an exception unrecognized by the
> current EH personality, or unwind via some other means outside of the EH
> system.

Mostly agreed, though I have reservations about the last bit.

> - The readnone attribute is supposed to model GCC's 'pure', which means it
> should probably imply nounwind. Throwing an exception or calling longjmp
> definitely "writes memory" in LLVM.

In that case, -functionattrs is wrong.  Piping the following through
opt -functionattrs

define void @f(i8* %t) {
 entry:
  resume i8* %t
}

gives

; Function Attrs: readnone
define void @f(i8* %t) #0 {
entry:
  resume i8* %t
}

attributes #0 = { readnone }

> It sounds like LICM doesn't actually want a 'mayThrow' predicate, it really
> wants to know, "if this instruction executes, will all post-dominating
> instructions execute, assuming no UB?"
>
> The current use of mayThrow is probably incorrect even for CallInsts in C,
> where everything is nounwind. Any external function call can end the current
> thread or program.

I think LICM (and LLVM in general, frankly) relies too much on "may
read / write to arbitrary locations" as a hammer to prevent
problematic transforms.  In practice, @f will be read-write if it
throws, pthread_exits or longjmps.  Currently the spec does not forbid
readnone functions to unwind (and hence the -functionattrs behavior
above) but that is already inconsistent with DCE'ing readnone calls,
and formally forcing throwing functions to be read-write may be a way
to paper over this bug.

Quuxplusone commented 9 years ago

(In reply to comment #8)
> > - The readnone attribute is supposed to model GCC's 'pure', which means it
> > should probably imply nounwind. Throwing an exception or calling longjmp
> > definitely "writes memory" in LLVM.
>
> In that case, -functionattrs is wrong.  Piping the following through
> opt -functionattrs
>
> define void @f(i8* %t) {
>  entry:
>   resume i8* %t
> }
>
> gives
>
> ; Function Attrs: readnone
> define void @f(i8* %t) #0 {
> entry:
>   resume i8* %t
> }
>
> attributes #0 = { readnone }

This IR is kind of meaningless, though. The LangRef says:
  The ‘resume‘ instruction resumes propagation of an existing (in-flight) exception whose unwinding was interrupted with a landingpad instruction.

It doesn't explicitly say that the landingpad has to be in the same function,
but that's how it works in practice. If you call some other function which runs
'resume' or calls _Unwind_Resume, inlining your function through an invoke will
fail to link the resume instruction to the unwind destination. I think we could
pretty safely add a rule that resuming an exception from a different frame is
UB.

So, I would say that functionattrs is conservatively correct even if readnone
implies nounwind. It's just not smart enough to notice that the resume
instruction could be removed as unreachable.