Closed Dan12 closed 5 years ago
I can confirm that this is also failing on my (Linux) build. Before merging I just want to double check and see why the built-in LLVM resume isn't working properly.
Yeah, I would also be curious to see why that is because according to the LLVM user guide it should just work. When I disassembled, it seemed like it was just falling through to the next basic block or just returning.
After spending some time looking at this, it seems that our exception implementation is a little confused and half uses LLVM features properly and half doesn't. I suspect that since we're only partially implementing this API that is the reason the resume doesn't work. (e.g. we aren't properly using multi exception type indexing and have worked around it with weird 'rethrow' semantics.)
If you're going to be working on exception implementation in the future, feel free to rethink how it is implemented and properly implement the LLVM-style Try-Catch-Finally. (https://releases.llvm.org/5.0.0/docs/ExceptionHandling.html)
From my conversations with Daniel, it sounds like the LLVM exception support is actually quite expensive. It may actually be better to figure out some other approach entirely, though some fighting against LLVM may be entailed. Maybe there's a paper in figuring out how LLVM should have supported exceptions...
That is also reasonable - either way, we should probably not be in this "half-in" / "half-out" state that we're in right now.
Sorry to barge in on this, but here are some thoughts off the top of my head in case any of them are useful!
(1) The reason the exception implementation does a 'rethrow' instead of 'resume' in some cases is because: in C++, if an exception is thrown while another is already in flight (i.e., if an exception is thrown from a destructor participating in a stack unwind), then the program will simply abort(!). This is in stark contrast to Java, where finally-blocks---which are the analogue to C++ destructors---are allowed to throw new exceptions that replace the previous one in flight. Unfortunately, LLVM implements the C++ behavior. So, when there is a finally-block, we have to catch the exception outright, run the finally-block, then re-throw. JLangTryExt.java:168 has a comment explaining this.
(2) Aside from (1), is there any other sense in which we deviate from proper LLVM exception handling? I'm not sure what you mean by "multi exception type indexing," but I remember being pretty confident in our conformance to the LLVM exception API.
(3) Please feel free to post the .ll file corresponding to the failing test case here (before the fix). I might still remember enough to spot something.
(4) Just for my own curiosity: why do you consider LLVM exceptions to be expensive? They are zero-cost in the usual sense for the no-exception path. I admit I don't know too much about the costs incurred when an exception is actually thrown, but I would guess it's implemented in the same way that most compilers do it (stack walking with exception side-tables).
Also, hi everyone! :)
Here is the relevant code snippet. This would print "Caught runtime exception" and then "Should not Print". I can confirm that the code does enter the catch.next
label but I'm not sure where it goes afterward.
Also, we compared popping a stack frame with exceptions to simply returning and found it to be ~100 times slower (so a few hundred assembly instructions per exception thrown).
continue7: ; preds = %load.class6, %continue2
invoke void @Polyglot_ExceptionsNested_catchesRuntimeExn__()
to label %invoke.cont unwind label %lpad.catch, !dbg !19
lpad.catch: ; preds = %invoke.cont17, %continue12, %load.class11, %continue7
%lpad.catch.res = landingpad { i8*, i32 }
catch i8* @Polyglot_java_lang_Exception_class_id, !dbg !15
%exn = extractvalue { i8*, i32 } %lpad.catch.res, 0, !dbg !15
%sel = extractvalue { i8*, i32 } %lpad.catch.res, 1, !dbg !15
%catch.matches = icmp eq i32 %sel, 1, !dbg !15
br i1 %catch.matches, label %catch.java.lang.Exception, label %catch.next, !dbg !15
try.end: ; preds = %continue22, %invoke.cont19
ret void, !dbg !14
invoke.cont: ; preds = %continue7
%class13 = load %class.java_lang_Class*, %class.java_lang_Class** @Polyglot_java_lang_System_class, !dbg !20
%class.null14 = icmp eq %class.java_lang_Class* %class13, null, !dbg !20
br i1 %class.null14, label %load.class11, label %continue12, !dbg !20
load.class11: ; preds = %invoke.cont
%call16 = invoke %class.java_lang_Class* @Polyglot_java_lang_System_load_class()
to label %invoke.cont15 unwind label %lpad.catch, !dbg !20
continue12: ; preds = %invoke.cont15, %invoke.cont
%load.out = load %class.java_io_PrintStream*, %class.java_io_PrintStream** @Polyglot_java_lang_System_out, !dbg !20
invoke void @InternStringLit({ %cdv_ty.java_lang_String*, i8*, %class.jlang_runtime_Array*, i16 }* @_string_lit_83_104_111_117_108_100_32_110_111_116_32_112_114_105_110_116_)
to label %invoke.cont17 unwind label %lpad.catch, !dbg !21
invoke.cont15: ; preds = %load.class11
br label %continue12, !dbg !20
invoke.cont17: ; preds = %continue12
%gep = getelementptr %class.java_io_PrintStream, %class.java_io_PrintStream* %load.out, i32 0, i32 0, !dbg !20
%load.dv = load %cdv_ty.java_io_PrintStream*, %cdv_ty.java_io_PrintStream** %gep, !dbg !20
%gep18 = getelementptr %cdv_ty.java_io_PrintStream, %cdv_ty.java_io_PrintStream* %load.dv, i32 0, i32 3, i32 44, !dbg !20
%load.dv.method = load void (%class.java_io_PrintStream*, %class.java_lang_String*)*, void (%class.java_io_PrintStream*, %class.java_lang_String*)** %gep18, !dbg !20
invoke void %load.dv.method(%class.java_io_PrintStream* %load.out, %class.java_lang_String* bitcast ({ %cdv_ty.java_lang_String*, i8*, %class.jlang_runtime_Array*, i16 }* @_string_lit_83_104_111_117_108_100_32_110_111_116_32_112_114_105_110_116_ to %class.java_lang_String*))
to label %invoke.cont19 unwind label %lpad.catch, !dbg !20
invoke.cont19: ; preds = %invoke.cont17
br label %try.end, !dbg !15
catch.java.lang.Exception: ; preds = %lpad.catch
call void @llvm.dbg.declare(metadata %class.java_lang_Exception** %e, metadata !22, metadata !17), !dbg !24
%call20 = call i8* @extractJavaExceptionObject(i8* %exn), !dbg !15
%cast = bitcast i8* %call20 to %class.java_lang_Exception*, !dbg !15
store %class.java_lang_Exception* %cast, %class.java_lang_Exception** %e, !dbg !15
%class23 = load %class.java_lang_Class*, %class.java_lang_Class** @Polyglot_java_lang_System_class, !dbg !25
%class.null24 = icmp eq %class.java_lang_Class* %class23, null, !dbg !25
br i1 %class.null24, label %load.class21, label %continue22, !dbg !25
catch.next: ; preds = %lpad.catch
resume { i8*, i32 } %lpad.catch.res
load.class21: ; preds = %catch.java.lang.Exception
%call25 = call %class.java_lang_Class* @Polyglot_java_lang_System_load_class(), !dbg !25
br label %continue22, !dbg !25
Hm, can you post the whole .ll file?
IICU it should not even be printing "Caught runtime exception," which means the bug is in the translation for catchesRuntimeExn()
, not main()
. (Or both, who knows.)
Was able to reproduce locally, and I too am really confused about what's going on in that test case. So, I'll stop meddling for now :)
Thanks for your input Matt! You definitely have a lot more familiarity with the code base and the plan for how this was supposed to work so your thoughts are very helpful. Feel free to chime in whenever you want!
As far as the 'multi-catch' goes -> I meant that we weren't using the landingpad <type> ...
support. All of our exceptions are "Java Exceptions" and then we insert extra code to check the class type. I thought that checking code did something extra and weird (with rethrowing excepitons), but upon re-reading some of the generated *.ll files it looks reasonable to me.
I'm not sure if you considered trying to use the LLVM support for multi-catch and decided against it or not - definitely works the way it is but might be cleaner / more efficient to use it if there isn't some complication that I'm missing.
Even if you use multi-catch, the landing pad will still be entered in the 2nd cleanup phase and you need to resume exception propagation (the way it signals cleanup only is to set the select part of the exception struct to 0). When I was investigating why resume didn't work, I looked at the LLVM exception example: https://github.com/llvm-mirror/llvm/blob/master/examples/ExceptionDemo/ExceptionDemo.cpp
In there I am pretty sure they explicitly make the call to _Unwind_Resume instead of just generating a resume instruction but I don't know why.
Ah, I think there's some confusion about 'multi-catch', because we actually do use landingpad <type> ...
; here's an example emitted from the Exceptions.java test case:
lpad.catch:
%lpad.catch.res = landingpad { i8*, i32 }
catch i8* @Polyglot_java_lang_Exception_class_id
catch i8* @Polyglot_java_lang_Error_class_id, !dbg !129
Note, runtime/native/exception.cpp
line 306 is where we make the call to instanceof
to check whether the exception currently in-flight matches some catch clause declared in a landingpad
instruction in a given frame. I.e., the exception class type is checked by the stack walker, not by the generated LLVM! The 'extra code' you saw after the landingpad
in the generated LLVM is actually comparing a simple integer index returned by the stack walker indicating which catch clause it found to be matched.
(I wish LLVM would just let you declare a label for each catch clause and have the stack walker jump directly there instead of handing you back an index---but, you will see the same pattern in LLVM generated from C++ code. It's very strange.)
Anyway, our exception implementation should do the right thing: if none of the catch clauses in a try-catch block match an in-flight exception (and if there is no finally-block), then the stack walker will know it (due to our instanceof
checks in exception.cpp
) and should skip right over that frame entirely. It should not jump to the landingpad
corresponding to that try-catch block, and thus it should not even get to the resume
there at all.
(Now, the next good question to ask is: "If the stack-walker only stops at try-catch blocks for which a catch clause matches, then why do we ever need to emit a resume
instruction? After all, if a catch clause does match, then the exception is no longer in-flight and should not need to be resumed." The answer is: I have no idea. But I recall reading somewhere that the catch-clause-index-checking code after a landing pad should be prepared to handle a situation in which the stack walker jumps there even when no catch clauses have been matched. In fact, you will see the same thing in the LLVM IR for C++: it always has a backup resume
branch just in case. My best guess is that this might be related to being robust against optimizations which move your landing pads around, but I'm not sure.)
Of course, the failing test case here completely contradicts what I said above about "skipping right over frames that don't have a matching catch clause," and that's the part I'm quite confused about. When debugging I was seeing some completely inexplicable control flow and had to give to give up eventually :)
By the way, I should mention that I've never really understood the code in runtime/native/exception.cpp
---I think we got that code from an LLVM tutorial/demo and then tweaked it slightly to do our instanceof
checks. I wonder whether the exception bug might lie there rather than in the generated IR. After all, that's the code that manages the stack walker's search phase, cleanup phase, etc., and determines which landing pads to stop at. It also does things like directly set the instruction pointer, which might explain some of the "inexplicable control flow" I was seeing...
(Also, regarding performance, note that exception.cpp
probably plays a significant role in the performance of exception handling. It's possible that the C++ stack walker is way more efficient than ours, for example, or that the instanceof
checks are the bottleneck in our stack walker.)
Anyway, good luck! I'm all for re-writing or tweaking the exception implementation if it fixes things.
I think we still have to enter the landing pad because we may have finally blocks that need to do some cleanup as the stack is unwound. Also, I think our exception.cpp is based on the C++ stack walker code so I'm not sure if we can look there for performance gains.
The way I understand it is that the itanium exception handling mechanism does 2 passes: a search and a cleanup pass. LLVM even admits that this isn't necessary even for C++ exceptions. The cleanup pass needs to enter all landing pads as it unwinds stack frames and it passes a pointer to the exception that is being propegated and a selector. Typically, the selector corresponds to the catch clause which is supposed to handle the exception, so you should be able to do a switch statement at the landing pad and jump to the correct catch handler. The selector can also be 0, which means that this landing pad should do cleanup only (so execute the finally clause) and then resume the exception with a call to _Unwind_Resume
. I suspect that since LLVM want's to be agnostic to which unwinding library you use, it can't actually generate the call to _Unwind_Resume
and implements some wierd default translation for resume
.
When there is a finally block, we do emit a catch-all catch i8* null
clause for the landingpad
to ensure that we always stop at that landing pad. (We use catch i8* null
rather than a cleanup
clause, because LLVM does not allow cleanup
blocks to throw new exceptions---which we must support for Java finally-blocks.)
Perhaps you are right about stopping at every landing pad in the cleanup phase, though. The documentation for the cleanup
clause of landingpad
says, "This clause means that the landingpad
should always be entered." So I had assumed that if there is no cleanup
clause (and no catch-all catch i8* null
clause), then the stack walker would just skip over it. But I'm realizing that I never checked to verify that! And what you're saying sounds very plausible.
Perhaps the direct call to _Unwind_Resume
really is the best solution here.
Ok, the rethrow makes sense. Yeah, it looks like the code was written assuming that resume
would essentially call _Unwind_Resume
but it turns out it doesn't actually do that so it does seem like the right thing to do is just the direct call.
It seems like whatever compiles the .ll files ignores the resume block and doesn't insert the proper call the unwind.h library. This makes the explicit call to the correct function.