oracle / graal

GraalVM compiles Java applications into native executables that start instantly, scale fast, and use fewer compute resources 🚀
https://www.graalvm.org
Other
20.27k stars 1.63k forks source link

The Truffle PE deoptimises on an instanceof interface where it doesn't know the class, but doesn't report a performance warning #4030

Open chrisseaton opened 2 years ago

chrisseaton commented 2 years ago

If you do instanceof against an interface with multiple implementations, and the class of the object isn't known, the Truffle PE compiled a deopt runtime constraint. A useful error message is reported at the point of deoptimisation. But no performance warning is printed at the point of compilation, which would be more useful.

Found with @wildmaples.

chrisseaton commented 2 years ago

Adding some more detail.

Not sure what the point of installing this is?

Screenshot 2021-11-23 at 19 59 36

The deopt node has no source info. There's no compilation warnings.

I don't seem to get told when it deoptimises. I do see a deoptimize for #foo but it's ahead of the compilation finishing.

[engine] opt queued id=1398  Object#foo                                         |Tier 2|Count/Thres         62/      625|Queue: Size    0 Change +1  Load  0.06 Time     0us|Timestamp 26826371807428|Src test.rb:1
[engine] opt start  id=1398  Object#foo                                         |Tier 2|Priority        62|Rate 0.000000|Queue: Size    0 Change +0  Load  0.06 Time     0us|Timestamp 26826492098434|Src test.rb:1
Deoptimizing thread 0x00007f931f17c800
DEOPT UNPACKING thread 0x00007f931f17c800 vframeArray 0x00007f931f965220 mode 0
     {method} {0x000000013748a008} 'initializeForMethodStart' '(Ljdk/vm/ci/meta/Assumptions;ZLorg/graalvm/compiler/nodes/graphbuilderconf/GraphBuilderConfiguration$Plugins;)V' in 'org/graalvm/compiler/java/FrameStateBuilder' - invokestatic @ bci 297 sp = 0x0000700001dc77b8
Uncommon trap   bci=0 pc=0x0000000117dae92a, relative_pc=0x000000000000066a, method=org.graalvm.collections.EconomicMapImpl.putHashEntry(Ljava/lang/Object;IZ)V, debug_id=0
Uncommon trap occurred in org.graalvm.collections.EconomicMapImpl::putHashEntry compiler=jvmci compile_id=48054 (@0x0000000117dae92a) thread=38407 reason=null_assert_or_unreached0 action=reinterpret unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f92df09a800 vframeArray 0x00007f92df09d820 mode 3
     {method} {0x00000001024a4cf8} 'putHashEntry' '(Ljava/lang/Object;IZ)V' in 'org/graalvm/collections/EconomicMapImpl' - aload_0 @ bci 0 sp = 0x0000700003306708
Uncommon trap   bci=0 pc=0x0000000117d96b45, relative_pc=0x00000000000001c5, method=org.graalvm.collections.EconomicMapImpl.getHashIndex(Ljava/lang/Object;)I, debug_id=0
Uncommon trap occurred in org.graalvm.collections.EconomicMapImpl::getHashIndex compiler=jvmci compile_id=47839 (@0x0000000117d96b45) thread=38407 reason=null_assert_or_unreached0 action=reinterpret unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f92df09a800 vframeArray 0x00007f92cf09a220 mode 3
     {method} {0x00000001024a4180} 'getHashIndex' '(Ljava/lang/Object;)I' in 'org/graalvm/collections/EconomicMapImpl' - aload_0 @ bci 0 sp = 0x0000700003306690
Uncommon trap   bci=0 pc=0x0000000117ddbf92, relative_pc=0x0000000000000472, method=org.graalvm.collections.EconomicMapImpl.findHash(Ljava/lang/Object;)I, debug_id=0
Uncommon trap occurred in org.graalvm.collections.EconomicMapImpl::findHash compiler=jvmci compile_id=49159 (@0x0000000117ddbf92) thread=38407 reason=null_assert_or_unreached0 action=reinterpret unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f92df09a800 vframeArray 0x00007f92df09d820 mode 3
     {method} {0x00000001024a37d0} 'findHash' '(Ljava/lang/Object;)I' in 'org/graalvm/collections/EconomicMapImpl' - aload_0 @ bci 0 sp = 0x0000700003306790
Uncommon trap   bci=66 pc=0x0000000117ef72eb, relative_pc=0x00000000000003cb, method=java.lang.reflect.Constructor.newInstance([Ljava/lang/Object;)Ljava/lang/Object;, debug_id=0
Uncommon trap occurred in java.lang.reflect.Constructor::newInstance compiler=jvmci compile_id=55311 (@0x0000000117ef72eb) thread=38407 reason=constraint action=none unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f92df09a800 vframeArray 0x00007f9318112c20 mode 3
     {method} {0x00000008002dbb58} 'newInstance' '([Ljava/lang/Object;)Ljava/lang/Object;' in 'java/lang/reflect/Constructor' - aload_2 @ bci 66 sp = 0x00007000033020c8
Uncommon trap   bci=0 pc=0x0000000117cc85b0, relative_pc=0x0000000000000070, method=java.util.PriorityQueue.poll()Ljava/lang/Object;, debug_id=0
Uncommon trap occurred in java.util.PriorityQueue::poll compiler=jvmci compile_id=35591 (@0x0000000117cc85b0) thread=38407 reason=null_check action=reinterpret unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f92df09a800 vframeArray 0x00007f930f0c5620 mode 3
     {method} {0x000000013771c1f8} 'poll' '()Ljava/lang/Object;' in 'java/util/PriorityQueue' - aload_0 @ bci 0 sp = 0x0000700003306c10
Uncommon trap   bci=0 pc=0x000000011813960b, relative_pc=0x000000000000006b, method=org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot([Ljava/lang/Object;)Ljava/lang/Object;, debug_id=0
Uncommon trap occurred in org.graalvm.compiler.truffle.runtime.OptimizedCallTarget::profiledPERoot compiler=jvmci compile_id=66947 (JVMCI: installed code name=Object#foo#2)  (@0x000000011813960b) thread=6147 reason=constraint action=reinterpret unloaded_class_index=-1 debug_id=0
DEOPT UNPACKING thread 0x00007f930f013800 vframeArray 0x00007f92bf009620 mode 3
     {method} {0x00000001027f8100} 'profiledPERoot' '([Ljava/lang/Object;)Ljava/lang/Object;' in 'org/graalvm/compiler/truffle/runtime/OptimizedCallTarget' - aload_1 @ bci 0 sp = 0x000070000119e418
[engine] opt done   id=1398  Object#foo                                         |Tier 2|Time  1507(1418+89  )ms|AST   23|Inlined   0Y   0N|IR      5/     5|CodeSize     120|Timestamp 26828087267076|Src test.rb:1
[engine] opt queued id=1398  Object#foo                                         |Tier 2|Count/Thres      18119/      625|Queue: Size    0 Change +1  Load  0.06 Time     0us|Timestamp 26828101387801|Src test.rb:1
[engine] opt start  id=1398  Object#foo                                         |Tier 2|Priority     18119|Rate 0.000000|Queue: Size    0 Change +0  Load  0.06 Time     0us|Timestamp 26828103991532|Src test.rb:1
[engine] opt done   id=1398  Object#foo                                         |Tier 2|Time   273( 251+22  )ms|AST   23|Inlined   0Y   0N|IR     44/    98|CodeSize     428|Timestamp 26828378459766|Src test.rb:1

I also don't get why it's installed twice? In general it's like it's compiling with a never-loaded class, but how does it get to compiling something with a class that isn't loaded yet?

[engine] opt done   id=1398  Object#foo                                         |Tier 2|Time  3084(2791+292 )ms|AST   23|Inlined   0Y   0N|IR      5/     5|CodeSize     120|Timestamp 26387203376698|Src test.rb:1
[engine] opt done   id=1398  Object#foo                                         |Tier 2|Time   637( 259+377 )ms|AST   23|Inlined   0Y   0N|IR     44/    98|CodeSize     428|Timestamp 26388175302458|Src test.rb:1

(To reproduce, build TruffleRuby 449289e076a28fa26319f0a421bb4f903e1efa36 then 3a78df6a996305a84fd8c4af83895931aeb34779, then run this program. TruffleRuby is broken on this commit, but I think that doesn't matter to what Truffle decides to do with the code it gets.)

def foo(a, b:)
  a + b
end

loop do
  foo(rand(100), b: rand(100))
end
chumer commented 2 years ago

Using instanceof with interfaces is generally discouraged for Truffle. It was one of the reasons why we introduced Truffle Libraries in the first place.

Seeing the deopt for a virtual interface type check is unexpected. My guess is that the deopt comes from a different location of the ruby implementation.

Not sure what the point of installing this is?

None really, we just do not special case it yet. However if that happens reliably it typically means there is a bug in the language implementation, so I didn't consider this worth fixing so far.

chrisseaton commented 2 years ago

Using instanceof with interfaces is generally discouraged for Truffle.

This was possibly a red herring. I couldn't reproduce with a simple instanceof when I tried - hence the convoluted reproduction instructions.

The deopt is attributed to the Truffle-compiled code, which doesn't make sense to me of how you could PE with unloaded classes.

Maybe I'll sit on this until I can reproduce more reasonably and then show you.

chumer commented 2 years ago

If you can reproduce this somehow with a simple RootNode that is certainly appreciated and saves me time as I need to do that anyway for the regression test.

eregon commented 2 years ago

Did you enable performance warnings? They are not shown by default. "virtual instanceof" like that should be reported as performance warnings. --engine.TracePerformanceWarnings=all is the way to report them, and see --check-compilation in jt if you want to abort if there is a performance warning.

eregon commented 2 years ago

Also to trace transfers/deopts, the best is to do it on SVM as documented in https://github.com/oracle/graal/blob/master/truffle/docs/Optimizing.md#debugging-deoptimizations. That shall give a clear stacktrace of where the deopt happens.

chumer commented 2 years ago

Did you see this issue native or JVM? On JVM I cannot reproduce.

[engine] opt done   id=1377  Object#foo                                         |Tier 1|Time   192( 188+5   )ms|AST   23|Inlined   0Y   0N|IR     64/    87|CodeSize     466|Timestamp 302509018474717|Src test.rb:1
...
[engine] opt done   id=1377  Object#foo                                         |Tier 2|Time   120( 113+6   )ms|AST   23|Inlined   0Y   0N|IR     44/    98|CodeSize     422|Timestamp 302509694101263|Src test.rb:1

As expected two compiles, once with first tier, once with second tier. I used latest graal from master, CE.

chrisseaton commented 2 years ago

I can only reproduce with this

(To reproduce, build TruffleRuby 449289e076a28fa26319f0a421bb4f903e1efa36 then 3a78df6a996305a84fd8c4af83895931aeb34779, then run this program. TruffleRuby is broken on this commit, but I think that doesn't matter to what Truffle decides to do with the code it gets.)

I'll try to give a better reproducer.

eregon commented 2 years ago

I tried to repro and I get many:

[engine] opt failed id=51    Module#alias_method                                |Tier 1|Time   257( 248+9   )ms|Reason: org.graalvm.compiler.code.SourceStackTraceBailoutException$1: Object of type Lorg/graalvm/compiler/truffle/runtime/FrameWithoutBoxing; should not be materialized (must not let virtual object escape at node 117|EndNode):|Timestamp 21288849648754|Src (core):1

So compilation fails because a VirtualFrame is leaked, that's expected to fail.

chrisseaton commented 2 years ago

Yes run with --engine.CompileOnly=foo. As I say TruffleRuby is broken on this commit - the point is about how it handles that broken code.

eregon commented 2 years ago

With that I get:

[engine] opt done   id=1378  Object#foo                                         |Tier 1|Time   134( 121+13  )ms|AST   23|Inlined   0Y   0N|IR     64/    87|CodeSize     457|Timestamp 21608410784615|Src deopt.rb:1
[engine] opt done   id=1378  Object#foo                                         |Tier 1|Time    66(  50+16  )ms|AST   23|Inlined   0Y   0N|IR    115/   194|CodeSize     746|Timestamp 21608481123841|Src deopt.rb:1
[engine] opt done   id=1378  Object#foo                                         |Tier 2|Time    48(  43+6   )ms|AST   23|Inlined   0Y   0N|IR     44/    98|CodeSize     391|Timestamp 21608531241153|Src deopt.rb:1

So two Tier 1 compilations. I'll check on native to see if that's easier to pinpoint the deopt.

eregon commented 2 years ago

On native (needs a older mx to build, I used 5.316.1), with the imported graal (6178bf1efa51876102fa3873f24ed965344a7208):

$ jt -u native ruby --trace --engine.CompileOnly=foo --engine.TraceTransferToInterpreter --engine.TraceDeoptimizeFrame --fg deopt.rb
[engine] opt done   id=1360  Object#foo                                         |Tier 1|Time     7(   6+1   )ms|AST   23|Inlined   0Y   0N|IR     64/   112|CodeSize     514|Timestamp 23422692829122|Src deopt.rb:1
[Deoptimization initiated
    name: Object#foo (SubstrateCompilation-3727)
    sp: 0x7ffd9d8b4770  ip: 0x7f2853e53201
    reason: RuntimeConstraint  action: InvalidateReprofile
    debugId: 6  speculation: com.oracle.svm.graal.isolated.EncodedSpeculationReason@e1782
    stack trace that triggered deoptimization:
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.firstTierCall(OptimizedCallTarget.java:640)
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:620)
[Deoptimization of frame
    name: Object#foo (SubstrateCompilation-3727)
    sp: 0x00007ffd9d8b4770  ip: 0x00007f2853e53201
    stack trace where execution continues:
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.isSubmittedForCompilation(OptimizedCallTarget.java:787) bci 4  return address 0x0000000003d1ae7b
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.firstTierCall(OptimizedCallTarget.java:638) bci 54 duringCall  return address 0x0000000003d1a911
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:620) bci 9 duringCall  return address 0x0000000003d1b64a
]
[engine] opt inval. id=1360  Object#foo                                                                                                                                     |Timestamp 23422693226508|Src deopt.rb:1|Reason null
]
[engine] opt done   id=1360  Object#foo                                         |Tier 1|Time     6(   3+2   )ms|AST   23|Inlined   0Y   0N|IR    113/   219|CodeSize     817|Timestamp 23422699106794|Src deopt.rb:1
[engine] opt done   id=1360  Object#foo                                         |Tier 2|Time     4(   3+1   )ms|AST   23|Inlined   0Y   0N|IR     42/   123|CodeSize     470|Timestamp 23422748710528|Src deopt.rb:1
    stack trace that triggered deoptimization:
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.firstTierCall(OptimizedCallTarget.java:640)
        at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:620)

That's still a weird stacktrace though.

eregon commented 2 years ago

With a debug JDK, I see

$ jt -u jvm-ce ruby --trace --engine.CompileOnly=foo --engine.TraceTransferToInterpreter --engine.TraceDeoptimizeFrame --vm.XX:+UnlockDiagnosticVMOptions --vm.XX:+DebugNonSafepoints --vm.XX:+TraceDeoptimization --fg --igv-full deopt.rb
...
Uncommon trap   bci=4 pc=0x00007f4e82ba051c, relative_pc=0x000000000000015c, method=org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.isSubmittedForCompilation()Z, debug_id=2114
Uncommon trap occurred in org.graalvm.compiler.truffle.runtime.OptimizedCallTarget::profiledPERoot compiler=jvmci compile_id=39743 (JVMCI: installed code name=Object#foo#1)  (@0x00007f4e82ba051c) thread=176170 reason=constraint action=reinterpret unloaded_class_index=-1 debug_id=2114
DEOPT UNPACKING thread 0x00007f4e94033800 vframeArray 0x00007f4e951dc490 mode 3
     {method} {0x00007f4e5d01db60} 'profiledPERoot' '([Ljava/lang/Object;)Ljava/lang/Object;' in 'org/graalvm/compiler/truffle/runtime/OptimizedCallTarget' - invokevirtual @ bci 9 sp = 0x00007f4e9a1ecca0
     {method} {0x00007f4e5d01dca0} 'firstTierCall' '()Z' in 'org/graalvm/compiler/truffle/runtime/OptimizedCallTarget' - invokevirtual @ bci 54 sp = 0x00007f4e9a1ecc40
     {method} {0x00007f4e5d01ec00} 'isSubmittedForCompilation' '()Z' in 'org/graalvm/compiler/truffle/runtime/OptimizedCallTarget' - ifnull @ bci 4 sp = 0x00007f4e9a1ecbe0

And looking for that node in IGV it's a Guard of a <, with nodeSourcePosition:

Java:
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.firstTierCall (OptimizedCallTarget.java:640) [bci:82]
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot (OptimizedCallTarget.java:620) [bci:9]

code: && firstTierLoopCallCount >= scaledThreshold(engine.callAndLoopThresholdInFirstTier)) {

But looking at the IGV graph further, it seems clear, the path which doesn't go to lastTierCompile() ends up in a deopt, and so is pruned as dead code, transforming the compilation threshold into a guard+deopt. And when the threshold is hit then that guard no longer holds. Screenshot from 2021-11-25 18-00-32

Java:
    org.truffleruby.language.arguments.ReadDescriptorArgumentNodeGen.execute (ReadDescriptorArgumentNodeGen.java:78) [bci:85]
    org.truffleruby.language.RubyNode.doExecuteVoid (RubyNode.java:63) [bci:2]
    org.truffleruby.language.control.SequenceNode.execute (SequenceNode.java:33) [bci:20]
    org.truffleruby.language.RubyMethodRootNode.execute (RubyMethodRootNode.java:56) [bci:9]
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode (OptimizedCallTarget.java:653) [bci:5]
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot (OptimizedCallTarget.java:625) [bci:40]

ruby:
    (deopt.rb:1)

So it's annoying that we don't get a better stacktrace, but OTOH it seems fairly obvious what happens with the IGV graph. The compiler needs to merge transfer+invalidates deopts for optimizations, and so I guess it cannot always report the original stacktrace. The bug is in having that unconditional deopt in the first place. likely due to a bug in the node.

chrisseaton commented 2 years ago

Do you think we should also report unconditional depots as a performance warning? And not install them?

eregon commented 2 years ago

With --vm.Dgraal.Dump=Truffle:4 it's possible to dig further, notably I see the if (value != ReadDescriptorArgumentNode.MISSING) { can be proven to be always true, because value is typed as a non-null Integer. And the other branch is a call to argumentErrorMissingKeyword().

    org.truffleruby.language.arguments.ReadDescriptorArgumentNode$CheckKeywordArgumentNode.execute (ReadDescriptorArgumentNode.java:947) [bci:15]
    org.truffleruby.language.arguments.ReadDescriptorArgumentNode.unloadDefaultArgs (ReadDescriptorArgumentNode.java:323) [bci:19]
    org.truffleruby.language.arguments.ReadDescriptorArgumentNode.cached (ReadDescriptorArgumentNode.java:406) [bci:192]
    org.truffleruby.language.arguments.ReadDescriptorArgumentNodeGen.execute (ReadDescriptorArgumentNodeGen.java:79) [bci:102]
    org.truffleruby.language.RubyNode.doExecuteVoid (RubyNode.java:63) [bci:2]
    org.truffleruby.language.control.SequenceNode.execute (SequenceNode.java:33) [bci:20]
    org.truffleruby.language.RubyMethodRootNode.execute (RubyMethodRootNode.java:56) [bci:9]
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode (OptimizedCallTarget.java:653) [bci:5]
    org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot (OptimizedCallTarget.java:625) [bci:40]

Maybe we could report a compilation of a method that doesn't return and doesn't unwind, i.e. it can only terminate in a deopt. That case however might be valid e.g., for OSR compilation, which can typically not compile what's after the loop without a deopt since it might never have executed it yet.