Closed pshipton closed 9 months ago
@babsingh
Also occurred on jdk11
https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk11_j9_sanity.openjdk_ppc64le_linux_ojdk292_Personal_testList_2/1/ - jdk_lang_j9_0
https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk11_j9_sanity.openjdk_x86-64_mac_ojdk292_Personal_testList_2/1/ - jdk_lang_j9_0
I'll look into this.
I've reproduced locally on a ppc64le_linux
machine and profiled using Linux perf
found some peculiar results. Below are the flamegraphs for the (1) ojdk292
build; and (2) build with OpenJ9 MHs. (here is a box link: https://ibm.box.com/s/iqcoac0wxlmh7a0yshzzd74e339jdg1n with the flamegraph svg
s, if you want to interact with the figures, and the corresponding perf.data
files.) The test relies on random number generation so I've run both builds with the same seed.
(1) Runtime ~26 minutes, fails with Not enough time to continue execution. Interrupted.
https://github.com/ThanHenderson/openj9-openjdk-jdk8/blob/9a702f6d35a13affd72c1b996ca991ae40dac100/test/lib/jdk/test/lib/TimeLimitedRunner.java#L77-L78
(2) Runtime ~ 3 minutes, passes with test limit exceeded
https://github.com/ThanHenderson/openj9-openjdk-jdk8/blob/9a702f6d35a13affd72c1b996ca991ae40dac100/jdk/test/java/lang/invoke/MethodHandles/CatchExceptionTest.java#L264
I am also building a JDK17 executable to see what type of behaviour we have on this test with OpenJDK MHs in JDK17.
fyi @JasonFengJ9 @babsingh
It looks like the JIT isn't optimizing in the case of OJDK MHs.
perf diff ojdk-mh_perf.data oj9-mh_perf.data
?-Xint
(JIT disabled)?Do OJ9 MHs take the same time as OJDK MHs with -Xint (JIT disabled)?
Roughly; still a bit faster I think due to the custom MH interpreter. But it also fails with Not enough time to continue execution. Interrupted.
Can we easily identify the missing JIT optimizations using perf diff ojdk-mh_perf.data oj9-mh_perf.data?
Here is the diff: https://ibm.ent.box.com/file/1421730760649. A naive grep
shows:
❯ rg -i methodhandle perf-diff.log | rg libj9jit
+0.02% libj9jit29.so [.] TR_J9VMBase::lookupMethodHandleThunkArchetype
+0.02% libj9jit29.so [.] compileMethodHandleThunk
+0.01% libj9jit29.so [.] TR_MethodHandleTransformer::perform
+0.01% libj9jit29.so [.] J9::MethodHandleThunkDetails::isMethodHandleThunk
0.00% +0.01% libj9jit29.so [.] TR_MethodHandleTransformer::~TR_MethodHandleTransformer
and
❯ rg -i mh perf-diff.log | rg libj9jit
+0.13% libj9jit29.so [.] TR_MHJ2IThunkTable::getTerseSignature
+0.03% libj9jit29.so [.] TR_MHJ2IThunkTable::Node::get
+0.03% libj9jit29.so [.] TR_MHJ2IThunkTable::terseTypeChar
+0.01% libj9jit29.so [.] TR_MHJ2IThunkTable::getThunk
+0.01% libj9jit29.so [.] OMR::LocalCSE::removeFromHashTable
+0.01% libj9jit29.so [.] TR_MHJ2IThunkTable::findThunk
which isn't surprising because the compileMethodHandleThunk
path is only taking via a call to translateMethodHandle
in the OJ9 MH specific j2iInvokeExact
path:
https://github.com/eclipse-openj9/openj9/blob/b669217a6bfffc51e01fd48d364af4bf120e3c21/runtime/vm/BytecodeInterpreter.hpp#L1039-L1046
Here are the distinct optimizations done with the OJ9 MH build:
❯ rg -i '^\s*\+.*libj9jit' perf-diff.log | wc -l
642
@ThanHenderson Thanks for the analysis. Adding the JIT label since the JIT optimizations don't occur with OJDK MHs enabled. fyi @hzongaro @0xdaryl @vijaysun-omr @jdmpapin @nbhuiyan
@babsingh I am profiling JDK17 now. I will provide updates on that too and see if I can glean anything more.
The ojdk292
flame graph shows that almost the entire time was spent in internalDefineClass()
, and most of that in compareROMClassForEquality()
/writeROMClass()
. I don't think the JIT can speed that up. In particular, I doubt that it runs any Java code. You can also see at the left that the JIT did run (functions starting with "TR", "OMR"). It just took a much smaller proportion of the time, which is understandable given that so much time was spent in the VM's class definition code
@jdmpapin That's right. I think it is on our end still, I'm continuing to investigate.
The ojdk292 flame graph shows that almost the entire time was spent in internalDefineClass(), and most of that in compareROMClassForEquality()/writeROMClass()
Does the issue goes away if you run with option -XX:-ShareAnonymousClasses
and/or -XX:-ShareUnsafeClasses
?
There is a trace point Trc_BCU_compareROMClassForEquality_event
that can be triggered to see what are the classes being compared.
@hangshao0 I think I've gotten it.
The following snippet applies to JDK17 OJDK MHs, but not to 8 or 11 due to 8 and 11's use of defineAnonymousClass
wherein the classname passed in here is NULL
.
https://github.com/eclipse-openj9/openj9/blob/381942ba3c01f837acbb42120e24be29411eaae3/runtime/jcl/common/jcldefine.c#L118-L132
Instead of performing this check and setting the J9_FINDCLASS_FLAG_DO_NOT_SHARE
here, I am pushing it to handleAnonClassName
in ROMClassBuilder
because at that point we have access to a non-NULL
class name for defineAnonymousClass
-created classes.
Internal build jdk_lang_j9_0 java/lang/invoke/MethodHandles/CatchExceptionTest.java