Open 0xdaryl opened 5 years ago
nestmates
This should be handled by the VM's resolve helpers. I'm not aware of any JIT surface to this feature.
VarHandles
New in Java 9, similar to JSR 292's MethodHandles but for field access under different modes.
JNI Dispatch
In addition to DirectToJNI, there's also the recentish work on atomic free vmaccess.
VarHandles are just like MethodHandles by the time you get to codegen. Nestmates are only seen as part of the resolve paths and not really for perf.
No special support needed for JProfiling - it is all done in trees.
Constant dynamic is transparent in codegen
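For readers less familiar with the "different modes" of field access that VarHandles expose, here is a minimal Java sketch. The class name VarHandleModes and its counter field are made up for illustration; only the java.lang.invoke calls are real API. It says nothing about how the JIT lowers each mode, only what the modes look like at the Java level.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical demo class: shows the VarHandle access modes on one int field.
class VarHandleModes {
    private int counter; // accessed only through the VarHandle below

    private static final VarHandle COUNTER;
    static {
        try {
            COUNTER = MethodHandles.lookup()
                    .findVarHandle(VarHandleModes.class, "counter", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int demo() {
        VarHandleModes obj = new VarHandleModes();
        COUNTER.set(obj, 1);                                // plain write
        COUNTER.setOpaque(obj, 2);                          // opaque write
        COUNTER.setRelease(obj, 3);                         // release write
        int seen = (int) COUNTER.getAcquire(obj);           // acquire read, sees 3
        COUNTER.setVolatile(obj, seen + 1);                 // volatile write of 4
        boolean swapped = COUNTER.compareAndSet(obj, 4, 5); // CAS 4 -> 5
        int before = (int) COUNTER.getAndAdd(obj, 10);      // returns 5, stores 15
        return swapped && before == 5 ? (int) COUNTER.getVolatile(obj) : -1;
    }
}
```

Each call is a signature-polymorphic invocation, much like MethodHandle.invokeExact, which fits the remark that VarHandles look like MethodHandles by the time they reach codegen.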
AOT / SVM?
Validation (SVM): common code
Codegen: This is where the majority of the work would need to be done. The addMetaDataForCode* method of the various instructions would handle adding the necessary external relocations. However, those aren't the only locations; the various snippets will also have to make sure they add the necessary external relocations. I'm sure there are lots of other places as well, which could be gleaned by looking at the existing codegens. The codegen would also need to know when to generate different instructions under AOT (for example, if an address isn't guaranteed to fit in 32 bits, even though it happens to do so during compilation). At the moment there would be the need for yet another J9AheadOfCompile.cpp file; however, once the consolidation work is done (https://github.com/eclipse/openj9/issues/4803) that won't be necessary.
Relocation: TR_RelocationTarget would need to be extended to handle circumstances like POWER, where there are different ways a pointer can be patched (i.e., different ways to patch a 5-instruction sequence).
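The point about 32-bit addresses under AOT can be sketched as a small decision function. This is only a model written in Java, not OpenJ9 code: the class AddressEncoding, both method names, and the compilingForAOT flag are hypothetical, and the real codegens make this decision in their own platform-specific ways.

```java
// Hypothetical model of an encoding decision; not taken from the OpenJ9 sources.
class AddressEncoding {
    /** True when the value survives a round trip through a signed 32-bit int. */
    static boolean fitsInSigned32(long address) {
        return address == (int) address;
    }

    /**
     * Under AOT the address seen at compile time may differ when the code is
     * loaded, so the short form is refused even if the value happens to fit.
     */
    static boolean canUse32BitImmediate(long address, boolean compilingForAOT) {
        if (compilingForAOT) {
            return false; // assumption: always emit the longer, relocatable form
        }
        return fitsInSigned32(address);
    }
}
```

The same address therefore gives different answers depending on the compilation mode, which is why the instruction selection has to be AOT-aware rather than driven by the value alone.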
Any other considerations @mstoodle ?
For other SVM changes, also see https://github.com/eclipse-openj9/openj9/pull/15121#issuecomment-1138654246
Complete atomicCompareAndSwapReturn support on Z, Power, and AArch64: eclipse/omr#3759
Field watch: AArch64 (#8038), AArch32 (#8040)
DLT for AArch64. #5917 tracks the work to enable. Disabled in AArch64 by #5919.
OSR for AArch64 #5921
Lock reservation: AArch64 (#12097) Lock reservation optimization: #2344
CompactLocals (ability to map compacted stack in some linkage) : #5910
JITaaS relocations
Quad recognized methods (hopefully not required for long!)
Interface PICs (e.g., #6325)
Hardware transactional memory support. Enable CodeGenerator getSupportsTM(). Implement evaluators for tstart, tfinish, tabort.
LoadExtensions codegen optimization [1].
I'd like to clarify a few points about nestmates here.
nestmates
This should be handled by the VM's resolve helpers. I'm not aware of any JIT surface to this feature.
The JIT interaction is that with nestmates, invokevirtual and invokeinterface no longer necessarily do virtual and interface dispatch (respectively). They might instead call a private method directly, and because private methods are not in the vtable, direct dispatch is the only possible implementation. The JIT compiler recognizes this situation in the resolved case during both IL generation and inlining so that it can treat it as a direct call as required. In the unresolved case, the resolution path has to detect this situation at runtime.
Nestmates are only seen as part of the resolve paths and not really for perf.
This is true, but I want to emphasize that code generator support is required for nestmates to work correctly. The runtime resolution path must be capable of carrying out a direct dispatch when indicated by the VM. The current design for doing so requires a pointer to the "virtual" J2I thunk in the PIC data, which needs a relocation for AOT. In the case of invokeinterface, we also need to do a type check.
(It was technically possible for invokeinterface to require direct dispatch before nestmates for final methods of Object, which are also not kept in the vtable. However, calling those through invokeinterface is unusual bytecode, and the JIT compiler would simply refuse to compile methods containing such calls. Only with nestmates did it become important to conditionally do direct dispatch based on the result of resolution at runtime.)
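A small Java example of the source-level situation described above may help. The class names NestHost and Member are invented for illustration. The assumption here is that javac targeting Java 11 or later compiles the call to secret() as an invokevirtual on a private method rather than going through a synthetic accessor; the exact bytecode depends on the compiler and target level.

```java
// Hypothetical classes; requires Java 11+ for nest-based access control.
class NestHost {
    // Private, so it has no vtable slot: it can only be dispatched directly.
    private int secret() {
        return 42;
    }

    static class Member {
        int callPrivate(NestHost host) {
            // A nestmate calling a private method of its host. The call site
            // is expected to be an invokevirtual that resolves to a private
            // method, which is the case the resolve path has to detect.
            return host.secret();
        }
    }

    static boolean sameNest() {
        return NestHost.class.isNestmateOf(Member.class);
    }
}
```

Running javap -c on Member should show which invoke bytecode was actually emitted for a given compiler.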
Internal pointers. AArch64 (#6367), ARM (#6368)
Arraycopy transformations from value propagation (TR::arraycopy and TR::ArrayCHK node support). Disabled via TR_DisableArrayCopyOpts.
AArch64 (#12122)
Inline dynamic cast class evaluation for checkcast : #5291
Inlining support of MultiANewArray for 2 dimensional arrays: x86 (#2408), P (#2424), Z (#11088), AArch64 (#12367).
CodeGenerator SupportsProfiledInlining. Currently enabled for X, P, Z. AArch64 (#6451) and ARM32 (#6452).
CodeGenerator SupportsAutoSIMD. Currently enabled on X, P, Z. AArch64 (#6453) and ARM32 (#6454).
Please note that autoSIMD is supported only to the degree that vector opcodes are implemented in a particular codegen. The optimizer attempts to vectorize a loop and then asks the codegen whether a particular opcode has a vector version (there is a codegen method for that, which takes the opcode as a parameter). I am pretty sure only X, Z, and P have some number of opcodes that are supported.
Exception Directed Optimization (EDO)
CodeGenerator support for GlRegDeps:
setSupportsGlRegDeps();
setSupportsGlRegDepOnFirstBlock();
And by extension, global register allocation.
For AArch64 (#6606).
Method recompilation
Support atomic free JNI. #2576, AArch64 (#6608), ARM32 (#6609)
By extension, enable directToJNI in the JIT.
A code generator must call setSupportsDivCheck() if it provides an implementation for the DIVCHK IL opcode. This is needed before Walker will create trees for integer division and remainder operations. Otherwise, the compilation will fail with an unimplemented opcode for any division or remainder operation.
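As a reminder of the Java-level behaviour that compiled division and remainder must preserve, here is a minimal sketch. The class DivisionSemantics is invented for illustration, and the link to DIVCHK is my assumption that the opcode models the zero-divisor check; the results themselves follow from the Java language rules for integer / and %.

```java
// Hypothetical demo class showing Java integer division semantics.
class DivisionSemantics {
    static int div(int a, int b) {
        return a / b; // throws ArithmeticException when b == 0
    }

    static int rem(int a, int b) {
        return a % b; // throws ArithmeticException when b == 0
    }

    static boolean divThrowsOnZero(int dividend) {
        try {
            div(dividend, 0);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    static boolean remThrowsOnZero(int dividend) {
        try {
            rem(dividend, 0);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }
}
```

Note that Integer.MIN_VALUE / -1 does not throw: it overflows to Integer.MIN_VALUE, and the matching remainder is 0. A backend whose divide instruction traps on that input needs to handle it separately from the zero check.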
Enable linkage register allocation. Reverse TR_DisableLinkageRegisterAllocation. AArch64 (#6657).
Byte reversal recognized methods
String compression.
Implement fast sun_nio_ch_NativeThread_current (#7131)
Implement fast Unsafe compareAndSwap recognized methods: AArch64 (#7132), AArch32 (#7133)
Implement fast versions of java_util_concurrent_atomic_Fences methods: AArch64 (#7134), AArch32 (#7135)
Implement inlined version of sun_misc_Unsafe_copyMemory. #7136
Implement recognized java_nio_Bits_keepAlive and java_lang_ref_Reference_reachabilityFence. AArch64 (#7137), AArch32 (#7138)
Provide inlined versions of recognized rotate methods. AArch64/32: #7139
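For the byte reversal and rotate items, any inlined sequence has to produce the same results as the library methods. The sketch below spells those results out with shift-and-mask reference versions. The class RecognizedBitOps and its method names are hypothetical; Integer.reverseBytes and Integer.rotateLeft are the real library methods a backend would typically map to a single instruction where one exists.

```java
// Hypothetical reference implementations for comparison with the library methods.
class RecognizedBitOps {
    /** Swaps the four bytes of v, e.g. 0x11223344 becomes 0x44332211. */
    static int reverseBytesReference(int v) {
        return (v >>> 24)
                | ((v >>> 8) & 0xFF00)
                | ((v << 8) & 0xFF0000)
                | (v << 24);
    }

    /** Rotates v left; Java masks shift distances for int to the low 5 bits. */
    static int rotateLeftReference(int v, int distance) {
        return (v << distance) | (v >>> -distance);
    }
}
```

Checking an inlined version against references like these over a range of inputs, including distances of 0 and 32 for the rotate, is a cheap way to catch encoding mistakes.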
Codegen Support for atomic method symbols: eclipse/omr#2958.
This should be the strategic means of recognizing, for example, the java/util/concurrent Atomic operations.
CodeGenerator inlining of j/u/c/atomic_Atomic* methods: AArch64 (#12261)
CodeGenerator inlining of j/u/c/atomic_Fences* methods: AArch64 (#12263) (deprecated)
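The java/util/concurrent Atomic operations referred to above look like this at the Java level. The class AtomicCandidates is invented for illustration; the AtomicInteger calls are standard library API. Which of them a given backend actually inlines is not something this sketch claims.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: calls of the kind a backend may recognize and inline.
class AtomicCandidates {
    static int demo() {
        AtomicInteger value = new AtomicInteger(10);
        boolean swapped = value.compareAndSet(10, 11); // succeeds, value is 11
        boolean stale = value.compareAndSet(10, 99);   // fails, value stays 11
        int previous = value.getAndAdd(5);             // returns 11, value is 16
        int current = value.incrementAndGet();         // value is 17
        return (swapped && !stale && previous == 11) ? current : -1;
    }
}
```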
Allow processor exploitation of abs methods. AArch64/32 #7162
Method entry alignment. AArch64/32 OMR eclipse/omr#4377.
Support DynamicANewArray (#6441)
Choose thresholds for arrayTranslate / arrayTranslateAndTest. AArch64 eclipse/omr#4446
[Nestmates] Handle private invoke for jitted unresolved invokevirtual/interface. AArch64 #7462, ARM32 #7463
Enable class redefinition and flush compilation queue hooks: AArch64 #7470
Enable prefetchInsertion optimization. Requires backend support for prefetch instructions. AArch64 eclipse/omr#4494, ARM32 eclipse/omr#4495
Enable support for compressed references.
Implement lock reservation. AArch64 (#8032), AArch32 (#8033)
Implement support for balanced GC: AArch64 (#8034), AArch32 (#8035)
Implicit NULLCHKs: AArch64 (#8036)
I'm compiling a list of features that a new OpenJ9 JIT backend would need to implement to be reasonably "feature complete" with the others. These features are typically performance features above and beyond a basic, functional OpenJ9 JIT.
This could also serve a secondary purpose as a cross checklist for existing platforms to determine if they are missing any opportunities.
At the moment, I am particularly interested in the work that has been done to support features in Java 9, 11, 12, ... to be sure we don't miss any of that work in the AArch64 implementation.
This is a bit of a brain dump, but I hope to provide some structure to it once everyone has provided their input. I would appreciate it if those that are familiar with a particular backend could review this list and add anything you think is relevant. You don't necessarily have to go into great detail here: it will either be enough for me to track down the feature myself, or I can ask you about it.
FYI @andrewcraik @fjeremic @gita-omr, but input from anyone is welcome.
software concurrent scavenge
constant dynamic
nestmates
read barriers
write barriers
field watch
JNI dispatch
lock reservation
recompilation
on-stack replacement? (@andrewcraik)
AOT / SVM? (@dsouzai)
per code-cache helpers
JSR292?
DLT?
JProfiling?
J9-specific IL opcodes, including:
Platform-specific inlining
inlined helpers
implicit NULLCHK (via signal handler)
implicit DIVCHK (via signal handler)
transactional memory (tstart/tfinish/tabort/tcommit)
what is "asyncCheckGCMapPatching" @0dvictor ?