eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

getAndSet not inlined for Java.util.concurrency.atomic.Atomiclong and AtomicLongArray #20378

Open matthewhall2 opened 1 week ago

matthewhall2 commented 1 week ago

On Z in J9CodeGenerator::inlineDirectCall, the atomics for getAndSet for AtomicLong and AtomicLongArray are not being inlined, even though they are being inlined for the integer versions (AtomicInteger and AtomicIntegerArray).

getAndSet for Long types appears to be covered in Z::J9TreeEvaluator::inlineAtomicOps, and are included J9RecognizedMethods.enum. It looks like it may have just been missed in J9CodeGenerator::inlineDirectCall

@Spencer-Comin

github-actions[bot] commented 1 week ago

Issue Number: 20378 Status: Open Recommended Components: comp:vm, comp:gc, comp:test Recommended Assignees: keithc-ca, hangshao0, gacholio

Spencer-Comin commented 2 days ago

The atomic operations in java.util.concurrent.atomic.Atomic* classes are not marked as intrinsics. The JCL implementations call out to the atomic operations in Unsafe, which are marked as intrinsics.

Just poked through the code and ran some tests to see what we are doing:

============================================================
; Live regs: GPR=2 FPR=0 VRF=0 {&GPR_0033, GPR_0032}
------------------------------
 n6n      (  0)  NULLCHK on n41n [#32]                                                                [     0x3ff7da847d0] bci=[-1,2,6] rc=0 vc=172 vn=- li=2 udi=- nc=1
 n5n      (  2)    icall  java/util/concurrent/atomic/AtomicInteger.getAndSet(I)I[#403  final virtual Method -136] [flags 0x20500 0x0 ]  [     0x3ff7da84780] bci=[-1,2,6] rc=2 vc=172 vn=- li=2 udi=- nc=2
 n41n     (  1)      ==>aRegLoad (in &GPR_0033) (SeenRealReference )
 n40n     (  1)      ==>iRegLoad (in GPR_0032) (cannotOverflow SeenRealReference )
------------------------------
Instruction 000003FF7DB52080 throws an implicit NPE, node: 000003FF7DA84780 NPE node: 000003FF7DA852C0
------------------------------
 n6n      (  0)  NULLCHK on n41n [#32]                                                                [     0x3ff7da847d0] bci=[-1,2,6] rc=0 vc=172 vn=- li=2 udi=- nc=1
 n5n      (  1)    icall  java/util/concurrent/atomic/AtomicInteger.getAndSet(I)I[#403  final virtual Method -136] [flags 0x20500 0x0 ] (in GPR_0034)  [     0x3ff7da84780] bci=[-1,2,6] rc=1 vc=172 vn=- li=2 udi=- nc=2
 n41n     (  0)      ==>aRegLoad (in &GPR_0033) (X!=0 SeenRealReference )
 n40n     (  0)      ==>iRegLoad (in GPR_0032) (cannotOverflow SeenRealReference )
------------------------------

 [     0x3ff7db51ce0]                          CGIT    &GPR_0033,0,BH(mask=0x8),
 [     0x3ff7db52080]                          LG      GPR_0034,#405 8(&GPR_0033)
 [     0x3ff7dc00000]                          Label L0033:     # (Start of internal control flow)
 [     0x3ff7dc001a0]                          CS      GPR_0034,GPR_0032,#406 8(&GPR_0033)
 [     0x3ff7dc00280]                          BRC     BM(0x4), Label L0033
 [     0x3ff7dc00920]                          assocreg
 [     0x3ff7dc00360]                          Label L0032:     # (End of internal control flow)
 POST:
 {AssignAny:&GPR_0033:R} {AssignAny:GPR_0034:R} {AssignAny:GPR_0032:R}
============================================================
; Live regs: GPR=2 FPR=0 VRF=0 {&GPR_0033, GPR_0032}
------------------------------
 n6n      (  0)  NULLCHK on n41n [#32]                                                                [     0x3ff58e047d0] bci=[-1,2,6] rc=0 vc=172 vn=- li=2 udi=- nc=1
 n5n      (  2)    lcall  java/util/concurrent/atomic/AtomicLong.getAndSet(J)J[#403  final virtual Method -136] [flags 0x20500 0x0 ]  [     0x3ff58e04780] bci=[-1,2,6] rc=2 vc=172 vn=- li=2 udi=- nc=2
 n41n     (  1)      ==>aRegLoad (in &GPR_0033) (SeenRealReference )
 n40n     (  1)      ==>lRegLoad (in GPR_0032) (cannotOverflow SeenRealReference )
------------------------------

buildDirectDispatch
Build Direct Call

GC Point at 000003FF58F80BD0 has preserved register map 1fc0
------------------------------
 n6n      (  0)  NULLCHK on n41n [#32]                                                                [     0x3ff58e047d0] bci=[-1,2,6] rc=0 vc=172 vn=- li=2 udi=- nc=1
 n5n      (  1)    lcall  java/util/concurrent/atomic/AtomicLong.getAndSet(J)J[#403  final virtual Method -136] [flags 0x20500 0x0 ] (in GPR_0034)  [     0x3ff58e04780] bci=[-1,2,6] rc=1 vc=172 vn=- li=2 udi=- nc=2
 n41n     (  0)      ==>aRegLoad (in &GPR_0033) (X!=0 SeenRealReference )
 n40n     (  0)      ==>lRegLoad (in GPR_0032) (cannotOverflow SeenRealReference )
------------------------------

 [     0x3ff58ed1ce0]                          CGIT    &GPR_0033,0,BH(mask=0x8),
 [     0x3ff58f811c0]                          assocreg
  PRE:
 {GPR1:&GPR_0033:R} {GPR2:GPR_0032:R}
 [     0x3ff58f80bd0]                          BRASL   GPR_0039, 0x000003FF58F80B60
 POST:
 {GPR2:GPR_0034:D} {GPR0:D_GPR_0035:D}* {GPR1:D_GPR_0036:D}* {GPR3:D_GPR_0037:D}* {GPR4:D_GPR_0038:D}* {GPR14:GPR_0039:D} {FPR0:D_FPR_0040:D}* {FPR1:D_FPR_0041:D}*
 {FPR2:D_FPR_0042:D}* {FPR3:D_FPR_0043:D}* {FPR4:D_FPR_0044:D}* {FPR5:D_FPR_0045:D}* {FPR6:D_FPR_0046:D}* {FPR7:D_FPR_0047:D}* {FPR8:D_FPR_0048:D}* {FPR9:D_FPR_0049:D}*
 {FPR10:D_FPR_0050:D}* {FPR11:D_FPR_0051:D}* {FPR12:D_FPR_0052:D}* {FPR13:D_FPR_0053:D}* {FPR14:D_FPR_0054:D}* {FPR15:D_FPR_0055:D}* {VRF16:D_VRF_0056:D}* {VRF17:D_VRF_0057:D}*
 {VRF18:D_VRF_0058:D}* {VRF19:D_VRF_0059:D}* {VRF20:D_VRF_0060:D}* {VRF21:D_VRF_0061:D}* {VRF22:D_VRF_0062:D}* {VRF23:D_VRF_0063:D}* {VRF24:D_VRF_0064:D}* {VRF25:D_VRF_0065:D}*
 {VRF26:D_VRF_0066:D}* {VRF27:D_VRF_0067:D}* {VRF28:D_VRF_0068:D}* {VRF29:D_VRF_0069:D}* {VRF30:D_VRF_0070:D}* {VRF31:D_VRF_0071:D}*

It looks like we are suppressing the IL inlining of AtomicInteger.getAndSet and AtomicLong.getAndSet, then in the case of AtomicInteger.getAndSet we are recognizing the call site and generating some custom assembly. Since the Atomic* class methods are not intrinsics, we should probably not be suppressing inlining these methods and instead rely on the Unsafe atomic intrinsic implementations.