eclipse / omr

Eclipse OMR™ Cross platform components for building reliable, high performance language runtimes
http://www.eclipse.org/omr

AArch64: Improve *selectEvaluator() #5450

Open knn-k opened 3 years ago

knn-k commented 3 years ago

When the first child of an iselect node is a comparison such as icmpeq, the generated instructions can be reduced from four to two, as shown below.

Current code:

// Rd = (Ra == Rb) ? Re : Rf
cmpx Ra, Rb
cset Rc, eq
cmpimmx Rc, 0
cselx Rd, Re, Rf, ne

Improved code:

// Rd = (Ra == Rb) ? Re : Rf
cmpx Ra, Rb
cselx Rd, Re, Rf, eq
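
The idea can be sketched as a small, self-contained model of a select evaluator. This is only an illustration under stated assumptions: `Node` and `emitSelect` are hypothetical names, not the OMR `TR::Node` class or the real `iselectEvaluator()`, and the mnemonics are copied from this issue as plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified IL node (NOT the OMR TR::Node class).
struct Node {
    std::string opcode;          // e.g. "iselect", "icmpeq", "iload"
    std::string reg;             // virtual register that holds the node's value
    std::vector<Node> children;
};

// Appends pseudo-assembly for "Rd = cond ? Re : Rf" to `out`.
// Assumes the operands are already evaluated into their registers.
void emitSelect(const Node &select, const std::string &rd,
                std::vector<std::string> &out) {
    assert(select.children.size() == 3);
    const Node &cond = select.children[0];
    const std::string &re = select.children[1].reg;
    const std::string &rf = select.children[2].reg;

    if (cond.opcode == "icmpeq" && cond.children.size() == 2) {
        // Fast path: compare the operands directly and select on "eq",
        // so no 0/1 value has to be materialized with cset.
        out.push_back("cmpx " + cond.children[0].reg + ", " + cond.children[1].reg);
        out.push_back("cselx " + rd + ", " + re + ", " + rf + ", eq");
        return;
    }

    // Generic path: the condition is a 0/1 value already held in cond.reg.
    out.push_back("cmpimmx " + cond.reg + ", 0");
    out.push_back("cselx " + rd + ", " + re + ", " + rf + ", ne");
}
```

In this model the fast path emits two instructions for an icmpeq condition, while any other condition node falls back to the compare-with-zero form.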

knn-k commented 3 years ago

fselectEvaluator() currently uses the cbnzx instruction, which is a branch. AArch64 provides the fcsel instruction for floating-point conditional select, which could be used instead.

knn-k commented 3 years ago

I opened #5456 for fselectEvaluator().

knn-k commented 4 weeks ago

I opened PR #7361.

The trace file in https://github.com/eclipse-openj9/openj9/issues/19576#issuecomment-2138456343 contains thousands of aselect nodes of the following form, and the PR improves the generated code:

 n55053n  (  0)    aselect (in &GPR_0477) ()                                                          [       0x12a862800] bci=[61,0,4054] rc=0 vc=8 vn=- li=765 udi=31568 nc=3 flg=0x20
 n55052n  (  0)      acmpeq (in GPR_0474)                                                             [       0x12a8627b0] bci=[61,0,4054] rc=0 vc=8 vn=- li=765 udi=30496 nc=2
 n55051n  (  0)        aload  <temp slot 13>[#1851  Auto] [flags 0x20004007 0x0 ] (in &GPR_0475)      [       0x12a862760] bci=[61,0,4054] rc=0 vc=8 vn=- li=765 udi=30608 nc=0
 n55030n  (  5)        loadaddr  <temp slot 65>[#4370  Auto] [flags 0x60000008 0x0 ] (in &GPR_0476) (highWordZero Unsigned X!=0 cannotOverflow nodePointsToNonNull cannotTrackLocalUses escapesInColdBlock )  [       0x12a8620d0] bci=[61,0,4054] rc=5 vc=8 vn=- li=765 udi=31024 nc=0 flg=0xd004
 n79011n  (  6)      ==>aRegLoad (in &GPR_0468) (SeenRealReference )
 n55051n  (  0)      ==>aload (in &GPR_0475)
------------------------------
 [       0x12fa17a30]   0       cmpx    &GPR_0475, &GPR_0476
 [       0x12fa17ac0]   0       cset    GPR_0474, eq
 [       0x12fa17bc0]   0       cmpimmx         GPR_0474, 0
 [       0x12fa17c50]   0       cselx   &GPR_0477, &GPR_0468, &GPR_0475, ne

Generated code after applying the PR (4->2 instructions):

  cmpx  &GPR_0475, &GPR_0476
  cselx &GPR_0477, &GPR_0468, &GPR_0475, eq
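
To see why the shorter sequence is safe, the two sequences can be modeled in plain C++ and compared. This is a minimal sketch and not OMR code: `selectBefore`, `selectAfter` and `checkSequencesAgree` are hypothetical names, and the condition flags are modeled as simple booleans.

```cpp
#include <cassert>
#include <cstdint>

// Models the original four-instruction sequence.
std::uint64_t selectBefore(std::uint64_t ra, std::uint64_t rb,
                           std::uint64_t re, std::uint64_t rf) {
    bool eq = (ra == rb);              // cmpx    Ra, Rb
    std::uint64_t rc = eq ? 1 : 0;     // cset    Rc, eq
    bool ne = (rc != 0);               // cmpimmx Rc, 0
    return ne ? re : rf;               // cselx   Rd, Re, Rf, ne
}

// Models the two-instruction sequence generated after the PR.
std::uint64_t selectAfter(std::uint64_t ra, std::uint64_t rb,
                          std::uint64_t re, std::uint64_t rf) {
    bool eq = (ra == rb);              // cmpx    Ra, Rb
    return eq ? re : rf;               // cselx   Rd, Re, Rf, eq
}

// Checks that both models agree for all operand values in a small range.
void checkSequencesAgree() {
    for (std::uint64_t ra = 0; ra < 4; ++ra)
        for (std::uint64_t rb = 0; rb < 4; ++rb)
            for (std::uint64_t re = 0; re < 4; ++re)
                for (std::uint64_t rf = 0; rf < 4; ++rf)
                    assert(selectBefore(ra, rb, re, rf) ==
                           selectAfter(ra, rb, re, rf));
}
```

The cset result is nonzero exactly when the first comparison was "eq", so testing it against zero with "ne" selects the same operand as testing the first comparison with "eq" directly.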

knn-k commented 4 weeks ago

PR #7361 does not support the following cases, but they do not appear frequently:

knn-k commented 3 weeks ago

I opened PR #7367 for handling nodes like icmpne and acmpgt in select.
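
Extending the approach to other comparisons mostly means choosing the right condition code for the csel instruction. The table below is a sketch under assumptions, not the contents of PR #7367: `conditionFor` is a hypothetical helper, only icmpne and acmpgt are named in this thread, and the sketch assumes that integer comparisons are signed and address comparisons are unsigned.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the AArch64 condition code to use in csel
// for a comparison opcode, or an empty string if the opcode is not handled.
std::string conditionFor(const std::string &op) {
    assert(!op.empty());
    static const std::map<std::string, std::string> table = {
        // Integer comparisons (assumed signed).
        {"icmpeq", "eq"}, {"icmpne", "ne"},
        {"icmplt", "lt"}, {"icmpge", "ge"},
        {"icmpgt", "gt"}, {"icmple", "le"},
        // Address comparisons (assumed unsigned).
        {"acmpeq", "eq"}, {"acmpne", "ne"},
        {"acmplt", "lo"}, {"acmpge", "hs"},
        {"acmpgt", "hi"}, {"acmple", "ls"},
    };
    auto it = table.find(op);
    return it == table.end() ? std::string() : it->second;
}
```

An evaluator could fall back to the compare-with-zero sequence whenever this lookup returns an empty string.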