Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

X86: Tail calls with function pointer indirection not folded when more than 1 parameter is passed #49011

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR50042
Status NEW
Importance P enhancement
Reported by Nick Zavaritsky (mejedi@gmail.com)
Reported on 2021-04-20 10:59:10 -0700
Last modified on 2021-05-02 04:43:41 -0700
Version trunk
Hardware PC All
CC craig.topper@gmail.com, jhaberman@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, mejedi@gmail.com, pengfei.wang@intel.com, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments jmpq_fold.ll (2160 bytes, text/plain)
dispatch1.png (168300 bytes, image/png)
dispatch15.png (135385 bytes, image/png)
Blocks
Blocked by
See also
Created attachment 24774
reproducer

The following snippet produces the assembly with function pointer table
indexing folded into JMP:

typedef void (*FnTy3) (void *);

void dispatch15(FnTy3 *tab, intptr_t i) {
  tab[i](tab);
}

_dispatch15:                            ## @dispatch15
## %bb.0:                               ## %entry
    jmpq    *(%rdi,%rsi,8)                  ## TAILCALL

If the called function takes at least 2 parameters, indexing isn't folded.

typedef void (*FnTy2) (void *, intptr_t);

void dispatch1(FnTy2 *tab, intptr_t i) {
  tab[i](tab, i);
}

_dispatch1:                             ## @dispatch1
## %bb.0:                               ## %entry
    movq    (%rdi,%rsi,8), %rax
    jmpq    *%rax                           ## TAILCALL
Quuxplusone commented 3 years ago

Attached jmpq_fold.ll (2160 bytes, text/plain): reproducer

Quuxplusone commented 3 years ago
dipatch1 and dispatch15 have a different DAG shape after
X86DAGToDAGISel::PreprocessISelDAG (see attachments). There's an optimisation
to "try moving call address load from outside callseq_start to just before the
call to allow it to be folded." It doesn't handle a multi-parameter case for
tail calls.

The following patch fixes it for me:

@@ -778,11 +778,20 @@ static bool isCalleeLoad(SDValue Callee, SDValue &Chain,
bool HasCallSeq) {
       LD->getExtensionType() != ISD::NON_EXTLOAD)
     return false;

-  // Now let's find the callseq_start.
-  while (HasCallSeq && Chain.getOpcode() != ISD::CALLSEQ_START) {
-    if (!Chain.hasOneUse())
-      return false;
-    Chain = Chain.getOperand(0);
+  if (HasCallSeq) {
+    // Now let's find the callseq_start.
+    while (Chain.getOpcode() != ISD::CALLSEQ_START) {
+      if (!Chain.hasOneUse())
+        return false;
+      Chain = Chain.getOperand(0);
+    }
+  } else if (Chain.getOpcode() == ISD::CopyToReg) {
+    // Locate first CopyToReg in the sequence of CopyToReg-s.
+    while (Chain.getOperand(0).getOpcode() == ISD::CopyToReg) {
+      if (!Chain.hasOneUse())
+        return false;
+      Chain = Chain.getOperand(0);
+    }
   }
Quuxplusone commented 3 years ago

Attached dispatch1.png (168300 bytes, image/png): dispatch1 DAG after X86DAGToDAGISel::PreprocessISelDAG

Quuxplusone commented 3 years ago

Attached dispatch15.png (135385 bytes, image/png): dispatch15 DAG after X86DAGToDAGISel::PreprocessISelDAG

Quuxplusone commented 3 years ago

https://reviews.llvm.org/D101718