llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.32k stars 12.12k forks source link

[RISCV] Compiler abend when compiling for rv64 with Zb due to instructions producing i32 #81192

Closed nemanjai closed 9 months ago

nemanjai commented 9 months ago

This test case:

define void @func(i32 %0) #0 {
entry:
  %mul = mul i32 %0, 6
  %conv = sext i32 %mul to i64
  tail call void null(i64 0, i64 %conv, ptr null, i32 0)
  ret void
}

attributes #0 = { "target-features"="+64bit,+m,+zba" }

Causes a segfault in the compiler when using -pre-RA-sched=list-ilp as the register pressure calculation tries to get the register class ID for the i32 produced by SH1ADD and used by SLLIW.

Invocation:

llc -mcpu=generic-rv64 -march=riscv64  -pre-RA-sched=list-ilp
llvmbot commented 9 months ago

@llvm/issue-subscribers-backend-risc-v

Author: Nemanja Ivanovic (nemanjai)

This test case: ``` define void @func(i32 %0) #0 { entry: %mul = mul i32 %0, 6 %conv = sext i32 %mul to i64 tail call void null(i64 0, i64 %conv, ptr null, i32 0) ret void } attributes #0 = { "target-features"="+64bit,+m,+zba" } ``` Causes a segfault in the compiler when using `-pre-RA-sched=list-ilp` as the register pressure calculation tries to get the register class ID for the `i32` produced by `SH1ADD` and used by `SLLIW`. Invocation: ``` llc -mcpu=generic-rv64 -march=riscv64 -pre-RA-sched=list-ilp ```
nemanjai commented 9 months ago

@topperc @benshi001 I think this is produced by code that was added in 9e5c5afc7.

topperc commented 9 months ago

This fixes it

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
index f8938c0a98d1..d5fcea17d776 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
@@ -712,7 +712,7 @@ def : Pat<(add (XLenVT GPR:$r), CSImm12MulBy8:$i),
                   GPR:$r)>;

 def : Pat<(mul (XLenVT GPR:$r), C3LeftShift:$i),
-          (SLLI (SH1ADD GPR:$r, GPR:$r),
+          (SLLI (XLenVT (SH1ADD GPR:$r, GPR:$r)),
                 (TrailingZeros C3LeftShift:$i))>;
 def : Pat<(mul (XLenVT GPR:$r), C5LeftShift:$i),
           (SLLI (SH2ADD GPR:$r, GPR:$r),

We just need to add an explicit cast to inner instructions in the isel patterns.

nemanjai commented 9 months ago

Thanks for the quick response. Do you plan to commit this (presumably there are other patterns in addition to this) or do you want me to?

asb commented 9 months ago

It seems really easy to miss adding these explicit casts. I don't suppose anyone has any bright ideas on how we could add a test or assert to catch it?

nemanjai commented 9 months ago

It seems really easy to miss adding these explicit casts. I don't suppose anyone has any bright ideas on how we could add a test or assert to catch it?

Perhaps a check for any illegal types in RISCVDAGToDAGISel::PostprocessISelDAG() would do it?

nemanjai commented 9 months ago

I'm curious why the RISC-V 32/64 bit instructions weren't defined similarly to PPC with base instruction and one suffixed with 8 (for 64-bit). It would have avoided issues such as this - although some patterns would need to be specified twice.

asb commented 8 months ago

I'm curious why the RISC-V 32/64 bit instructions weren't defined similarly to PPC with base instruction and one suffixed with 8 (for 64-bit). It would have avoided issues such as this - although some patterns would need to be specified twice.

It was precisely to avoid duplicating patterns and C++ logic that refers to the instructions. Arguably it was a better to defend decision in the earlier days of the backend when the saved complexity vs the overall backend size was larger, and before we started working through codegen challenges related to this.

nemanjai commented 8 months ago

I hit another one of these and it seems I need something like this to fix it:

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 2ac8f5a..678aeb1 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -1514,7 +1514,7 @@ multiclass BccPat<CondCode Cond, RVInstB Inst> {

 class BrccCompressOpt<CondCode Cond, RVInstB Inst>
     : Pat<(riscv_brcc GPR:$lhs, simm12_no6:$Constant, Cond, bb:$place),
-          (Inst (ADDI GPR:$lhs, (NegImm simm12:$Constant)), (XLenVT X0), bb:$place)>;
+          (Inst (XLenVT (ADDI GPR:$lhs, (NegImm simm12:$Constant))), (XLenVT X0), bb:$place)>;

 defm : BccPat<SETEQ, BEQ>;
 defm : BccPat<SETNE, BNE>;

I tried adding a check as follows in the DAG post-processing:

diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 1d78b47..676c62e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -143,6 +143,20 @@ void RISCVDAGToDAGISel::PostprocessISelDAG() {

     MadeChange |= doPeepholeSExtW(N);
     MadeChange |= doPeepholeMaskedRVV(N);
+
+#ifndef _NDEBUG
+    // Assert no nodes produce an i32 type in 64-bit mode as that is
+    // not a legal type.
+    if (Subtarget->is64Bit()) {
+      for (int i = 0, e = N->getNumValues(); i < e; i++) {
+        if (N->getValueType(i) == MVT::i32) {
+          dbgs() << "This node produces an illegal type:\n";
+          N->dumpr(CurDAG);
+          llvm_unreachable("illegal type found");
+        }
+      }
+    }
+#endif
   }

   CurDAG->setRoot(Dummy.getValue());

which blows up all over the place with test/CodeGen/RISCV. Many of them seem legit, but I'm not sure they all are...