llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[RISC-V]llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:557: int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed. #110978

Open ChiHungWei opened 2 weeks ago

ChiHungWei commented 2 weeks ago

Ubuntu 24.04.1 LTS on x86_64, LLVM 20.0.0git with riscv-gnu-toolchain installed at /opt/riscv_2

The following assertion failure occurs while cross-compiling ggml.c:

clang: /home/tomlord/workspace/jason/llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:557: int64_t llvm::MachineOperand::getImm() const: Assertion `isImm() && "Wrong MachineOperand accessor"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'ggml/src/ggml.c'.
4.      Running pass 'Prologue/Epilogue Insertion & Frame Finalization' on function '@ggml_compute_forward_conv_transpose_2d'

Compile command (run from the /llama.cpp directory): /home/tomlord/workspace/jason/llvm-project/riscv-custom/bin/clang -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon -I/home/tomlord/workspace/jason/llvm-project/riscv-custom/include -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -D_GLIBCXX_ASSERTIONS -DGGML_USE_LLAMAFILE --gcc-toolchain=/opt/riscv_2 --sysroot=/opt/riscv_2/sysroot --target=riscv64-unknown-linux-gnu -fopenmp=libomp -v -std=c11 -fuse-ld=lld -fPIC -march=rv64gc -mabi=lp64d -O0 -g -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -pthread -Wunreachable-code-break -Wunreachable-code-return -Wdouble-promotion -c ggml/src/ggml.c -o ggml/src/ggml.o

Reduced .bc file emitted by bugpoint: https://github.com/ChiHungWei/For-LLVM-Bug-Report

I've added a custom instruction in llvm-project/llvm/lib/Target/RISCV/RISCVInstrInfo.td as follows:

//===----------------------------------------------------------------------===//
// Instruction class templates
//===----------------------------------------------------------------------===//
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class ALU_rrr<bits<2> funct2, bits<3> funct3, string opcodestr,
             bit Commutable = 0>
    : RVInstR4<funct2, funct3, OPC_OP, (outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2, GPR:$rs3),
              opcodestr, "$rd, $rs1, $rs2, $rs3"> {
let isCommutable = Commutable;
}

//===----------------------------------------------------------------------===//
//Instructions
//===----------------------------------------------------------------------===//
def MULADD : ALU_rrr<0b10, 0b100,"muladd">,
            Sched<[WriteIMul, ReadIMul, ReadIMul]>;

//===----------------------------------------------------------------------===//
//codegen pattern
//===----------------------------------------------------------------------===//
def : Pat< (add (mul (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), GPR:$rs3),
(MULADD GPR:$rs1, GPR:$rs2, GPR:$rs3) >;

Also, in llvm-project/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp, I added a case inside the switch (Opcode) of the function void RISCVDAGToDAGISel::Select(SDNode *Node) to help select the new custom instruction "MULADD":

case ISD::ADD: {
    SDValue opn_0 = Node->getOperand(0);
    SDValue opn_1 = Node->getOperand(1);
    //None of the operand comes from the result of mul
    if(!(opn_0.getOpcode() == ISD::MUL || opn_1.getOpcode() == ISD::MUL)){
      break;
    }

    //operand 0 is the mul node
    else if(opn_0.getOpcode() == ISD::MUL){
      SDValue rs1 = opn_0.getNode()->getOperand(0);
      SDValue rs2 = opn_0.getNode()->getOperand(1);
      SDValue rs3 = opn_1;
      SDNode *mul_node = opn_0.getNode();
      //create a new node with muladd target-specific instruction
      SDNode *muladd = CurDAG->getMachineNode(RISCV::MULADD, DL, VT, rs1,rs2,rs3);
      ReplaceNode(Node, muladd);
      CurDAG->RemoveDeadNode(mul_node);
      return;
    }

    //operand 1 is the mul node
    else {
      SDValue rs1 = opn_1.getNode()->getOperand(0);
      SDValue rs2 = opn_1.getNode()->getOperand(1);
      SDValue rs3 = opn_0;
      SDNode *mul_node = opn_1.getNode();
      //create a new node with muladd target-specific instruction
      SDNode *muladd = CurDAG->getMachineNode(RISCV::MULADD, DL, VT, rs1,rs2,rs3);
      ReplaceNode(Node, muladd);
      CurDAG->RemoveDeadNode(mul_node);
      return;
    }
  }

The newly defined instruction works correctly for a simple C file like this:

#include <stdlib.h>
#include <stdio.h>

int main() {
 int a = 3;
 int b = 103;
 int c = 127;
 a = a * b + c;
 printf("The answer of 3*103 + 127 = %d", a);
return 0;
}

However, it fails when compiling ggml.c, as shown above. I've looked at vendor extensions such as RISCVInstrInfoXTHead.td and RISCVInstrInfoXVentana.td, and they define custom instructions in roughly the same way, so I am confused about what causes this issue.

llvmbot commented 2 weeks ago

@llvm/issue-subscribers-backend-risc-v

topperc commented 2 weeks ago

I don't see anything obviously wrong with your code.

The code in RISCVISelDAGToDAG.cpp shouldn't be necessary: it does the same thing as the pattern def : Pat< (add (mul (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), GPR:$rs3), (MULADD GPR:$rs1, GPR:$rs2, GPR:$rs3) >; in the .td file. That said, it shouldn't have caused the crash you're seeing.
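
For readers following along, the rewrite that both the Pat and the hand-written ISD::ADD case describe can be sketched on a toy expression tree. This is only an illustration of the add(mul(a, b), c) -> muladd(a, b, c) fold: Expr, leaf, node, eval, and foldMulAdd are hypothetical names made up for this sketch, not LLVM's SelectionDAG API, and the sketch says nothing about the cause of the crash.

```cpp
#include <cassert>
#include <memory>

// Toy expression tree (hypothetical; NOT LLVM's SelectionDAG API).
enum class Op { Leaf, Add, Mul, MulAdd };

struct Expr {
  Op op = Op::Leaf;
  long value = 0;                 // used by Leaf only
  std::shared_ptr<Expr> a, b, c;  // operands; c is used by MulAdd only
};
using ExprPtr = std::shared_ptr<Expr>;

// Build a constant leaf.
ExprPtr leaf(long v) {
  auto e = std::make_shared<Expr>();
  e->value = v;
  return e;
}

// Build an interior node with two or three operands.
ExprPtr node(Op op, ExprPtr a, ExprPtr b, ExprPtr c = nullptr) {
  auto e = std::make_shared<Expr>();
  e->op = op;
  e->a = std::move(a);
  e->b = std::move(b);
  e->c = std::move(c);
  return e;
}

// Evaluate the tree; MulAdd(a, b, c) means a * b + c.
long eval(const ExprPtr &e) {
  assert(e && "null expression");
  switch (e->op) {
  case Op::Leaf:   return e->value;
  case Op::Add:    return eval(e->a) + eval(e->b);
  case Op::Mul:    return eval(e->a) * eval(e->b);
  case Op::MulAdd: return eval(e->a) * eval(e->b) + eval(e->c);
  }
  return 0; // unreachable; keeps compilers quiet
}

// Fold a top-level Add whose left or right operand is a Mul into one MulAdd.
// Any other shape is returned unchanged.
ExprPtr foldMulAdd(const ExprPtr &e) {
  if (e->op != Op::Add)
    return e;
  if (e->a->op == Op::Mul)  // add(mul(x, y), z)
    return node(Op::MulAdd, e->a->a, e->a->b, e->b);
  if (e->b->op == Op::Mul)  // add(z, mul(x, y))
    return node(Op::MulAdd, e->b->a, e->b->b, e->a);
  return e;
}
```

Building the tree for 3 * 103 + 127 with node(Op::Add, node(Op::Mul, leaf(3), leaf(103)), leaf(127)) and passing it to foldMulAdd yields a single MulAdd node that evaluates to the same value as the original tree. Because add is commutative, both operand orders fold to the same node.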