llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[RISCV] Missing fold with cascade shifts when compiling with zbb #101040

Status: Closed (closed by dtcxzyw 2 months ago)

dtcxzyw commented 2 months ago

Reproducer (sampled from pybind11): https://godbolt.org/z/rc95a37ej

```
; bin/llc -mtriple=riscv64 -mattr=+zbb test.ll -o -
define i64 @func0000000000000005(i64 %0, i16 signext %1) #0 {
entry:
  %2 = shl i16 %1, 9
  %sext = ashr i16 %2, 15
  %3 = sext i16 %sext to i64
  %4 = add nsw i64 %3, %0
  ret i64 %4
}
```
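
For reference, here is a small Python model of what the IR computes. It is my own sketch and not part of the reproducer; `to_signed` and `func_ir` are hypothetical helper names. The `shl`/`ashr` pair only broadcasts bit 6 of `%1` across the value, so the function should return `%0 - bit6(%1)`. Note that the model wraps on overflow, whereas `add nsw` would produce poison in that case.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def func_ir(a0: int, a1: int) -> int:
    """Model of the reproducer: a0 stands for i64 %0, a1 for i16 signext %1."""
    shl = (a1 << 9) & 0xFFFF          # %2 = shl i16 %1, 9
    sext = to_signed(shl, 16) >> 15   # %sext = ashr i16 %2, 15 (either 0 or -1)
    # sext to i64, then add; wrapping here is an assumption of this model,
    # the real `add nsw` is poison on signed overflow.
    return to_signed(sext + a0, 64)


# Exhaustive check over every i16 value: the result is %0 minus bit 6 of %1.
for a1 in range(-(1 << 15), 1 << 15):
    bit6 = (a1 >> 6) & 1
    assert func_ir(100, a1) == 100 - bit6
```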

With zbb:

```
func0000000000000005:                   # @func0000000000000005
        slli    a1, a1, 9
        slli    a1, a1, 48
        srai    a1, a1, 63
        add     a0, a1, a0
        ret
```

Without zbb:

```
func0000000000000005:                   # @func0000000000000005
        slli    a1, a1, 57
        srai    a1, a1, 63
        add     a0, a1, a0
        ret
```
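
As a quick sanity check that the two sequences are equivalent, here is a Python simulation of the RV64 instructions involved. This is only a sketch: the helper names (`slli`, `srai`, `add`, `with_zbb`, `without_zbb`) are mine, and only the instruction sequences themselves come from the output above. The two consecutive left shifts by 9 and 48 discard the same bits as a single shift by 57, which is why the shorter sequence is valid.

```python
MASK64 = (1 << 64) - 1


def slli(x: int, sh: int) -> int:
    """Logical left shift on a 64-bit register image."""
    return (x << sh) & MASK64


def srai(x: int, sh: int) -> int:
    """Arithmetic right shift on a 64-bit register image."""
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> sh) & MASK64


def add(x: int, y: int) -> int:
    """64-bit wrapping add."""
    return (x + y) & MASK64


def with_zbb(a0: int, a1: int) -> int:
    a1 = slli(a1, 9)
    a1 = slli(a1, 48)
    a1 = srai(a1, 63)
    return add(a1, a0)


def without_zbb(a0: int, a1: int) -> int:
    a1 = slli(a1, 57)
    a1 = srai(a1, 63)
    return add(a1, a0)


# a1 is passed as `i16 signext`, so the register holds a sign-extended value.
for v in range(-(1 << 15), 1 << 15):
    a1 = v & MASK64
    assert with_zbb(7, a1) == without_zbb(7, a1)
```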

```
Optimized legalized selection DAG: %bb.0 'func0000000000000005:entry'
SelectionDAG has 16 nodes:
  t0: ch,glue = EntryToken
              t4: i64,ch = CopyFromReg t0, Register:i64 %1
            t6: i64 = AssertSext t4, ValueType:ch:i16
          t19: i64 = shl t6, Constant:i64<9>
        t20: i64 = sign_extend_inreg t19, ValueType:ch:i16
      t21: i64 = sra t20, Constant:i64<15>
      t2: i64,ch = CopyFromReg t0, Register:i64 %0
    t15: i64 = add nsw t21, t2
  t17: ch,glue = CopyToReg t0, Register:i64 $x10, t15
  t18: ch = RISCVISD::RET_GLUE t17, Register:i64 $x10, t17:1

===== Instruction selection begins: %bb.0 'entry'
```

As we discussed in https://github.com/llvm/llvm-project/pull/100966#issuecomment-2255041313, `sra (sign_extend_inreg X), C` can be folded into `sra + shl` in DAGCombine.
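
To make the proposed fold concrete, here is a sketch of the identity I understand it to rely on, checked by brute force in Python. The exact form is my assumption, not the actual DAGCombine code: for an i64 value and an i16 `sign_extend_inreg`, `sra (sign_extend_inreg X, i16), C` equals `sra (shl X, 48), 48 + C` for `0 <= C <= 15`. The helper names (`shl`, `sra`, `sext_inreg`, `before`, `after`) are hypothetical. Once the `shl` is exposed, it can merge with the `shl t6, 9` from the DAG above into a single shift by 57.

```python
import random

MASK64 = (1 << 64) - 1


def shl(x: int, c: int) -> int:
    """64-bit logical left shift."""
    return (x << c) & MASK64


def sra(x: int, c: int) -> int:
    """64-bit arithmetic right shift."""
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> c) & MASK64


def sext_inreg(x: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of x to 64 bits."""
    low = x & ((1 << bits) - 1)
    if low >> (bits - 1):
        low -= 1 << bits
    return low & MASK64


def before(x: int, c: int) -> int:
    return sra(sext_inreg(x, 16), c)       # sra (sign_extend_inreg X), C


def after(x: int, c: int) -> int:
    return sra(shl(x, 48), 48 + c)         # sra (shl X, 48), 48 + C


random.seed(0)
for _ in range(2000):
    x = random.getrandbits(64)
    for c in range(16):
        assert before(x, c) == after(x, c)
    # The instance from this issue: X = shl t6, 9 and C = 15.
    assert before(shl(x, 9), 15) == sra(shl(x, 57), 63)
```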

llvmbot commented 2 months ago

@llvm/issue-subscribers-backend-risc-v

Author: Yingwei Zheng (dtcxzyw)
