llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[RISC-V] Miss optimize for loop vsetvli #58834

Open lhtin opened 1 year ago

lhtin commented 1 year ago

Hi,

For the C code below, the generated assembly contains a redundant vsetvli instruction in the loop body. I think it can be removed, but the optimization also needs to handle the cases where it cannot be removed, such as when vl and vtype at the end of the loop differ from their values at loop entry. Thanks.

C code:

#include <riscv_vector.h>

void foo9 (int8_t *base, int8_t* out, size_t vl, size_t m)
{
    vint8mf8_t v0;
    size_t avl = vsetvl_e8mf8 (vl);

    for (size_t i = 0; i < m; i++)
    {
        v0 = vle8_v_i8mf8 (base + i, avl);
        vse8_v_i8mf8 (out + i, v0, avl);
    }
}

asm:

foo9:                                   # @foo9
        vsetvli a2, a2, e8, mf8, ta, mu
        beqz    a3, .LBB0_2
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        vsetvli zero, a2, e8, mf8, ta, mu
        vle8.v  v8, (a0)
        vse8.v  v8, (a1)
        addi    a3, a3, -1
        addi    a1, a1, 1
        addi    a0, a0, 1
        bnez    a3, .LBB0_1
.LBB0_2:
        ret

Compiler Explorer: https://godbolt.org/z/P6b7dPWPP

llvmbot commented 1 year ago

@llvm/issue-subscribers-backend-risc-v

lhtin commented 1 year ago

Reproducer for LLVM 16: https://godbolt.org/z/W56Tj7h7K

After analyzing the relevant pass, I think the cause is that comparing two Infos for equality only compares their AVLReg. It does not account for the case where the AVLReg of one Info is equal to the DestReg of the other Info (when that other Info comes from a vsetvli instruction).

This results in PrevInfo being Unknown in the check on line 1026 of the code below: when the Exit states of all predecessors of the basic block are intersected, the result is Unknown because their AVLRegs differ.

https://github.com/llvm/llvm-project/blob/aa99b607b5cf8ef1260f5661dcbf077f26ee797c/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp#L1026-L1037
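To illustrate why the intersection ends up Unknown, here is a minimal toy model of the merge, not the pass's actual code. `ToyExitInfo`, its fields, and the register numbers are all made up for illustration. It only assumes that merging two valid states with different AVL registers yields Unknown, as described above.

```cpp
// Toy model of merging predecessor exit states; NOT the real VSETVLIInfo class.
struct ToyExitInfo {
  enum State { Uninitialized, Valid, Unknown };
  State state = Uninitialized;
  unsigned avlReg = 0; // hypothetical virtual register number holding the AVL
  unsigned vtype = 0;  // hypothetical encoded vtype

  static ToyExitInfo valid(unsigned avlReg, unsigned vtype) {
    ToyExitInfo info;
    info.state = Valid;
    info.avlReg = avlReg;
    info.vtype = vtype;
    return info;
  }

  static ToyExitInfo unknown() {
    ToyExitInfo info;
    info.state = Unknown;
    return info;
  }

  // Merge the exit state of one more predecessor into this entry state.
  ToyExitInfo intersect(const ToyExitInfo &other) const {
    if (other.state == Uninitialized) return *this;
    if (state == Uninitialized) return other;
    if (state == Unknown || other.state == Unknown) return unknown();
    // Only the AVL register and vtype are compared, nothing else.
    if (avlReg == other.avlReg && vtype == other.vtype) return *this;
    return unknown();
  }
};

// Entry state of for.body in foo9, using made-up register numbers.
inline ToyExitInfo loopEntryStateForFoo9() {
  const unsigned vlArg = 1;      // stands for %vl
  const unsigned vsetvliDef = 2; // stands for %0, the result of the vsetvli
  const unsigned e8mf8 = 5;      // hypothetical vtype encoding
  ToyExitInfo in;
  in = in.intersect(ToyExitInfo::valid(vlArg, e8mf8));      // exit of entry
  in = in.intersect(ToyExitInfo::valid(vsetvliDef, e8mf8)); // exit of for.body
  return in;
}
```

In this model the entry block exits with AVL `%vl` while the loop body exits with AVL `%0`, so the merged state is Unknown even though both describe the same VL.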

Here is the LLVM IR:

define dso_local void @foo9(ptr nocapture noundef readonly %base, ptr nocapture noundef %out, i64 noundef %vl, i64 noundef %m, <vscale x 1 x i1> %mask) local_unnamed_addr #0 {
entry:
  %0 = tail call i64 @llvm.riscv.vsetvli.i64(i64 %vl, i64 0, i64 5)
  %cmp6.not = icmp eq i64 %m, 0
  br i1 %cmp6.not, label %for.cond.cleanup, label %for.body

for.cond.cleanup:                                 ; preds = %for.body, %entry
  ret void

for.body:                                         ; preds = %entry, %for.body
  %i.08 = phi i64 [ %inc, %for.body ], [ 0, %entry ]
  %v0.07 = phi <vscale x 1 x i8> [ %1, %for.body ], [ undef, %entry ]
  %add.ptr = getelementptr inbounds i8, ptr %base, i64 %i.08
  %1 = tail call <vscale x 1 x i8> @llvm.riscv.vle.mask.nxv1i8.i64(<vscale x 1 x i8> %v0.07, ptr %add.ptr, <vscale x 1 x i1> %mask, i64 %0, i64 1)
  %add.ptr1 = getelementptr inbounds i8, ptr %out, i64 %i.08
  tail call void @llvm.riscv.vse.mask.nxv1i8.i64(<vscale x 1 x i8> %1, ptr %add.ptr1, <vscale x 1 x i1> %mask, i64 %0)
  %inc = add nuw i64 %i.08, 1
  %exitcond.not = icmp eq i64 %inc, %m
  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body, !llvm.loop !6
}

A solution I thought of: if the Info comes from a vsetvli instruction and the DestReg of that instruction is valid, store it in a VLReg field of the Info (a new field that needs to be added). If the Info comes from another RVV instruction, then when setting its AVLReg, also look up the instruction that defines the AVLReg; if that definition is a vsetvli instruction, store its DefReg in a DefReg field of the Info (another new field that needs to be added). Then, when comparing, if hasSameAVL is false, additionally check whether the DefReg of one Info is equal to the VLReg of the other Info in the code below:

https://github.com/llvm/llvm-project/blob/aa99b607b5cf8ef1260f5661dcbf077f26ee797c/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp#L439-L440
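The proposed comparison could be sketched roughly as follows. This is a standalone toy, not a patch against the pass: `ToyVLInfo`, `fromVsetvli`, `fromRvvInst`, and `hasSameVL` are hypothetical names, and the sketch reads "DefReg" as the AVL register itself when that register is defined by a vsetvli. It also assumes, for simplicity, that both Infos must have an identical vtype before the VLReg/DefReg match counts.

```cpp
// Toy sketch of the proposed extra fields; NOT the real VSETVLIInfo class.
constexpr unsigned NoReg = 0; // hypothetical "no register" sentinel

struct ToyVLInfo {
  unsigned avlReg = NoReg; // register holding the requested AVL
  unsigned vlReg = NoReg;  // proposed: DestReg when built from a vsetvli
  unsigned defReg = NoReg; // proposed: AVL register, if defined by a vsetvli
  unsigned vtype = 0;      // hypothetical encoded vtype
};

// Info built from "dest = vsetvli avl, vtype".
inline ToyVLInfo fromVsetvli(unsigned dest, unsigned avl, unsigned vtype) {
  ToyVLInfo info;
  info.avlReg = avl;
  info.vlReg = dest;
  info.vtype = vtype;
  return info;
}

// Info built from an RVV instruction whose AVL operand is `avl`.
inline ToyVLInfo fromRvvInst(unsigned avl, unsigned vtype,
                             bool avlDefinedByVsetvli) {
  ToyVLInfo info;
  info.avlReg = avl;
  info.defReg = avlDefinedByVsetvli ? avl : NoReg;
  info.vtype = vtype;
  return info;
}

// Existing-style check: only the AVL registers are compared.
inline bool hasSameAVL(const ToyVLInfo &a, const ToyVLInfo &b) {
  return a.avlReg == b.avlReg;
}

// Proposed check: fall back to matching DefReg against VLReg.
inline bool hasSameVL(const ToyVLInfo &a, const ToyVLInfo &b) {
  if (a.vtype != b.vtype) return false; // simplifying assumption of this toy
  if (hasSameAVL(a, b)) return true;
  if (a.defReg != NoReg && a.defReg == b.vlReg) return true;
  if (b.defReg != NoReg && b.defReg == a.vlReg) return true;
  return false;
}
```

For foo9, the Info from the vsetvli in the entry block would be `fromVsetvli(2, 1, 5)` (with 2 standing for `%0` and 1 for `%vl`), and the Info from vle8.v in the loop would be `fromRvvInst(2, 5, true)`. `hasSameAVL` rejects the pair, while `hasSameVL` accepts it.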

artagnon commented 1 year ago

Candidate patch: https://github.com/llvm/llvm-project/pull/67144