Open zengdage opened 2 weeks ago
@llvm/pr-subscribers-backend-risc-v
Author: Zhijin Zeng (zengdage)
Do you have an end-to-end C language example that will produce these instructions?
The following is a minimal example that produces the bne a0, a0, %bb1
instruction. It is reduced from the XS_MIME__QuotedPrint_encode_qp function in spec2006/400.perlbench/Base64.c, which also produces
such a bne instruction.
#define qp_isplain(c) ((c) == '\t' || (((c) >= ' ' && (c) <= '~') && (c) != '='))
int test_bne(char *beg, char *end)
{
    char *p;
    char *p_beg;
    p = beg;
    while (1) {
        p_beg = p;
        while (p < end && qp_isplain(*p)) {
            p++;
        }
        if (p == end || *p == '\n') {
            while (p > p_beg && (*(p - 1) == '\t' || *(p - 1) == ' '))
                p--;
        }
    }
}
Build command:
clang -S -fno-PIC -g -march=rv64imafdc_zba_zbb_zbc_zbs -mabi=lp64d -Ofast -o xxx.S xxx.c
That compiles to an empty function on compiler explorer https://godbolt.org/z/ss3fqYTo4
Is this a widespread problem? In my local testing I see 2 places in perlbench.
Do you expect a performance gain from fixing this on perlbench?
I'm hesitant to add a new pass for a problem that occurs twice if it doesn't improve performance.
Sorry, there are 13 places in my local SPEC2006 perlbench. Build command:
clang -c -DSPEC_CPU -DNDEBUG -DPERL_CORE -fno-pic -DSPEC_CPU_LP64 -Wno-everything -DSPEC_CPU_LINUX_X64 -std=gnu89 -Ofast -march=rv64gc_zba_zbb_zbc_zbs -mabi=lp64d xxx.c -o xxx.o
Below is the list of these instructions produced by the above build command. If I add -mllvm -inline-threshold=3420 -mllvm -unroll-max-count=15
to the build, I get 21 places in my local SPEC2006 perlbench.
mg.c
MI:BEQ $x0, $x0, %bb.17
perlio.c
MI:BEQ $x0, $x0, %bb.18
pp_pack.c
MI:BGEU $x22, renamable $x22, %bb.73
regexec.c
MI:BEQ $x0, $x0, %bb.548
MI:BNE $x0, $x0, %bb.548
MI:BNE $x0, $x0, %bb.56
MI:BEQ $x0, $x0, %bb.102
MI:BEQ $x0, $x0, %bb.215
MI:BNE $x0, $x0, %bb.215
toke.c
MI:BLTU $x10, renamable $x10, %bb.1778
util.c
MI:BNE $x0, $x0, %bb.5
Base64.c
MI:BNE $x27, renamable $x27, %bb.33
My LLVM checkout is based on commit 93e69abfc77b0bd90f3669e36e510dd4f45aab14,
and I use the following command to build it:
cmake -DLLVM_PARALLEL_LINK_JOBS=1 -DLLVM_TARGETS_TO_BUILD=RISCV -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld" -DCMAKE_INSTALL_PREFIX=/toolchain/gitlab_llvm_install/centos -DLLVM_DEFAULT_TARGET_TRIPLE=riscv64-unknown-linux-gnu -DLLVM_USE_LINKER=gold -DLLVM_BINUTILS_INCDIR=/toolchain/gitlab_llvm_install/lto/install/include -G Ninja ../llvm
The performance gain on perlbench is negligible, maybe 0.1%.
My grep of the disassembly missed the beq/bne x0, x0 cases because they are beqz/bnez with the compressed extension and I didn't check for that.
Hello, is this PR worth merging? If not, should I close it?
After block-placement and machine-cp, the following situations may exist and require optimization. I don't know how to do this optimization, so I tried adding a new pass that runs after machine-cp. Maybe this is not the right way to do it.
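
To make the intended optimization concrete, here is a minimal standalone sketch in C of the decision such a pass would have to make for the branches listed earlier (for example BEQ $x0, $x0 or BNE $x27, $x27). It is not the code from this PR and does not use LLVM's API: the BranchOp and FoldResult enums and the fold_same_reg_branch function are hypothetical names introduced only for illustration. The sketch relies on one fact: when both source operands of a RISC-V conditional branch are the same register, the comparison result is known statically.

```c
/* Hypothetical branch opcodes for illustration; these are NOT LLVM's
 * RISCV:: opcode enumerators. */
typedef enum { BR_BEQ, BR_BNE, BR_BLT, BR_BGE, BR_BLTU, BR_BGEU } BranchOp;

/* Hypothetical result type: what a pass could do with the branch. */
typedef enum {
    FOLD_NONE,         /* operands differ, outcome not known statically */
    FOLD_ALWAYS_TAKEN, /* could become an unconditional jump */
    FOLD_NEVER_TAKEN   /* could be deleted, leaving the fall-through */
} FoldResult;

/* Classify a conditional branch whose two source registers are given by
 * register number. If both operands name the same register, x == x holds,
 * so "equal / greater-or-equal" conditions are always true and
 * "not-equal / less-than" conditions are always false. */
FoldResult fold_same_reg_branch(BranchOp op, unsigned rs1, unsigned rs2)
{
    if (rs1 != rs2)
        return FOLD_NONE;

    switch (op) {
    case BR_BEQ:
    case BR_BGE:
    case BR_BGEU:
        return FOLD_ALWAYS_TAKEN;
    case BR_BNE:
    case BR_BLT:
    case BR_BLTU:
        return FOLD_NEVER_TAKEN;
    }
    return FOLD_NONE; /* unreachable for valid opcodes */
}
```

Under these assumptions, fold_same_reg_branch(BR_BEQ, 0, 0) reports an always-taken branch and fold_same_reg_branch(BR_BNE, 27, 27) reports a never-taken one, matching the BEQ $x0, $x0 and BNE $x27, $x27 entries in the list. A real pass would additionally have to update the successor blocks and the CFG, which this sketch does not attempt.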