Open chihinko opened 2 years ago
Yes, rvv-next currently enables VLA auto-vectorization by default at -O2, and the generated code should run on any target with VLEN >= 128. That limit exists because the original framework is based on ARM SVE, which only supports vector lengths >= 128 bits. I have already fixed this in the latest upstream GCC 13, and I am working on refining the rvv-next code and pushing it into upstream GCC, so it will not be an issue there. However, upstream GCC is still far from supporting all RVV features. @kito-cheng (SiFive) is helping me review and merge the RVV code in upstream GCC.

To solve your problem, would you mind sending a PR that adds the -mrvv compile option back? Then I will merge it. It is simple: add MASK (RVV) in riscv.opt and disable auto-vectorization in the TARGET_VECTORIZE_PREFERRED_SIMD_MODE target hook implementation.
There is another problem. Since -O2 means generating vector code, I used -O1 to build the newlib library. That works fine: there is no vector assembly in the newlib library, and I was able to run tests with spike. But when I used the same approach to build glibc, a lot of vector assembly was still generated in the glibc library:

    grep csrr *.dump
    10546: c20026f3 csrr a3,vl
    116b0: c22022f3 csrr t0,vlenb
    122be: c20026f3 csrr a3,vl
    122fe: c20026f3 csrr a3,vl
I tried -O0, but the build failed when compiling glibc/time/tzfile.c. How can I work around this problem?
The newlib library is built with -O2 by default, and the -O2 option implies auto-vectorization with -mriscv-vector-bits=128. As a result, the library contains vector code that only works when executed under spike with --varch=vlen:128,elen:64,slen:128.
Here is an example:

    riscv64-unknown-elf-gcc -O2 -mriscv-vector-bits=64 tmp.c -o tmp.64.x
    spike --varch=vlen:64,elen:32,slen:64 --isa=RV64IMAFDCV pk tmp.64.x
    b = 0.000 c = 0.000
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380
    b = 1.380 c = 1.380

The output should be:

    b = 0.000 c = 0.000
    b = 1.000 c = 2.000
    b = 2.000 c = 4.000
    b = 3.000 c = 6.000
    b = 4.000 c = 8.000
    b = 5.000 c = 10.000
    b = 6.000 c = 12.000
    b = 7.000 c = 14.000
    b = 8.000 c = 16.000
    b = 9.000 c = 18.000
tmp.c.gz