riscv-non-isa / rvv-intrinsic-doc

https://jira.riscv.org/browse/RVG-153
BSD 3-Clause "New" or "Revised" License

the wrong result of "vmerge_vvm_i32m1" #316

Closed Erucaaa closed 5 months ago

Erucaaa commented 6 months ago

I tried to test the intrinsic "vmerge_vvm_i32m1", but I got the wrong result. My test code is as follows:

void col_trn4(int32_t *a) {
    vint32m1_t v0,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18;
    vint32m1_t v0temp,v1temp,v2temp,v3temp,v4temp,v5temp,v6temp,v7temp,v8temp,v9temp,v10temp;
    size_t vl = vsetvl_e32m1(4);

    vuint32m1_t col_index = vle32_v_u32m1(index_array, vl);

    v0 = vle32_v_i32m1(a, vl);
    v1 = vle32_v_i32m1(a+4, vl);
    v2 = vle32_v_i32m1(a+8, vl);
    v3 = vle32_v_i32m1(a+12, vl);

    // calculate mask
    int32_t flags[4] = {2,1,2,1};
    vint32m1_t v = vle32_v_i32m1(flags, vl);
    vbool32_t mask = vmseq_vx_i32m1_b32(v, 1, vl); // 0,1,0,1

    v4 = vmerge_vvm_i32m1(mask, v0, v2, vl);
    v5 = vmerge_vvm_i32m1(mask, v1, v3, vl);
    v6 = vmerge_vvm_i32m1(mask, v2, v0, vl);
    vse32_v_i32m1(a, v4, vl);
    vse32_v_i32m1(a+4, v5, vl);
    vse32_v_i32m1(a+8, v6, vl);

    v7 = vmerge_vvm_i32m1(mask, v3, v1, vl);

}

The test array: a = {6,30,10,26, 18,45,29,30, 29,48,34,33, 36,53,40,49}

The result is:

v4: 6 48 10 33
v5: 18 53 29 49

But when I add the instruction "vse32_v_i32m1(a+8, v6, vl)", the result of v4 changes!

v4: 6 30 10 26
v5: 18 53 29 49
v6: 29 48 34 33

Why? I'm so confused about this.
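For readers who want to check the expected values by hand, here is a minimal scalar sketch of what the merges in the report should produce. It assumes the old-style operand order vmerge_vvm_i32m1(mask, op1, op2, vl), where an element is taken from op2 when its mask bit is set and from op1 otherwise. The ref_* names are hypothetical helpers written for this illustration; they are not part of any intrinsic API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of old-style vmerge_vvm_i32m1(mask, op1, op2, vl):
 * dst[i] comes from op2 where the mask is set, from op1 where it is clear. */
static void ref_vmerge_i32(int32_t *dst, const uint8_t *mask,
                           const int32_t *op1, const int32_t *op2,
                           size_t vl) {
    assert(dst != NULL && mask != NULL && op1 != NULL && op2 != NULL);
    for (size_t i = 0; i < vl; ++i)
        dst[i] = mask[i] ? op2[i] : op1[i];
}

/* Scalar model of vmseq_vx_i32m1_b32(v, x, vl): mask[i] = (v[i] == x). */
static void ref_vmseq_i32(uint8_t *mask, const int32_t *v, int32_t x,
                          size_t vl) {
    assert(mask != NULL && v != NULL);
    for (size_t i = 0; i < vl; ++i)
        mask[i] = (uint8_t)(v[i] == x);
}

/* Applies the four merges from the report to a 4x4 array.
 * Results go to a separate array `out` (rows v4, v5, v6, v7). */
static void ref_col_trn4(const int32_t a[16], int32_t out[16]) {
    const int32_t flags[4] = {2, 1, 2, 1};
    uint8_t mask[4];
    ref_vmseq_i32(mask, flags, 1, 4);                   /* 0,1,0,1 */
    ref_vmerge_i32(out,      mask, a,      a + 8,  4);  /* v4: v0 with v2 */
    ref_vmerge_i32(out + 4,  mask, a + 4,  a + 12, 4);  /* v5: v1 with v3 */
    ref_vmerge_i32(out + 8,  mask, a + 8,  a,      4);  /* v6: v2 with v0 */
    ref_vmerge_i32(out + 12, mask, a + 12, a + 4,  4);  /* v7: v3 with v1 */
}
```

Under this model the test array gives v4 = 6 48 10 33 and v5 = 18 53 29 49, matching the first result in the report, and v6 = 29 30 34 26. The second reported result has v4 equal to the original v0 and v6 equal to the original v2, which is not what the merge semantics predict.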

topperc commented 6 months ago

I don't see anything obviously wrong. What compiler are you using?

Erucaaa commented 6 months ago

> I don't see anything obviously wrong. What compiler are you using?

/opt/gcc10.2/native/lib/gcc/riscv64-linux-gnu/10.2.0/specs

But when I store the results in a new array b instead, the printed output is correct:

int32_t b[16] = {0};
vse32_v_i32m1(b, v4, vl);
vse32_v_i32m1(b+4, v5, vl);
vse32_v_i32m1(b+8, v6, vl);

topperc commented 6 months ago

I didn't know that gcc 10.2 supported RISC-V vector intrinsics, but I'm most familiar with clang.

zhongjuzhe commented 6 months ago

It seems that you are using an old, obsolete RVV intrinsic API.

For example, you should use __riscv_vse32_v_i32m1 instead of vse32_v_i32m1.

To get the latest stable RVV support, you should use the latest GCC (GCC 14).

It's simple: replace the "gcc" directory in https://github.com/riscv-collab/riscv-gnu-toolchain with https://github.com/gcc-mirror/gcc

Then build it. You will get the latest RVV support.
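As an illustration of the renamed API, here is a minimal sketch of one merge from the report written with the __riscv_-prefixed intrinsics. Note that in the prefixed API the mask argument of vmerge comes after the two source operands rather than first. The helper name merge_where_flag_is_one and its parameters are hypothetical; the scalar branch is an added fallback so the sketch also builds on toolchains that do not define __riscv_v_intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_v_intrinsic)
#include <riscv_vector.h>
#endif

/* Hypothetical helper:
 *   out[i] = (flags[i] == 1) ? on_true[i] : on_false[i]   for i in [0, n).
 * `out` is kept separate from the inputs here for simplicity. */
static void merge_where_flag_is_one(int32_t *out, const int32_t *flags,
                                    const int32_t *on_false,
                                    const int32_t *on_true, size_t n) {
    assert(out != NULL && flags != NULL);
    assert(on_false != NULL && on_true != NULL);
#if defined(__riscv_v_intrinsic)
    /* Strip-mined loop: each pass handles as many elements as vsetvl grants. */
    for (size_t done = 0; done < n;) {
        size_t vl = __riscv_vsetvl_e32m1(n - done);
        vint32m1_t f = __riscv_vle32_v_i32m1(flags + done, vl);
        vint32m1_t x = __riscv_vle32_v_i32m1(on_false + done, vl);
        vint32m1_t y = __riscv_vle32_v_i32m1(on_true + done, vl);
        vbool32_t mask = __riscv_vmseq_vx_i32m1_b32(f, 1, vl);
        /* Mask is passed after the two sources in the prefixed API. */
        vint32m1_t r = __riscv_vmerge_vvm_i32m1(x, y, mask, vl);
        __riscv_vse32_v_i32m1(out + done, r, vl);
        done += vl;
    }
#else
    /* Portable scalar fallback with the same element-wise semantics. */
    for (size_t i = 0; i < n; ++i)
        out[i] = (flags[i] == 1) ? on_true[i] : on_false[i];
#endif
}
```

Calling it with flags = {2,1,2,1}, on_false = the first row of a and on_true = the third row corresponds to the v4 merge in the report.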

kito-cheng commented 5 months ago

This appears to be a toolchain implementation bug, and the toolchain in question is not an upstream one, so I'm closing this issue.