riscv-non-isa / rvv-intrinsic-doc

https://jira.riscv.org/browse/RVG-153
BSD 3-Clause "New" or "Revised" License

How to use LMUL in rvv-intrinsic? #332

Closed Erucaaa closed 2 months ago

Erucaaa commented 2 months ago

Here is my code:

        int32_t ee[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
        size_t vl = vsetvl_e16m1 (4);
        vint32m1_t t0 = vle32_v_i32m1(ee,vl);
        vint32m1_t t1 = vle32_v_i32m1(ee+4,vl);
        vint32m1_t t2 = vle32_v_i32m1(ee+8,vl);
        vint32m1_t t3 = vle32_v_i32m1(ee+12,vl);
        vint32m4_t group = vcreate_v_i32m1_i32m4(t0,t1,t2,t3);

I want to use a group of vector registers so that each operation covers more elements. However, this fails with an error, and it seems that the vcreate intrinsic does not exist. vset_v_i32m1_i32m4 is not what I'm looking for. Are there any other intrinsics that could be used instead?

[image]

kito-cheng commented 2 months ago

Maybe just use vsetvl_e16m4 and vle32_v_i32m4?

        int32_t ee[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
        size_t vl = vsetvl_e16m4 (16);
        vint32m4_t t0 = vle32_v_i32m4(ee,vl);

topperc commented 2 months ago

vcreate was removed from the spec for a period of time. It was added back in https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/286

Your intrinsics aren't prefixed with __riscv_ so it looks like you're using an older implementation?

If you are using an older implementation, you should be able to use multiple vset intrinsics.
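For comparison, here is a minimal sketch of the same grouping written against the current `__riscv_`-prefixed intrinsics. It assumes a toolchain that reports v1.0 of the intrinsics spec through `__riscv_v_intrinsic` and provides `__riscv_vcreate_v_i32m1_i32m4`. The helper name `copy_via_group`, the `HAVE_RVV_VCREATE` macro, and the scalar fallback are illustrative only and not from this thread. The fallback exists so that the snippet also builds on a target without the vector extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: vcreate is available when the toolchain reports
 * intrinsics spec v1.0 (__riscv_v_intrinsic >= 1000000). */
#if defined(__riscv_vector) && defined(__riscv_v_intrinsic) && \
    (__riscv_v_intrinsic >= 1000000)
#include <riscv_vector.h>
#define HAVE_RVV_VCREATE 1
#endif

/* Hypothetical helper: copies n int32_t values from src to dst.
 * On RVV, each full chunk is loaded as four LMUL=1 registers, grouped
 * into one LMUL=4 value with vcreate, and stored with one vse32. */
void copy_via_group(const int32_t *src, int32_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    size_t i = 0;
#ifdef HAVE_RVV_VCREATE
    const size_t vl1 = __riscv_vsetvlmax_e32m1(); /* elements per LMUL=1 register */
    const size_t vl4 = 4 * vl1;                   /* elements per LMUL=4 group */
    for (; i + vl4 <= n; i += vl4) {
        vint32m1_t t0 = __riscv_vle32_v_i32m1(src + i, vl1);
        vint32m1_t t1 = __riscv_vle32_v_i32m1(src + i + vl1, vl1);
        vint32m1_t t2 = __riscv_vle32_v_i32m1(src + i + 2 * vl1, vl1);
        vint32m1_t t3 = __riscv_vle32_v_i32m1(src + i + 3 * vl1, vl1);
        vint32m4_t group = __riscv_vcreate_v_i32m1_i32m4(t0, t1, t2, t3);
        __riscv_vse32_v_i32m4(dst + i, group, vl4);
    }
#endif
    /* Scalar path: leftover elements, or everything on a non-RVV build. */
    for (; i < n; ++i)
        dst[i] = src[i];
}
```

The sketch fills every LMUL=1 register completely (`vsetvlmax`) rather than using a fixed `vl` of 4. If the registers were only partly filled on hardware with VLEN above 128, storing the LMUL=4 group as one contiguous run would also write the unused tail elements of each register.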

Erucaaa commented 2 months ago

> Maybe just use vsetvl_e16m4 and vle32_v_i32m4?
>
>         int32_t ee[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
>         size_t vl = vsetvl_e16m4 (16);
>         vint32m4_t t0 = vle32_v_i32m4(ee,vl);

Thanks, but I may need to process each vector register separately and then combine them into a group to store them back.
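A rough sketch of that pattern with the current `__riscv_`-prefixed intrinsics: load one LMUL=4 group, pull out individual LMUL=1 registers with `vget`, process each one on its own, write them back with `vset`, and store the whole group. The helper names `part_len` and `add_part_index`, the `USE_RVV` macro, and the "add the part index" operation are made up for illustration. The scalar fallback, which pretends each register holds four elements, is only there so the snippet builds without the vector extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the __riscv_-prefixed names exist from intrinsics v0.11 on. */
#if defined(__riscv_vector) && defined(__riscv_v_intrinsic) && \
    (__riscv_v_intrinsic >= 11000)
#include <riscv_vector.h>
#define USE_RVV 1
#endif

/* Hypothetical helper: number of int32_t elements in one "part",
 * i.e. one LMUL=1 register. */
size_t part_len(void)
{
#ifdef USE_RVV
    return __riscv_vsetvlmax_e32m1();
#else
    return 4; /* fallback: behave like VLEN=128 */
#endif
}

/* Hypothetical example operation: within each group of four parts,
 * add k to every element of part k (k = 0..3), in place. */
void add_part_index(int32_t *data, size_t n)
{
    assert(data != NULL);
    const size_t p = part_len();
    size_t i = 0;
#ifdef USE_RVV
    const size_t vl4 = 4 * p; /* elements in a full LMUL=4 group */
    for (; i + vl4 <= n; i += vl4) {
        vint32m4_t g = __riscv_vle32_v_i32m4(data + i, vl4);
        /* Extract parts 1..3; part 0 is left unchanged (+0). */
        vint32m1_t t1 = __riscv_vget_v_i32m4_i32m1(g, 1);
        vint32m1_t t2 = __riscv_vget_v_i32m4_i32m1(g, 2);
        vint32m1_t t3 = __riscv_vget_v_i32m4_i32m1(g, 3);
        /* Process each register separately. */
        t1 = __riscv_vadd_vx_i32m1(t1, 1, p);
        t2 = __riscv_vadd_vx_i32m1(t2, 2, p);
        t3 = __riscv_vadd_vx_i32m1(t3, 3, p);
        /* Put the parts back into the group and store it in one go. */
        g = __riscv_vset_v_i32m1_i32m4(g, 1, t1);
        g = __riscv_vset_v_i32m1_i32m4(g, 2, t2);
        g = __riscv_vset_v_i32m1_i32m4(g, 3, t3);
        __riscv_vse32_v_i32m4(data + i, g, vl4);
    }
#endif
    /* Scalar path with the same semantics: leftover elements,
     * or everything on a non-RVV build. */
    for (size_t base = i; i < n; ++i)
        data[i] += (int32_t)(((i - base) / p) % 4);
}
```

Because the group was loaded as LMUL=4 in the first place, `vset` can write into the existing group and no separate "create" step is needed. The index argument of `vget` and `vset` has to be a compile-time constant, which is why the parts are handled by explicit statements rather than a loop.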

Erucaaa commented 2 months ago

> vcreate was removed from the spec for a period of time. It was added back in #286
>
> Your intrinsics aren't prefixed with __riscv_ so it looks like you're using an older implementation?
>
> If you are using an older implementation, you should be able to use multiple vset intrinsics.

Thanks. But vset takes a register group as its first argument, so we're back to the original question: how do you build a register group with intrinsics?

[image]

topperc commented 2 months ago

You can use vundefined for the first vset.

        vint32m4_t A;
        A = __riscv_vset_v_i32m1_i32m4(__riscv_vundefined_i32m4(), 0, t0);
        A = __riscv_vset_v_i32m1_i32m4(A, 1, t1);
        A = __riscv_vset_v_i32m1_i32m4(A, 2, t2);
        A = __riscv_vset_v_i32m1_i32m4(A, 3, t3);

Erucaaa commented 2 months ago

> You can use vundefined for the first vset.
>
>         vint32m4_t A;
>         A = __riscv_vset_v_i32m1_i32m4(__riscv_vundefined_i32m4(), 0, t0);
>         A = __riscv_vset_v_i32m1_i32m4(A, 1, t1);
>         A = __riscv_vset_v_i32m1_i32m4(A, 2, t2);
>         A = __riscv_vset_v_i32m1_i32m4(A, 3, t3);

Thanks! That gave me the correct result.