riscv-non-isa / rvv-intrinsic-doc

https://jira.riscv.org/browse/RVG-153
BSD 3-Clause "New" or "Revised" License

[Question] Combining two vector registers with different LMUL #282

Closed Tameem97 closed 11 months ago

Tameem97 commented 11 months ago

Hi,

This was already discussed in issue #28, but I still can't figure out how to combine two vectors. Specifically, I want to combine two vuint8mf4_t vectors into a single vuint8mf2_t so that I can then perform a multiplication. For example, I have two vectors:

        vuint8mf4_t v1   = __riscv_vle8_v_u8mf4(arr_1, 8);
        vuint8mf4_t v2   = __riscv_vle8_v_u8mf4(arr_2, 8);

Now I want to combine them into v3 such that the first half of v3 consists of v1 and the second half consists of v2:

        vuint8mf2_t v3 = {v2 : v1};
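To pin down the layout being asked for, here is a minimal scalar sketch of the intended result. The name combine_ref and its parameters are hypothetical and only serve as a reference to compare a vector version against; they are not part of the RVV intrinsics API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (not an RVV intrinsic):
 * lo and hi each hold n elements, out must have room for 2*n.
 * After the call, out[0..n) == lo and out[n..2n) == hi,
 * i.e. the layout written as {v2 : v1} above. */
void combine_ref(const uint8_t *lo, const uint8_t *hi, size_t n, uint8_t *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = lo[i];       /* first half comes from v1 */
        out[n + i] = hi[i];   /* second half comes from v2 */
    }
}
```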

The methods I have already looked at are vset, which is only available for segment registers, and vlmul and zext. I need this for the multiplication; otherwise I have to load the vector that is multiplied with v3 multiple times, in smaller chunks.

Furthermore, this is similar to the vcombine intrinsics in ARM Neon.

Thanks

dzaima commented 11 months ago

        vuint8mf2_t ext1 = __riscv_vlmul_ext_v_u8mf4_u8mf2(v1);
        vuint8mf2_t ext2 = __riscv_vlmul_ext_v_u8mf4_u8mf2(v2);
        size_t vl = __riscv_vsetvlmax_e8mf2();
        vuint8mf2_t res = __riscv_vslideup_vx_u8mf2(ext1, ext2, vl/2, vl);
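One thing to watch: sliding by vl/2 puts v2 at element VLMAX of the mf4 type, which equals 8 only when VLEN is 256. If v2 should start right after the n elements that were actually loaded, a variant that slides by n with vl = 2*n may be closer to what is wanted. The sketch below is one way that could look. The helper name concat_u8mf4 and its return convention are hypothetical, and the scalar branch exists only so the sketch also builds on hosts without the V extension.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Hypothetical helper (not part of the intrinsics API).
 * Writes a[0..n) followed by b[0..n) into out, which must hold 2*n bytes.
 * Returns 0 on success, -1 if n elements do not fit in one mf4 register
 * (that check only applies on the RVV path). */
int concat_u8mf4(const uint8_t *a, const uint8_t *b, size_t n, uint8_t *out)
{
#if defined(__riscv_vector)
    /* n <= VLMAX(e8, mf4) implies 2*n <= VLMAX(e8, mf2). */
    if (n > __riscv_vsetvlmax_e8mf4())
        return -1;

    vuint8mf4_t v1 = __riscv_vle8_v_u8mf4(a, n);
    vuint8mf4_t v2 = __riscv_vle8_v_u8mf4(b, n);

    /* Widen the register type; elements beyond the mf4 part are undefined. */
    vuint8mf2_t ext1 = __riscv_vlmul_ext_v_u8mf4_u8mf2(v1);
    vuint8mf2_t ext2 = __riscv_vlmul_ext_v_u8mf4_u8mf2(v2);

    /* Elements [0, n) keep ext1; elements [n, 2n) receive ext2[0, n). */
    vuint8mf2_t res = __riscv_vslideup_vx_u8mf2(ext1, ext2, n, 2 * n);

    __riscv_vse8_v_u8mf2(out, res, 2 * n);
    return 0;
#else
    /* Portable fallback: scalar model of the same steps. */
    for (size_t i = 0; i < n; i++)
        out[i] = a[i];            /* low part already holds v1 */
    for (size_t i = n; i < 2 * n; i++)
        out[i] = b[i - n];        /* vslideup: dest[i] = src[i - offset] */
    return 0;
#endif
}
```

Storing with vl = 2*n keeps the undefined upper elements left by vlmul_ext out of the result.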

Though if you were to use LMUL ≥ 1, e.g. with inputs vuint8m1_t v1, v2, you could do:

        vuint8m2_t ext1 = __riscv_vlmul_ext_v_u8m1_u8m2(v1);
        vuint8m2_t res = __riscv_vset_v_u8m1_u8m2(ext1, 1, v2);

But I would guess that there probably is some alternative way to achieve what you want (especially considering that such vector concatenation is quite a weird operation due to varying VLEN across RISC-V V implementations).

Tameem97 commented 11 months ago

Thanks @dzaima for the answer. Yes, there are some other ways as well, but they take more intrinsics than I anticipated, so this is the only thing I was stuck on for LMUL < 1.