riscv-non-isa / riscv-c-api-doc

Documentation of the RISC-V C API
https://jira.riscv.org/browse/RVG-4
Creative Commons Attribution 4.0 International

Constraints for vector tuple types #43

Open nick-knight opened 1 year ago

nick-knight commented 1 year ago

How do we pass vector tuple types to/from extended asm templates? It seems that using the insertion/extraction intrinsics (with the first tuple element) might be unsafe.

@kito-cheng @leekillough

kito-cheng commented 1 year ago

It's safe to use the vr constraint for tuple types; the compiler recognizes the type and uses the right register information. This works on GCC trunk, but Clang trunk seems to ... crash.

#include <riscv_vector.h>

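void foo(void) {
    vint32m1x2_t v1, v2;
    /* Tuple operands take the ordinary "vr" constraint; the template is only an assembly comment. */
    asm volatile ("# %0 %1" : "=vr"(v1) : "vr"(v2));
}

leekillough commented 1 year ago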

But how to pass certain fields of tuples as non-tuple vector registers?

If (v0,v1) is a tuple called vx, how do I pass vx.v0 or vx.v1 to inline assembly or non-segment intrinsics?

Depositing/extracting vectors from tuple aggregate types seems to defeat the purpose of segment loads/stores, unless it's just massaging for the compiler and introduces no new instructions (moves).

kito-cheng commented 1 year ago

> But how to pass certain fields of tuples as non-tuple vector registers?
>
> If (v0,v1) is a tuple called vx, how do I pass vx.v0 or vx.v1 to inline assembly or non-segment intrinsics?
>
> Depositing/extracting vectors from tuple aggregate types seems to defeat the purpose of segment loads/stores, unless it's just massaging for the compiler and introduces no new instructions (moves).

Yes: use vget/vset to extract vectors from, or deposit them into, tuple types. The compiler will try to allocate the same registers so that no extra move instructions are needed. If you see a move instruction that you think is unnecessary, you can report a bug to the LLVM or GCC community, since it may be a performance regression.
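
A minimal sketch of the vget/vset pattern described above, assuming the v1.0 `__riscv_`-prefixed intrinsic names. The function name `conj_i32` and the interleaved (re, im) layout are illustrative assumptions, not something from this thread. It loads a two-field tuple with a segment load, extracts field 1, runs a non-segment intrinsic on it, and deposits it back. The scalar branch exists only so the sketch builds on targets without the vector extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Hypothetical helper: negate the second element of each of n interleaved
 * (re, im) int32 pairs, in place. */
void conj_i32(int32_t *data, size_t n) {
    assert(data != NULL || n == 0);
#if defined(__riscv_vector)
    for (size_t vl; n > 0; n -= vl, data += 2 * vl) {
        vl = __riscv_vsetvl_e32m1(n);
        /* Segment load: field 0 receives the re values, field 1 the im values. */
        vint32m1x2_t t = __riscv_vlseg2e32_v_i32m1x2(data, vl);
        /* Extract field 1 as a plain (non-tuple) vector. */
        vint32m1_t im = __riscv_vget_v_i32m1x2_i32m1(t, 1);
        /* Any non-segment intrinsic can now operate on it. */
        im = __riscv_vneg_v_i32m1(im, vl);
        /* Deposit the result back into field 1 of the tuple. */
        t = __riscv_vset_v_i32m1_i32m1x2(t, 1, im);
        __riscv_vsseg2e32_v_i32m1x2(data, t, vl);
    }
#else
    /* Scalar fallback with the same effect, for non-RVV builds. */
    for (size_t i = 0; i < n; i++) {
        data[2 * i + 1] = -data[2 * i + 1];
    }
#endif
}
```

Whether the vget/vset pair compiles away without vector moves depends on the compiler's register allocation, so it is worth checking the generated assembly.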

leekillough commented 1 year ago

The tuple intrinsic type, since it's already a type outside of C/C++ proper, could have array indexing tuple[0 .. NFIELDS-1], and this would be a lot more straightforward. It would return an lvalue of a numbered field, and it would be a constraint violation to be outside of the range 0 .. NFIELDS-1 (or to use a value which isn't a compile-time constant).

Even if array subscripting is not practical, some intrinsic like __rvv_tuple_field() to return a numbered tuple field as an lvalue, which can be assigned to or converted to an rvalue, would be more intuitive than inserting or extracting, which sometimes requires creating extra variables that hopefully the compiler will merge with the tuples'.

Porting code which used the old syntax would also be a lot easier, since you would only need to replace things like xvec_real with xvec[0] or __rvv_tuple_field(xvec, 0), and xvec_imag with xvec[1] or __rvv_tuple_field(xvec, 1). It would work whether xvec[0] and xvec[1] ended up on the LHS or RHS of an assignment, and would not need to create new temporary local variables of vector type, or require the compiler to assign them to the tuple fields' same vector registers -- it would just access them directly.

kito-cheng commented 1 year ago

@leekillough honestly, we've considered adding subscripting syntax for tuple types. I can imagine it would be useful and much simpler for users, but unfortunately we lack the engineering resources to implement it :(

leekillough commented 1 year ago

> @leekillough honestly, we've considered adding subscripting syntax for tuple types. I can imagine it would be useful and much simpler for users, but unfortunately we lack the engineering resources to implement it :(

Here's a preview of what not having such a feature would require doing, unless I'm missing something:

Use the new tuple intrinsics to get rid of build errors in X280 BLIS