riscv-non-isa / riscv-elf-psabi-doc

A RISC-V ELF psABI Document
https://jira.riscv.org/browse/RVG-4
Creative Commons Attribution 4.0 International

Specify the Calling Convention for Fixed-Length Vectors #406

Closed kito-cheng closed 10 months ago

kito-cheng commented 10 months ago

Previously, there was no mention of fixed-length vectors in the psABI. Unfortunately, GCC and Clang have implemented support for fixed-length vectors differently for a long time. We should address this soon, as the RISC-V ecosystem is expanding and optimizing programs with fixed-length vectors, especially with the upcoming vector extension, is becoming more common.

There are two options for passing arguments with fixed-length vectors:

  1. The first is passing as an array type, which always involves passing by reference.
  2. The second is passing as a struct type. This may involve passing the value by reference, integer registers, or floating-point registers, depending on the size of the fixed-length vector, similar to a struct.

GCC uses the first method, while Clang uses the second.
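The two options can be compared with a small classifier. This is only a sketch: the enum and function names are made up for illustration, and the struct case models just the integer calling convention's size thresholds (one register up to XLEN bits, a register pair up to 2×XLEN bits, by reference above that), ignoring the floating-point register case mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Where an argument ends up. These names are illustrative only and do
   not come from the psABI text. */
enum arg_location {
    ARG_ONE_GPR,      /* fits in a single integer register */
    ARG_TWO_GPRS,     /* fits in a pair of integer registers */
    ARG_BY_REFERENCE  /* caller passes the address of the value */
};

/* Option 1: treat the vector like an array, so it always goes by reference. */
static enum arg_location classify_as_array(size_t size_bytes, unsigned xlen_bits)
{
    (void)size_bytes;
    (void)xlen_bits;
    return ARG_BY_REFERENCE;
}

/* Option 2: treat the vector like a struct under the integer calling
   convention (assumption: floating-point register passing is ignored). */
static enum arg_location classify_as_struct(size_t size_bytes, unsigned xlen_bits)
{
    assert(xlen_bits == 32 || xlen_bits == 64);
    size_t xlen_bytes = xlen_bits / 8;
    if (size_bytes <= xlen_bytes)
        return ARG_ONE_GPR;
    if (size_bytes <= 2 * xlen_bytes)
        return ARG_TWO_GPRS;
    return ARG_BY_REFERENCE;
}
```

Under this model a 16-byte vector on RV64 lands in a register pair with the struct rule, while the array rule sends it by reference regardless of size.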

The drawback of the second option is the extra overhead required when generating code with vector instructions.

Consider the following code as an example. GCC passes a and b by reference. Because the result is also returned by reference, a0 carries the address of the return slot and a1 and a2 carry the addresses of a and b, as the first listing below shows. Clang, however, passes a and b through integer registers: a0 and a1 hold a, and a2 and a3 hold b.

typedef int int32x4_t __attribute__((vector_size(16)));

int32x4_t foo(int32x4_t a, int32x4_t b) {
  int32x4_t ret = a + b;
  return ret;
}

The code generation with vector extension for pass-by-reference is straightforward: load the value from the pointer, perform the operation, and then store the result.

_Z3fooDv4_iS_:
        vsetivli        zero,4,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vle32.v v2,0(a2)
        vadd.vv v1,v1,v2
        vse32.v v1,0(a0)
        ret
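In C terms, the listing above behaves roughly like the hand-lowered function below. This is a sketch, not compiler output: foo_by_reference is a hypothetical name, and the mapping of parameters to a0, a1 and a2 is read off the listing.

```c
#include <assert.h>

typedef int int32x4_t __attribute__((vector_size(16)));

/* Hypothetical hand-lowered form of foo() under pass-by-reference.
   ret plays the role of a0, a of a1, and b of a2 in the listing. */
static void foo_by_reference(int32x4_t *ret, const int32x4_t *a,
                             const int32x4_t *b)
{
    assert(ret && a && b);
    int32x4_t va = *a;  /* vle32.v v1,0(a1) */
    int32x4_t vb = *b;  /* vle32.v v2,0(a2) */
    *ret = va + vb;     /* vadd.vv, then vse32.v v1,0(a0) */
}
```

Each argument costs one vector load, and the result costs one vector store.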

However, the code generation with the vector extension for the struct approach is considerably more complicated. It needs instructions such as vslide1down.vx and vmv.x.s to move data between integer and vector registers, which generally results in poor performance.

_Z3fooDv4_iS_:
        vsetivli        zero, 2, e64, m1, ta, ma
        vslide1down.vx  v8, v8, a0
        vslide1down.vx  v8, v8, a1
        vslide1down.vx  v9, v8, a2
        vslide1down.vx  v9, v9, a3
        vsetivli        zero, 4, e32, m1, ta, ma
        vadd.vv v8, v9, v8
        vsetivli        zero, 1, e64, m1, ta, ma
        vmv.x.s a0, v8
        vslidedown.vi   v8, v8, 1
        vmv.x.s a1, v8
        ret
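The extra work in the listing above can be modelled in portable C. This sketch assumes RV64, where a 16-byte vector occupies a pair of 64-bit integer registers. The gpr_pair type and the function names are hypothetical stand-ins for the register pairs and for the instruction sequences.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int int32x4_t __attribute__((vector_size(16)));

/* Two 64-bit values standing in for a register pair such as a0/a1. */
struct gpr_pair {
    uint64_t lo;
    uint64_t hi;
};

static_assert(sizeof(int32x4_t) == sizeof(struct gpr_pair),
              "a 16-byte vector must fill exactly two 64-bit registers");

/* Split the vector into two 64-bit halves (what a caller has to do, and
   what the vmv.x.s / vslidedown.vi sequence does for the return value). */
static struct gpr_pair pack_into_gprs(int32x4_t v)
{
    struct gpr_pair p;
    memcpy(&p.lo, (const char *)&v, 8);
    memcpy(&p.hi, (const char *)&v + 8, 8);
    return p;
}

/* Rebuild the vector from the two halves (the vslide1down.vx sequence). */
static int32x4_t unpack_from_gprs(struct gpr_pair p)
{
    int32x4_t v;
    memcpy((char *)&v, &p.lo, 8);
    memcpy((char *)&v + 8, &p.hi, 8);
    return v;
}

/* foo() as seen through integer register pairs. */
static struct gpr_pair foo_in_gprs(struct gpr_pair a, struct gpr_pair b)
{
    int32x4_t sum = unpack_from_gprs(a) + unpack_from_gprs(b);
    return pack_into_gprs(sum);
}
```

The single vector add is surrounded by two unpack steps and one pack step, which is the overhead that the by-reference form avoids.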

Therefore, this PR proposes that fixed-length vectors should always be passed by reference.

We also plan to submit another proposal to address the issue of passing fixed-length vectors via vector registers.

kito-cheng commented 10 months ago

cc. @palmer-dabbelt @JeffreyALaw @preames @topperc @rofirrim

zhongjuzhe commented 10 months ago

I think @lhtin should be aware of this

lhtin commented 10 months ago

> Therefore, this PR proposes that fixed-length vectors should always be passed by reference.
>
> We also plan to submit another proposal to address the issue of passing fixed-length vectors via vector registers.

Why are there two proposals for fixed-length vectors? My suggestion is to explicitly forbid the passing of fixed-length vectors first, so that it will be compatible with the proposal that passes fixed-length vectors via vector registers. Thus, it is guaranteed that there will be no compatibility problems when a new proposal is put forward later.

For example, suppose one part of a program is compiled before the new proposal is adopted, and another part is compiled by a compiler that implements it. If fixed-length vectors were previously passed through pointers and are now passed through vector registers, then callers and callees that were not compiled under the same convention will disagree on where the arguments are.
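A toy model of that mismatch, as a sketch only: the machine struct below stands in for one integer argument register and one vector argument register, and the choice of a0 and v8 and all function names are illustrative assumptions, not part of any proposal.

```c
#include <assert.h>
#include <stdint.h>

typedef int int32x4_t __attribute__((vector_size(16)));

/* Toy machine state with one integer and one vector argument register. */
struct machine {
    uintptr_t a0;  /* integer argument register */
    int32x4_t v8;  /* vector argument register */
};

/* Caller built under the older convention: passes the address in a0. */
static void old_caller_pass(struct machine *m, const int32x4_t *arg)
{
    assert(m && arg);
    m->a0 = (uintptr_t)arg;
}

/* Caller built under the newer convention: passes the value in v8. */
static void new_caller_pass(struct machine *m, const int32x4_t *arg)
{
    assert(m && arg);
    m->v8 = *arg;
}

/* Callee built under the newer convention: reads v8 and ignores a0. */
static int new_callee_first_lane(const struct machine *m)
{
    assert(m);
    return m->v8[0];
}
```

When an old caller is paired with a new callee, the callee reads whatever happened to be left in v8, because the caller only wrote a0.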

kito-cheng commented 10 months ago

> Why are there two proposals for fixed-length vectors? My suggestion is to explicitly forbid the passing of fixed-length vectors first

The problem is that this would be a hard ABI break, and it would also make fixed-length vectors unusable without the vector extension. Some projects would break immediately, since rv64gc is still the default for most RISC-V Linux distributions.

> it will be compatible with the proposal that passes fixed-length vectors via vector registers. Thus, it is guaranteed that there will be no compatibility problems when a new proposal is put forward later.

My thought is to require an explicit function attribute for passing them via vector registers. That is not ideal, but it is compatible with the existing ABI.

kito-cheng commented 10 months ago

A bit more explanation of why we made a kind of hard break for scalable vectors but are not taking the same approach for fixed-length vectors: scalable vectors are only available when the vector extension is present, but fixed-length vectors can be used even when it is not, so making them incompatible would cause many more problems.

kito-cheng commented 10 months ago

Quick note from the LLVM sync-up meeting: this is a GCC bug and should be fixed on the GCC side. GCC passes these by reference when the vector extension is enabled.

kito-cheng commented 10 months ago

Dropping this PR. I will create another PR to clarify that fixed-length vectors should follow the rules for aggregates.