Closed eopXD closed 10 months ago
I think for some things (maskedoff
especially, mask
, unary ops, cases where the types make clear the meaning) it could make sense. But for non-commutative things I think it'd be unnecessarily confusing, e.g. I believe a subtract would end up as vint32m1_t __riscv_vsub_vv_i32m1(vint32m1_t vs2, vint32m1_t vs1, size_t vl)
, where the vs2
before vs1
doesn't really make sense and might incorrectly make one think it'd do vs1-vs2
. (I guess the assembly in the V spec names/orders them as such to be able to place the register/immediate arg in the rs1
spot, but this doesn't at all affect users of intrinsics)
I don't think intrinsics users will necessarily know assembly (though of course a good portion will), much less have read the spec which is the main place that currently uses vd
/vs1
/vs2
(in disassembly you'd just see some vop.vv v0, v1, v2
, which could be either vop vd, vs2, vs1
or vop vd, vs1, vs2
depending on the op).
I'm possibly not a representative sample of intrinsics users, but from my usage of x86 and, to a lesser extent, ARM NEON intrinsics, I still sometimes don't know the x86 assembly arguments, and have basically no knowledge on ARM NEON assembly (though, to be fair, those two are more complicated than RVV, and if the intrinsics informed you of the mapping, it could be easy to learn).
I think for some things (
maskedoff
especially,mask
, unary ops, cases where the types make clear the meaning) it could make sense. But for non-commutative things I think it'd be unnecessarily confusing, e.g. I believe a subtract would end up asvint32m1_t __riscv_vsub_vv_i32m1(vint32m1_t vs2, vint32m1_t vs1, size_t vl)
, where thevs2
beforevs1
doesn't really make sense and might incorrectly make one think it'd dovs1-vs2
. (I guess the assembly in the V spec names/orders them as such to be able to place the register/immediate arg in thers1
spot, but this doesn't at all affect users of intrinsics)
Comparing to the existing naming of op1
and op2
(vint32m1_t __riscv_vsub_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vl);
) pointing the parameters conventions to assembly mnemonics is better than nothing.
I don't think intrinsics users will necessarily know assembly (though of course a good portion will), much less have read the spec which is the main place that currently uses
vd
/vs1
/vs2
(in disassembly you'd just see somevop.vv v0, v1, v2
, which could be eithervop vd, vs2, vs1
orvop vd, vs1, vs2
depending on the op).
If it is safe to say RVV intrinsic users aim to write vector code with that exact semantic of that specific RVV instruction, then I think the user should at least have a sense of what assembly is. The intrinsics specification will point users to the V extension specification for reference, as it already does while explaining the control to the vector programming model.
v-spec also got misaligned parts, some places got vd, vs2, vs1
while other places got vd, vs1, vs2
, but I think the intrinsics specification should not introduce other ambiguity and respect whatever is in the V extension.
Comparing to the existing naming of
op1
andop2
(vint32m1_t __riscv_vsub_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vl);
) pointing the parameters conventions to assembly mnemonics is better than nothing.
It could be said that the current op1
and op2
point to the n-th operand, i.e. vsub.vv vd, op1, op2
in disassembly, which is a direct mapping to a thing that everyone can utilize, whereas vs1
/vs2
are things only within the spec document, which, after enough time, will be one of many sources on RVV. I'm kind of pushing this primarily because, at the start of having skimmed the spec, for a decent bit of time (until I got clang & qemu working & triple-checked the behavior), I did actually think that vsub.vv vd, vs2, vs1
might do vd = vs1-vs2
and the name ordering was weird specifically to point out that the arguments are in reverse order, and was thinking whether the intrinsics reorder them back or not; it's possible that others might not fall in this pitfall.
that exact semantic of that specific RVV instruction
For some of the more unusual operations you may have to look up stuff, but for integer/float arith & unit-stride load/store (i.e. enough ops for most things) it's very much feasible & reasonable to just guess what they're gonna do.
I think the user should at least have a sense of what assembly is. The intrinsics specification will point users to the V extension specification for reference
Many used to ARM/Intel specifications might just automatically go looking for some other resource. The RVV spec happens to be quite readable, but even then purpose-made tutorials are gonna be even more approachable, and might not ever mention vs2
/vs1
(e.g. always using numbered registers or using a/b/c/x/y/z).
respect whatever is in the V extension
That is fair, and it could certainly add to confusion that the intrinsics and the V spec document use different conventions, but I think it's also a fair option to diverge to hide a confusing little detail whose reason doesn't affect intrinsics.
At the core here I guess is the target audience of intrinsics. Someone writing code for x86 & ARM & RVV probably won't dive too deep into any single one of them, and would heavily benefit from the intrinsics being self-contained and easily skimmable & immediately understandable, while someone focused on RVV & RISC-V specifically could benefit from uniformity across conventions.
And then there's the question of who this specific document is for. If that goal is advanced users, some less-advanced-user-friendly thing (incl. my viewer, which I want to be easily usable by anyone who knows C) could also just go with a different naming scheme.
Is the vl
argument really VL or is it AVL? Is it required to be <= VLMAX?
It serves the same role as "AVL" in the vset{i}vl{i}
(assembly) instructions.
This is recently addressed in #271.
This proposal does not functionally change any behavior of the RVV intrinsics. This proposal hopes to adjust the naming scheme of the parameter in the intrinsic functions to avoid ambiguity in the specification.
Introduction
Like Rich has raised questions of the
maskedoff
parameter passed to theTUMU
variants in this week's meeting, and also other issues like #258 has raised confusion thatmaskedoff
parameter is presented in an unmasked intrinsic serving as the passthrough register for tail elements.Browsing through the list of intrinsic prototypes the specification provides, the specification currently have several different naming-s to the intrinsic function parameters.
Listing a few immediate examples here for the rich naming-s we have for the parameters.
For example, for load intrinsics we have the following, and the store intrinsics with similar naming.
Intrinsics that map to computation that already has
vd
as input operand in its instruction semantics keep theirvd
naming. On the other hand, intrinsics that do not takevd
when policy is not specified as undisturbed, the passthrough operand is namedmaskedoff
.vslideup
intrinsics have the destination register named assrc
anddest
.Intrinsics that maps to instructions that are non-commutable have naming-s like
op1
,op2
. The current specification does not explain clearly on how they map to its assembly level corresponding (vs2
andvs1
/rs1
).Reduction intrinsics have naming like
vector
/scalar
.Proposal
Adjust the intrinsic parameter names to be identical to what is specified in the V extension specification. Using names like
vd
,vs2
,vs1
,rs1
,vm
.The benefit of this rename is that users can be more clear how their RVV value in the intrinsics will be mapped to its corresponding assembly form.
The fallback of this rename is that users will need to check out the V extension specification to understand when to be clear of what they are controlling.
However I think that given the intrinsic users should be advanced users that are aware that they are leveraging low level (assembly level) semantic of the RISC-V V extension in the C language level, I think it is worth the tradeoff for the intrinsic specification to depend on the naming of the V extension specification, and to not invent its own naming.
Note
The corner case of this proposed naming scheme is the fault-only first load instructions, where
vl
will be modified in the instruction. In the intrinsics, we pass a pointernew_vl
for the new vl to be returned by the intrinsic function. For this part, I think we can keep it the same.