Closed eopXD closed 9 months ago
@Amanieu Please checkout the added notes to uninitialized/vundefined
values in (https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/292/commits/68f125b818d5570f7bbc81182e6c413f1bda1d88).
I'm not sure I understand why vundefined intrinsics exist. I thought some of their original usage was due to some intrinsic signatures that took passthru operands before we defined explicit policy intrinsics. And we needed to use vundefined to get tail agnostic on those intrinsics. Hopefully that's been resolved with the current interface? If not we should fix the interface.
Some conversations I've had seem to suggest that vundefined could be used to suppress -Wunitialized by explicitly stating that the variable is undefined. But I don't think that was the original intent and I don't know if I would advocate for that usage.
@topperc One of my goals with the policy intrinsics was to eliminate the vundefined
intrinsics.
I did not anticipate that they still had a use in register-group fusion and tuple construction via the vset
intrinsics, as a way of suppressing -Wuninitialized
. However, I think this usage is now unnecessary given the vcreate
intrinsics:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/256
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/288
There may be cases where we wish to incrementally populate a register group or tuple, where we might need to repeatedly pack with vcreate
, unpack with vget
, and then repack with another, larger vcreate
. Perhaps this could be cleaner with vundefined
. But it needn't impact code-gen.
I do think there's a little weirdness with the vlmul_ext
intrinsics, some cases of which seem to emulate a vset
or vcreate
with vundefined
arguments. That is, if the vundefined
functionality is truly unnecessary, then couldn't (at least some of) the vlmul_ext
intrinsics be dropped as well? Thinking about this further, perhaps only the fractional LMUL vlmul_ext
and vlmul_trunc
intrinsics are necessary.
For the case of tuples or building a register group, vcreate
does indeed pretty much replace vundefined
+vget
+vset
, except of course the case where you do want undefined elements passed to the following operation. Though I don't know how often, if at all, that'd be a real need (something like, vmerge
ing a bunch of things together in a loop, and starting the loop with a vundefined
? but you can just peel back the first iteration. A reasonably-realistic thing might be doing a vrgather
where the two vector arguments don't have the same LMUL-SEW-ration (which they don't at all need to for reasonable functionality; of course it'd be better if RVV1.0 allowed specifying the LMUL the operands separately for this, but ¯\_(ツ)_/¯)).
But, for what it's worth, you can get a semantically-identical result to e.g. __riscv_vundefined_u32m1()
with __riscv_vid_v_u32m1(0)
(it's shorter even!), so undefined values stay regardless of if they're via a clear vundefined
or not.
@topperc @nick-knight @dzaima
I think that "the redundancy of vundefined
" should belong to a separate thread so we can converge this pull request. The note on "agnostic values" covers vundefined
because it exists now in the specification. If any further decision is made the note will be adjusted.
@Amanieu Just updated the description. I realize its not about the stack or memory address the spec should be described, but the definition of what are the agnostic (indeterminate) values we are dealing with in the RVV C intrinsics.
_Float16
/float
/double
for scalar types of floating-point types with width of 16/32/64. Although we have<stdfloat.h>
[0] introduced in C++23, this is not fully supported in the upstream compilers (LLVM/GCC). This is fixed by leveraging theTypeHelper
class instead of deriving locally in the templates.These changes will be cherry-picked to
v1.0.x
branch.[0] https://en.cppreference.com/w/cpp/types/floating-point