riscv-non-isa / rvv-intrinsic-doc

https://jira.riscv.org/browse/RVG-153
BSD 3-Clause "New" or "Revised" License
281 stars 89 forks source link

Fixes for the specification #292

Closed eopXD closed 9 months ago

eopXD commented 10 months ago

These changes will be cherry-picked to v1.0.x branch.

[0] https://en.cppreference.com/w/cpp/types/floating-point

eopXD commented 10 months ago

@Amanieu Please checkout the added notes to uninitialized/vundefined values in (https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/292/commits/68f125b818d5570f7bbc81182e6c413f1bda1d88).

eopXD commented 10 months ago

Rebased and added commit c33f99f that addresses Rogers comments.

topperc commented 10 months ago

I'm not sure I understand why vundefined intrinsics exist. I thought some of their original usage was due to some intrinsic signatures that took passthru operands before we defined explicit policy intrinsics. And we needed to use vundefined to get tail agnostic on those intrinsics. Hopefully that's been resolved with the current interface? If not we should fix the interface.

Some conversations I've had seem to suggest that vundefined could be used to suppress -Wunitialized by explicitly stating that the variable is undefined. But I don't think that was the original intent and I don't know if I would advocate for that usage.

nick-knight commented 10 months ago

@topperc One of my goals with the policy intrinsics was to eliminate the vundefined intrinsics.

I did not anticipate that they still had a use in register-group fusion and tuple construction via the vset intrinsics, as a way of suppressing -Wuninitialized. However, I think this usage is now unnecessary given the vcreate intrinsics: https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/256 https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/288 There may be cases where we wish to incrementally populate a register group or tuple, where we might need to repeatedly pack with vcreate, unpack with vget, and then repack with another, larger vcreate. Perhaps this could be cleaner with vundefined. But it needn't impact code-gen.

I do think there's a little weirdness with the vlmul_ext intrinsics, some cases of which seem to emulate a vset or vcreate with vundefined arguments. That is, if the vundefined functionality is truly unnecessary, then couldn't (at least some of) the vlmul_ext intrinsics be dropped as well? Thinking about this further, perhaps only the fractional LMUL vlmul_ext and vlmul_trunc intrinsics are necessary.

dzaima commented 10 months ago

For the case of tuples or building a register group, vcreate does indeed pretty much replace vundefined+vget+vset, except of course the case where you do want undefined elements passed to the following operation. Though I don't know how often, if at all, that'd be a real need (something like, vmergeing a bunch of things together in a loop, and starting the loop with a vundefined? but you can just peel back the first iteration. A reasonably-realistic thing might be doing a vrgather where the two vector arguments don't have the same LMUL-SEW-ration (which they don't at all need to for reasonable functionality; of course it'd be better if RVV1.0 allowed specifying the LMUL the operands separately for this, but ¯\_(ツ)_/¯)).

But, for what it's worth, you can get a semantically-identical result to e.g. __riscv_vundefined_u32m1() with __riscv_vid_v_u32m1(0) (it's shorter even!), so undefined values stay regardless of if they're via a clear vundefined or not.

eopXD commented 9 months ago

@topperc @nick-knight @dzaima I think that "the redundancy of vundefined" should belong to a separate thread so we can converge this pull request. The note on "agnostic values" covers vundefined because it exists now in the specification. If any further decision is made the note will be adjusted.

@Amanieu Just updated the description. I realize its not about the stack or memory address the spec should be described, but the definition of what are the agnostic (indeterminate) values we are dealing with in the RVV C intrinsics.