Closed camel-cdr closed 1 year ago
__riscv_vmfne_vv_f64m1_b64(vec_a, vec_zero, vl);
in rvv_reduce.c
could also be replaced with __riscv_vmfne_vf_f64m1_b64
Nit: May you add "Resolve #XXX" into the commit messages so they can be interlinked with the issue?
@eopXD Is it important that it's in the correct commit, or can just add that in a commit renaming branch to branch_vec? I haven't done a rebase & amend & force push with an existing PR, so idk how that would behave.
@eopXD Is it important that it's in the correct commit, or can just add that in a commit renaming branch to branch_vec? I haven't done a rebase & amend & force push with an existing PR, so idk how that would behave.
Yes, please rebase and amend it then force push again. Thank you.
The only objection I have is that I don't find abbreviating src/dst to s/d very informative.
I haven't addressed this yet, what name would you suggest?
I've also wondered about the choices of LMUL, was the idea to show a few different ones? Because usually maximizing LMUL given the number of required registers seems to be the best option.
The only objection I have is that I don't find abbreviating src/dst to s/d very informative.
I haven't addressed this yet, what name would you suggest?
Welp, maybe input_src
/input_dst
for parameter and src
/ dst
for iteration?
I've also wondered about the choices of LMUL, was the idea to show a few different ones? Because usually maximizing LMUL given the number of required registers seems to be the best option.
Yes the intent was to show different LMUL's.
Ok, I've now used source
, destination
, src
, dst
.
This resolves https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/252, https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/244, unsigned char instead of char, and cleans up the code a bit.
The problem with using
char
in combination withvint8mX_t
is thatchar
can be either signed or unsigned, so to remedy this problem, I've changed the examples to useunsigned char
explicitly (andvuint8m8_t
).rvv_reduce.c had an error, where a agnostic tail policy could make it produce the wrong results, this has been fixed by using the
_tu
intrinsic variant.