The following two functions have identical behavior:
```c
#include <riscv_vector.h>

vbool32_t vmsgeu(vuint32m1_t op1, size_t vl) {
    return __riscv_vmsgeu_vx_u32m1_b32(op1, 100, vl);
}

vbool32_t vmsgtu(vuint32m1_t op1, size_t vl) {
    return __riscv_vmsgtu_vx_u32m1_b32(op1, 99, vl);
}
```
but clang generates different code for them, and the `vmsgeu` version is less efficient (it needs an extra `vmnot.m`):
```asm
vmsgeu:
        li      a1, 100
        vsetvli zero, a0, e32, m1, ta, ma
        vmsltu.vx       v8, v8, a1
        vmnot.m v0, v8
        ret
vmsgtu:
        li      a1, 99
        vsetvli zero, a0, e32, m1, ta, ma
        vmsgtu.vx       v0, v8, a1
        ret
```
https://riscv.godbolt.org/z/63rTTanMY
Additionally, for a dynamic value known not to be the minimum value of the given integer type, it would presumably be better to decrement it in a GPR than to negate the mask, e.g. https://riscv.godbolt.org/z/14qPd643P (especially if the decrement can be hoisted out of a loop).
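
For reference, here is a minimal scalar sketch of the rewrite being proposed and of why the "not the minimum value" condition matters. It only models the per-element unsigned comparison in plain C, not the vector instructions. The helper names (`cmp_geu`, `cmp_gtu_dec`, `count_mismatches`, `self_check`) are made up for illustration and are not part of any intrinsics API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-element model of an unsigned "x >= c" compare. */
static bool cmp_geu(uint32_t x, uint32_t c) { return x >= c; }

/* Rewritten form: "x > c - 1". For c == 0 the decrement wraps to
 * UINT32_MAX, so this form is only equivalent when c != 0. */
static bool cmp_gtu_dec(uint32_t x, uint32_t c) { return x > (uint32_t)(c - 1u); }

/* Counts how many boundary samples give different results for a given c. */
static int count_mismatches(uint32_t c) {
    static const uint32_t samples[] = {0u, 1u, 98u, 99u, 100u, 101u,
                                       UINT32_MAX - 1u, UINT32_MAX};
    int mismatches = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (cmp_geu(samples[i], c) != cmp_gtu_dec(samples[i], c)) {
            ++mismatches;
        }
    }
    return mismatches;
}

/* The two forms agree for any nonzero c and disagree for c == 0. */
static void self_check(void) {
    assert(count_mismatches(100u) == 0);
    assert(count_mismatches(1u) == 0);
    assert(count_mismatches(UINT32_MAX) == 0);
    assert(count_mismatches(0u) != 0);
}
```

With the constant 100 from the example above the two forms always agree, so the decrement is safe. For a runtime value the compiler would need to know it is nonzero (for unsigned types) before applying the same rewrite.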