Closed Incarnation-p-lee closed 1 year ago
Yes, you can use the vfcvt family to vectorize the lrint family. You may need the widening and narrowing flavors. And keep in mind that long is 32 bits in the ILP32 ABI.
The rint family is trickier. (A vector analogue of Zfa might help.)
Thanks @nick-knight for the confirmation. Yes, a long return type has different sizes under ilp32 and lp64, while int and long long return types don't have a similar issue here.
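To make the ABI dependence concrete, here is a minimal C sketch that reports the result width of lrint and llrint on whatever target it is compiled for. The helper names are made up for illustration. Under ILP32 the first helper should yield 32 and under LP64 it should yield 64, while the second does not change between the two.

```c
#include <limits.h>
#include <math.h>

/* Hypothetical helpers: width in bits of each function's return type.
   sizeof does not evaluate its operand, so lrint/llrint are never called. */
int lrint_result_bits(void)
{
    return (int)(sizeof(lrint(0.0)) * CHAR_BIT);   /* long: ABI dependent */
}

int llrint_result_bits(void)
{
    return (int)(sizeof(llrint(0.0)) * CHAR_BIT);  /* long long: at least 64 */
}
```

A vectorizer would need the same information to pick the destination element width, which is why the lrint family may need a widening or narrowing conversion on one ABI but not on the other.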
Regarding widening and narrowing: take lrintf16, i.e. FP16 to INT64, as an example. I suppose there are at least two options here. Do you have any suggestions? Thanks again for the help.
option 1:
FP16 => FP32
FP32 => INT64
option 2:
FP16 => INT32
INT32 => INT64
What behavior do you want for exceptional inputs (Infs and NaNs)? My understanding is that it's implementation-defined for the lrint family.
Yes, it is. The manual for the lrint family indicates that the return value for exceptional inputs (Inf, NaN, or too large) is unspecified. So it looks like both options are correct.
Thanks, Nick. Closing this issue as there are no more questions.
Consider the code below, which invokes __builtin_lrint (aka lrint in math.h).
For test_lrint_scalar, we may generate asm with -ffast-math as below.
But for test_lrint_vec, I suppose it is possible to leverage vfcvt.x.f.v v,v for vectorization. AFAIK, the semantics of the conversion from FP to INT should be almost the same between scalar and vector. Then we may have here.