Closed: RalfJung closed this 2 months ago
I believe it's not UB in the current LLVM backend, but I'm not sure if that was intentional.
Yeah, the current backend does trunc.
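To make the truncating behavior concrete, here is a minimal scalar sketch of a bitmask select that never reads the bits beyond the vector length. It is only a model, not the backend's codegen: the function name `select_bitmask_truncating` is hypothetical, the lane type is fixed to `i32` for brevity, and it assumes bit i of the mask controls lane i (little-endian bit order).

```rust
/// Scalar model of a bitmask select over N lanes (N <= 8).
/// Assumption: bit `i` of `mask` controls lane `i` (little-endian bit order).
/// Bits at positions >= N are never inspected, which mimics truncating the mask.
fn select_bitmask_truncating<const N: usize>(
    mask: u8,
    if_true: [i32; N],
    if_false: [i32; N],
) -> [i32; N] {
    assert!(N <= 8, "a u8 bitmask covers at most 8 lanes");
    // Start from the "false" values and overwrite the selected lanes.
    let mut out = if_false;
    for lane in 0..N {
        if (mask >> lane) & 1 == 1 {
            out[lane] = if_true[lane];
        }
    }
    out
}
```

Under this model, `0b1111_0101` and `0b0000_0101` give the same result for a 4-lane vector, because the upper four bits are ignored.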
However, @workingjubilee mentioned future plans for other bitmask-taking intrinsics where this would be UB, so it might make sense to make it UB here as well.
Yes, specifically I was thinking about possibly executing masked loads with a bitvector. We could require it to implicitly trunc to the right size but it seems simpler in terms of codegen to just say the bitvector should probably specify reading the exact number of values you want, even if read as a larger number. That way if you just slam the number into a register and do the masked load, it always works correctly.
The docs for the intrinsic also explicitly say "padding bits must be all zero", so I think currently, this has a very clear answer -- out-of-bounds masks are UB.
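As a rough illustration of the "padding bits must be all zero" requirement, a caller could check a `u8` mask like this before handing it to the intrinsic. The helper name `padding_bits_are_zero` is hypothetical, and the sketch assumes the padding bits are the high bits above the lane count (little-endian bit order).

```rust
/// Returns true if every bit of `mask` at position >= `lanes` is zero.
/// Assumption: lanes occupy the low bits, so padding is the high bits.
fn padding_bits_are_zero(mask: u8, lanes: u32) -> bool {
    assert!(lanes <= 8, "a u8 bitmask covers at most 8 lanes");
    // `checked_shr` returns None when shifting by the full width (lanes == 8);
    // in that case there are no padding bits, so treat the remainder as 0.
    mask.checked_shr(lanes).unwrap_or(0) == 0
}
```

A check like this could back a `debug_assert!` at the call site, so that a mask which would be UB under the documented rule is caught in debug builds.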
If there's a motivation to change that, someone should file an issue spelling that out. :)
What should the semantics of the simd_select_bitmask intrinsic be when the bitmask has "out-of-bounds" bits set? This can happen when the vector length is less than 8 and the bitmask is stored as a u8.
[...] u8 to that function. So, we'd have to mask out the extra bits before passing them to simd_select_bitmask.
The portable-simd test suite currently does not hit this case at all.
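Masking out the extra bits could look roughly like the following. This is a sketch under stated assumptions, not the portable-simd implementation: the helper name `clear_out_of_bounds_bits` is hypothetical, and it assumes the valid lanes occupy the low bits of the `u8` (little-endian bit order).

```rust
/// Clears every bit of `mask` at position >= `lanes`, keeping the low `lanes` bits.
/// Assumption: lanes occupy the low bits of the mask.
fn clear_out_of_bounds_bits(mask: u8, lanes: u32) -> u8 {
    assert!(lanes <= 8, "a u8 bitmask covers at most 8 lanes");
    // Build the valid-bit mask in u16 so that lanes == 8 does not overflow the shift.
    let valid = ((1u16 << lanes) - 1) as u8;
    mask & valid
}
```

For a 4-lane vector, `clear_out_of_bounds_bits(0b1111_0101, 4)` yields `0b0000_0101`, which has all-zero padding bits.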