The following .ll lowers to 16 scalar ops with control flow. Adding an intermediate cast to a 32-bit integer behaves as expected (it does not scalarize), but that doesn't seem like it should make a difference. Adding +nontrapping-fptoint fixes it too, but IIUC the result of fptoui is a poison value on overflow, so trying to lower it to something with overflow checking can't work. Transformations may already have been applied to the LLVM IR that are only correct for non-poison, in-bounds values, and they may have mapped out-of-range poison values back in range in a way that dodges the overflow checks.
; llc wasm_float_cast.ll -mtriple=wasm32-unknown--wasm -mattr=+simd128 -o -
define void @test(ptr noalias nocapture noundef readonly %in, ptr noalias nocapture noundef writeonly %out) {
entry:
  %fv.0.copyload = load <16 x float>, ptr %in, align 16
  %conv = fptoui <16 x float> %fv.0.copyload to <16 x i8>
  store <16 x i8> %conv, ptr %out, align 16
  ret void
}
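For comparison, here is a sketch of the intermediate-cast variant mentioned above. The function name `@test_via_i32` and the value names `%fv` and `%wide` are made up for illustration; the only claim from the report is that going through a 32-bit integer avoids the scalarization. For inputs that fit in an i8 the result should match the original, and for out-of-range inputs the original is poison anyway.

```llvm
; llc wasm_float_cast_i32.ll -mtriple=wasm32-unknown--wasm -mattr=+simd128 -o -
; (hypothetical file name; same flags as the reproducer above)
define void @test_via_i32(ptr noalias nocapture noundef readonly %in, ptr noalias nocapture noundef writeonly %out) {
entry:
  %fv = load <16 x float>, ptr %in, align 16
  ; Convert to 32-bit lanes first, then narrow to 8-bit lanes.
  %wide = fptoui <16 x float> %fv to <16 x i32>
  %conv = trunc <16 x i32> %wide to <16 x i8>
  store <16 x i8> %conv, ptr %out, align 16
  ret void
}
```

The other workaround from the report would be to keep the original `@test` unchanged and pass `-mattr=+simd128,+nontrapping-fptoint` to llc instead.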