rust-lang / rustc_codegen_cranelift

Cranelift based backend for rustc
Apache License 2.0

Unimplemented x86 llvm intrinsic vcvtph2ps #1545

Open DGriffin91 opened 1 day ago

DGriffin91 commented 1 day ago

When compiling the half crate, I noticed two warnings about a missing x86 LLVM intrinsic, vcvtph2ps, followed by a trap:

warning: unsupported x86 llvm intrinsic llvm.x86.vcvtph2ps.128; replacing with trap
warning: unsupported x86 llvm intrinsic llvm.x86.vcvtph2ps.256; replacing with trap
trap at Instance { def: Item(DefId(2:23219 ~ core[4322]::core_arch::x86::f16c::_mm_cvtps_ph)), args: [0_i32] } (_ZN4core9core_arch3x864f16c12_mm_cvtps_ph17heb7efe16215ebb1fE): llvm.x86.vcvtps2ph.128

Thanks for all your work on the cranelift backend for rust!

Edit: I guess this might be related to https://github.com/rust-lang/rustc_codegen_cranelift/issues/1461

bjorn3 commented 1 day ago

Support for f16 and f128 is basically non-existent in Cranelift, so I can't implement these intrinsics the regular way, by emulating them with scalar operations. Instead I would have to use inline asm to emit the respective x86 instructions directly, which makes compilation slower because it involves spawning another rustc instance with the LLVM backend to compile the assembly for us. I have some ideas about how to handle this better in the future, but they are not something that will be implemented within a couple of weeks.
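
For readers unfamiliar with the intrinsic, here is a minimal sketch of what vcvtph2ps computes per lane: widening an IEEE 754 half-precision value to f32. It uses only integer bit manipulation and is purely illustrative. The function names `f16_bits_to_f32` and `cvtph2ps_128_sketch` are hypothetical and are not part of rustc_codegen_cranelift or Cranelift, and setting the quiet bit on NaN inputs is an assumption about matching the hardware result.

```rust
/// Widen one IEEE 754 binary16 value, given as raw bits, to f32.
/// Hypothetical helper for illustration; not taken from any real backend.
fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = ((h as u32) & 0x8000) << 16; // move sign to bit 31
    let exp = ((h >> 10) & 0x1f) as u32; // 5-bit biased exponent
    let frac = (h & 0x03ff) as u32; // 10-bit fraction

    let bits = match (exp, frac) {
        // Signed zero.
        (0, 0) => sign,
        // Subnormal half: every such value is a normal f32, so renormalize.
        (0, _) => {
            // Shift until the leading 1 sits in the implicit-bit position (bit 10).
            let shift = frac.leading_zeros() - 21;
            let frac_norm = (frac << shift) & 0x03ff;
            // Rebias: 127 (f32 bias) - 15 (f16 bias) + 1, minus the shift.
            let exp32 = 113 - shift;
            sign | (exp32 << 23) | (frac_norm << 13)
        }
        // Infinity.
        (0x1f, 0) => sign | 0x7f80_0000,
        // NaN: keep the payload and set the quiet bit (assumption, see lead-in).
        (0x1f, _) => sign | 0x7fc0_0000 | (frac << 13),
        // Normal number: rebias the exponent by 127 - 15 = 112.
        _ => sign | ((exp + 112) << 23) | (frac << 13),
    };
    f32::from_bits(bits)
}

/// Lane-wise sketch of the 128-bit form: four halves in, four floats out.
fn cvtph2ps_128_sketch(src: [u16; 4]) -> [f32; 4] {
    src.map(f16_bits_to_f32)
}
```

For example, `cvtph2ps_128_sketch([0x3c00, 0xc000, 0x7bff, 0x0000])` yields 1.0, -2.0, 65504.0 and 0.0. The reverse direction (vcvtps2ph, the instruction named in the trap message) additionally needs rounding and is not sketched here.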

tgross35 commented 1 day ago

(some more info at https://github.com/rust-lang/rustc_codegen_cranelift/issues/1461)

DGriffin91 commented 1 day ago

@bjorn3 Does Cranelift want to eventually support f16 and f128 more directly or is the asm route the plan for the foreseeable future?

bjorn3 commented 1 day ago

Cranelift will probably have to support it eventually, but the asm route will be much quicker.