Open japaric opened 8 years ago
reduced test case:
// copysign.rs
#![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]
extern "rust-intrinsic" {
    fn copysignf32(x: f32, y: f32) -> f32;
}
fn foo(x: f32, y: f32) -> f32 {
    unsafe {
        copysignf32(x, y)
    }
}
#[lang = "copy"]
trait Copy {}
#[lang = "sized"]
trait Sized {}
$ rustc --target nvptx-unknown-unknown --emit asm copysign.rs
LLVM ERROR: Cannot select: 0x7f78480a01e0: f32 = fcopysign 0x7f784809f990, 0x7f784809fbf0
  0x7f784809f990: f32,ch = load<LD4[null(addrspace=101)]> 0x7f7848062a60, TargetExternalSymbol:i32'_ZN3foo3foo17h0f1b025e65774396E_param_0', undef:i32
    0x7f784809f730: i32 = TargetExternalSymbol'_ZN3foo3foo17h0f1b025e65774396E_param_0'
    0x7f784809f860: i32 = undef
  0x7f784809fbf0: f32,ch = load<LD4[null(addrspace=101)]> 0x7f7848062a60, TargetExternalSymbol:i32'_ZN3foo3foo17h0f1b025e65774396E_param_1', undef:i32
    0x7f784809fac0: i32 = TargetExternalSymbol'_ZN3foo3foo17h0f1b025e65774396E_param_1'
    0x7f784809f860: i32 = undef
In function: _ZN3foo3foo17h0f1b025e65774396E
It's not surprising: the PTX backend was written specifically for the kind of code nvcc generates, and thus doesn't support every LLVM feature. The clean long-term solution would be to extend instruction selection upstream. But to get started, one could fork LLVM and hack on it, or write an out-of-tree LLVM pass that expands the problematic intrinsics into normal code. A MIR pass might be even easier.
However, LLVM canonicalizes some code patterns into intrinsic calls. If it does this with intrinsics that the backend doesn't support (seems unlikely but who knows), there's basically no choice but to extend the backend directly.
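To make the "expand the problematic intrinsics to normal code" idea concrete, here is a rough sketch of what such an expansion could compute for copysign on f32. It is written as ordinary Rust with the standard library rather than as an actual LLVM or MIR pass, and the function name `copysign_f32` is made up for illustration. It assumes the usual IEEE 754 binary32 layout, where the top bit is the sign.

```rust
/// Hypothetical software expansion of copysign for f32 (not the compiler's
/// actual lowering): keep the magnitude bits of `x`, take the sign bit of `y`.
fn copysign_f32(x: f32, y: f32) -> f32 {
    // In IEEE 754 binary32 the sign lives in the most significant bit.
    const SIGN_MASK: u32 = 0x8000_0000;

    let magnitude = x.to_bits() & !SIGN_MASK; // clear the sign bit of x
    let sign = y.to_bits() & SIGN_MASK; // isolate the sign bit of y

    f32::from_bits(magnitude | sign)
}
```

A pass doing this would emit the equivalent integer bitcasts and and/or operations, which the PTX backend should have no trouble selecting. In the `#![no_core]` test case above, `to_bits` and `from_bits` are not available, so this only illustrates the arithmetic.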
When trying to compile the core crate for an nvptx target (but you have to fix or bypass #7 first):