MixmasterFresh / rust-on-gpu

A fork of the Rust Language for experimenting with GPU support.
https://www.rust-lang.org

LLVM ERROR: Cannot select #8

Open japaric opened 8 years ago

japaric commented 8 years ago

When trying to compile the core crate for an nvptx target (you have to fix or bypass #7 first):

$ rustc --target nvptx-unknown-unknown libcore/lib.rs
LLVM ERROR: Cannot select: 0x7f1bd76eb5a0: f32 = fcopysign ConstantFP:f32<1.000000e+00>, 0x7f1bd5620080
  0x7f1bd56202e0: f32 = ConstantFP<1.000000e+00>
  0x7f1bd5620080: f32,ch = CopyFromReg 0x7f1bd21274f0, Register:f32 %vreg3
    0x7f1bd6e743b0: f32 = Register %vreg3
In function: _ZN4core3f3250_$LT$impl$u20$core..num..Float$u20$for$u20$f32$GT$6signum17h5c5199388a8660d1E

japaric commented 8 years ago

reduced test case:

// copysign.rs
#![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]

extern "rust-intrinsic" {
    fn copysignf32(x: f32, y: f32) -> f32;
}

fn foo(x: f32, y: f32) -> f32 {
    unsafe {
        copysignf32(x, y)
    }
}

#[lang = "copy"]
trait Copy {}

#[lang = "sized"]
trait Sized {}

$ rustc --target nvptx-unknown-unknown --emit asm copysign.rs
LLVM ERROR: Cannot select: 0x7f78480a01e0: f32 = fcopysign 0x7f784809f990, 0x7f784809fbf0
  0x7f784809f990: f32,ch = load<LD4[null(addrspace=101)]> 0x7f7848062a60, TargetExternalSymbol:i32'_ZN3foo3foo17h0f1b025e65774396E_param_0', undef:i32
    0x7f784809f730: i32 = TargetExternalSymbol'_ZN3foo3foo17h0f1b025e65774396E_param_0'
    0x7f784809f860: i32 = undef
  0x7f784809fbf0: f32,ch = load<LD4[null(addrspace=101)]> 0x7f7848062a60, TargetExternalSymbol:i32'_ZN3foo3foo17h0f1b025e65774396E_param_1', undef:i32
    0x7f784809fac0: i32 = TargetExternalSymbol'_ZN3foo3foo17h0f1b025e65774396E_param_1'
    0x7f784809f860: i32 = undef
In function: _ZN3foo3foo17h0f1b025e65774396E

hanna-kruppe commented 8 years ago

It's not surprising: the PTX backend is written specifically for the kind of code nvcc generates, so it doesn't support every LLVM feature. The clean long-term solution would be to extend instruction selection upstream. But to get started, one could fork LLVM and hack on it, or write an out-of-tree LLVM pass that expands the problematic intrinsics into normal code. A MIR pass might even be easier.

However, LLVM canonicalizes some code patterns into intrinsic calls. If it does this with intrinsics that the backend doesn't support (seems unlikely, but who knows), there's basically no choice but to extend the backend directly.
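
To illustrate what "expanding the problematic intrinsic to normal code" could look like for this particular failure, here is a minimal sketch in plain Rust. It builds copysign from integer bit operations on the IEEE 754 sign bit, so no fcopysign node needs to be selected. The names copysign_bits and signum_via_bits are hypothetical and not part of libcore or this fork, and it is an assumption (not verified here) that the nvptx backend lowers the resulting bitcast/and/or sequence without trouble.

```rust
/// Mask selecting the sign bit of an IEEE 754 binary32 value.
const SIGN_MASK: u32 = 0x8000_0000;

/// Hypothetical helper: returns `x` with the sign of `y`, using only
/// integer bit operations instead of the `copysignf32` intrinsic.
pub fn copysign_bits(x: f32, y: f32) -> f32 {
    let magnitude = x.to_bits() & !SIGN_MASK; // clear the sign bit of x
    let sign = y.to_bits() & SIGN_MASK; // keep only the sign bit of y
    f32::from_bits(magnitude | sign)
}

/// Hypothetical `signum` built on the helper above. NaN is passed
/// through unchanged; everything else becomes 1.0 with the sign of `x`.
pub fn signum_via_bits(x: f32) -> f32 {
    if x.is_nan() {
        x
    } else {
        copysign_bits(1.0, x)
    }
}

fn main() {
    // Small demonstration on the host; the same logic is what a pass
    // (or a patched libcore) would emit in place of the intrinsic call.
    println!("{}", copysign_bits(1.0, -2.0));
    println!("{}", signum_via_bits(-0.0));
}
```

This only sidesteps the intrinsic at the source level. A real fix as described above would either perform the same expansion inside an LLVM or MIR pass, or teach the backend to select fcopysign directly.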