denzp / rust-ptx-linker

The missing puzzle piece for NVPTX experience with Rust
MIT License

Internal Compiler Error when compiling kernel that uses fsin32/fcos32 intrinsics #20

Open bheisler opened 6 years ago

bheisler commented 6 years ago

I added a call to core::intrinsics::fsin32 to the chapter 1 example and used a syncthreads call to ensure that it couldn't be optimized away. The linker produces this error when compiling the modified kernel:

[PTX]    Compiling proxy v0.0.0 (file:///C:/Users/Brook/AppData/Local/Temp/ptx-builder-0.4/chapter_1_kernel/2777747bd38bda2e)
[PTX] error: linking with `ptx-linker` failed: exit code: 1
[PTX]   = note:  [INFO] Going to link 2 bitcode modules and 6 rlibs...
[PTX]
[PTX]           [DEBUG] Linking bitcode: "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder-0.4\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps\\proxy.3wp5o7r8sftxbtji.rcgu.o"
<-- Snip -->
[PTX]           [DEBUG]   - linking archive item: "compiler_builtins-54267958a4f42a84.it3wtu5gavdx124.rcgu.o"
[PTX]            [INFO] Linking with Link Time Optimisation
[PTX]           LLVM ERROR: Cannot select: 0x2f13b7a31f0: f32 = fsin 0x2f13b3b5508
[PTX]             0x2f13b3b5508: f32 = fp_round 0x2f13b7a2968, TargetConstant:i64<0>
[PTX]               0x2f13b7a2968: f64,ch = CopyFromReg 0x2f13bde5b68, Register:f64 %28
[PTX]                 0x2f13b7a2b08: f64 = Register %28
[PTX]               0x2f13b3b5438: i64 = TargetConstant<0>
[PTX]           In function: bilateral_filter

The PTX instruction set does define 32-bit sine and cosine instructions (sin.approx.f32 and cos.approx.f32), so I would expect those to be selected, or at least a clearer error message to be provided.

denzp commented 6 years ago

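Looks like it's more or less intended: as far as I can tell, LLVM's NVPTX backend only selects the approximate sin/cos instructions when "unsafe FP math" is enabled. But I'm not sure whether we should enable that by default.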

bheisler commented 6 years ago

I see. I'd assumed that the sin/cos instructions were compatible, but I guess not. Is it possible to give a clearer error message in this case?

Unfortunately, Rust doesn't seem to provide any reasonable way to pass -ffast-math. That means that users who do want the "unsafe" FP math equivalents cannot do so without resorting to inline assembly.
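
For illustration, the inline-assembly workaround might look roughly like the sketch below. The helper name `approx_sin` is hypothetical and not part of rust-ptx-linker or ptx-builder. The device branch is an untested assumption: it relies on the `reg32` register class accepting `f32` on NVPTX, and on current nightly toolchains it would also need `#![feature(asm_experimental_arch)]` at the crate root. The host branch simply falls back to the standard library so that the same call site builds off-device.

```rust
// Sketch only: `approx_sin` is a hypothetical helper name.
// The nvptx64 branch is untested here and assumes a nightly toolchain with
// `#![feature(asm_experimental_arch)]` enabled at the crate root.

/// Device path: emit the PTX approximate sine instruction directly,
/// bypassing LLVM's instruction selection for the `fsin` node.
#[cfg(target_arch = "nvptx64")]
#[inline]
pub fn approx_sin(x: f32) -> f32 {
    let out: f32;
    unsafe {
        core::arch::asm!(
            "sin.approx.f32 {out}, {x};",
            out = out(reg32) out,
            x = in(reg32) x,
            // No memory access and no side effects, so the call may be optimized freely.
            options(pure, nomem, nostack),
        );
    }
    out
}

/// Host path: use the standard library so the same code builds and runs off-device.
#[cfg(not(target_arch = "nvptx64"))]
#[inline]
pub fn approx_sin(x: f32) -> f32 {
    x.sin()
}
```

A kernel would then call `approx_sin(value)` instead of the intrinsic. Note that `sin.approx.f32` trades accuracy for speed, which is presumably why the backend treats it as "unsafe" FP math.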