Closed tmcdonell closed 8 years ago
That error should correlate with some specific construct in the input file. Could you try to narrow it down please, and attach it directly to this bug?
This function will trigger it:
; ModuleID = 'libdevice.compute_30.10.bc'
target triple = "nvptx-unknown-unknown"
@"$str" = private addrspace(1) constant [11 x i8] c"__CUDA_FTZ\00"
; Function Attrs: alwaysinline inlinehint
define float @__nv_floorf(float %f) #0 {
%call = call i32 @__nvvm_reflect(i8* addrspacecast (i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0) to i8*))
%1 = icmp ne i32 %call, 0
br i1 %1, label %2, label %4
; <label>:2 ; preds = %0
%3 = call float @llvm.nvvm.floor.ftz.f(float %f)
br label %6
; <label>:4 ; preds = %0
%5 = call float @llvm.nvvm.floor.f(float %f)
br label %6
; <label>:6 ; preds = %4, %2
%retval.0 = phi float [ %3, %2 ], [ %5, %4 ]
ret float %retval.0
}
declare i32 @__nvvm_reflect(i8*)
; Function Attrs: nounwind readnone
declare float @llvm.nvvm.floor.ftz.f(float) #1
; Function Attrs: nounwind readnone
declare float @llvm.nvvm.floor.f(float) #1
attributes #0 = { alwaysinline inlinehint }
attributes #1 = { nounwind readnone }
Example pruned down slightly more. This is indeed the first call to __nvvm_reflect in the file.
; ModuleID = 'libdevice.compute_30.10.bc'
target triple = "nvptx-unknown-unknown"
@"$str" = private addrspace(1) constant [11 x i8] c"__CUDA_FTZ\00"
; Function Attrs: alwaysinline inlinehint
define float @__nv_floorf(float %f) #0 {
%call = call i32 @__nvvm_reflect(i8* addrspacecast (i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0) to i8*))
ret float 0x00000000
}
declare i32 @__nvvm_reflect(i8*)
attributes #0 = { alwaysinline inlinehint }
Apologies for not submitting the small test case initially.
Not sure if this will help you, but in the last working version that I could find (CUDA 6.5) that call line instead looked like this:
%1 = call i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0))
%call = call i32 @__nvvm_reflect(i8* %1)
plus:
declare i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)*) #1
The problem is likely the addrspacecast instruction, or rather the corresponding constant expression. The instruction is new in LLVM 3.5.
Should be fixed in llvm-general-pure-3.5.1.0 / llvm-general-3.5.1.2. Please re-open if not.
Everything seems to be working in the new version. Thanks!
I'm getting the following error when attempting moduleAST:
The module that hits this problem is the math library that you need to link against when targeting PTX. I've uploaded a copy here for convenience, but you can also get it by installing the CUDA toolkit available from NVIDIA here (version 7.5).
Here is a small example to trigger the error:
Tested with llvm-general-3.5.1.1 and llvm-general-3.4.6.0.
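For anyone trying to reproduce this, a minimal sketch of such a reproduction might look like the following. It is not the example originally attached to the report. It assumes the libdevice bitcode sits in the current directory under the name taken from the ModuleID above, and that llvm-general exposes withContext, withModuleFromBitcode, File and moduleAST through the ExceptT-based interface of the 3.4/3.5 series (older releases used ErrorT instead).

```haskell
-- Sketch only: load a bitcode file and convert it to the pure Haskell AST.
-- The file name is an assumption taken from the ModuleID in the IR above.
import Control.Monad.Trans.Except (runExceptT)
import LLVM.General.Context (withContext)
import LLVM.General.Module (File (..), moduleAST, withModuleFromBitcode)

main :: IO ()
main = withContext $ \ctx -> do
  -- withModuleFromBitcode parses the bitcode into an LLVM module;
  -- moduleAST then decodes that module into an LLVM.General.AST.Module,
  -- which is the step reported as failing here.
  result <- runExceptT $
    withModuleFromBitcode ctx (File "libdevice.compute_30.10.bc") moduleAST
  case result of
    Left err  -> putStrLn ("could not read bitcode: " ++ err)
    Right ast -> print ast
```

If decoding succeeds, the program prints the AST; on an affected version, the failure would be expected to surface during the moduleAST step rather than while reading the bitcode.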