bscarlet / llvm-general

Rich LLVM bindings for Haskell (with transfer of LLVM IR to and from C++, detailed compilation pass control, etc.)
http://hackage.haskell.org/package/llvm-general

Internal error in Constant.hs #162

Closed: tmcdonell closed 8 years ago

tmcdonell commented 8 years ago

I'm getting the following error when calling moduleAST:

*** Exception: src/LLVM/General/Internal/Constant.hs:(197,13)-(239,14): Non-exhaustive patterns in case

The module that triggers the problem is the math library that you need to link against when targeting PTX. I've uploaded a copy here for convenience, but you can also get it by installing the CUDA toolkit (version 7.5), available from NVIDIA here.

Here is a small example to trigger the error:

import LLVM.General
import LLVM.General.Context
import Control.Monad.Except

exceptError :: Show e => ExceptT e IO a -> IO a
exceptError = either (error . show) return <=< runExceptT

main :: IO ()
main = do
  ll <- readFile "libdevice.compute_30.10.ll"

  withContext                                     $ \ctx -> do
  exceptError $ withModuleFromLLVMAssembly ctx ll $ \mdl -> do
    print =<< moduleAST mdl

Tested with llvm-general-3.5.1.1 and llvm-general-3.4.6.0.

bscarlet commented 8 years ago

That error should correlate with some specific construct in the input file. Could you try to narrow it down please, and attach it directly to this bug?

tmcdonell commented 8 years ago

This function will trigger it:

; ModuleID = 'libdevice.compute_30.10.bc'
target triple = "nvptx-unknown-unknown"

@"$str" = private addrspace(1) constant [11 x i8] c"__CUDA_FTZ\00"

; Function Attrs: alwaysinline inlinehint
define float @__nv_floorf(float %f) #0 {
  %call = call i32 @__nvvm_reflect(i8* addrspacecast (i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0) to i8*))
  %1 = icmp ne i32 %call, 0
  br i1 %1, label %2, label %4

; <label>:2                                       ; preds = %0
  %3 = call float @llvm.nvvm.floor.ftz.f(float %f)
  br label %6

; <label>:4                                       ; preds = %0
  %5 = call float @llvm.nvvm.floor.f(float %f)
  br label %6

; <label>:6                                       ; preds = %4, %2
  %retval.0 = phi float [ %3, %2 ], [ %5, %4 ]
  ret float %retval.0
}

declare i32 @__nvvm_reflect(i8*)

; Function Attrs: nounwind readnone
declare float @llvm.nvvm.floor.ftz.f(float) #1

; Function Attrs: nounwind readnone
declare float @llvm.nvvm.floor.f(float) #1

attributes #1 = { nounwind readnone }

tmcdonell commented 8 years ago

Example pruned down slightly more. This is indeed the first call to __nvvm_reflect in the file.

; ModuleID = 'libdevice.compute_30.10.bc'
target triple = "nvptx-unknown-unknown"

@"$str" = private addrspace(1) constant [11 x i8] c"__CUDA_FTZ\00"

; Function Attrs: alwaysinline inlinehint
define float @__nv_floorf(float %f) #0 {
  %call = call i32 @__nvvm_reflect(i8* addrspacecast (i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0) to i8*))

  ret float 0x00000000
}

declare i32 @__nvvm_reflect(i8*)

Apologies for not submitting the small test case initially.

tmcdonell commented 8 years ago

Not sure if this will help you, but in the last working version that I could find (CUDA 6.5) that call line instead looked like this:

  %1 = call i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)* getelementptr inbounds ([11 x i8] addrspace(1)* @"$str", i32 0, i32 0))
  %call = call i32 @__nvvm_reflect(i8* %1)

plus:

declare i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)*) #1

bscarlet commented 8 years ago

The problem is likely the addrspacecast instruction, or rather the corresponding constant expression. The instruction is new in LLVM 3.5.
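As a rough illustration of this failure mode (not the actual code in Constant.hs), here is a minimal sketch. The `Opcode` type and the `decodeOld` and `decodeNew` functions are hypothetical names made up for the example. It assumes the decoder is a `case` over constant-expression opcodes that was written before `addrspacecast` existed, so the new opcode falls through with no matching alternative.

```haskell
import Control.Exception (PatternMatchFail, evaluate, try)

-- Hypothetical, heavily simplified stand-in for the opcode of a
-- constant expression; the real library covers many more cases.
data Opcode = GetElementPtr | BitCast | AddrSpaceCast
  deriving (Show, Eq)

-- Decoder written before AddrSpaceCast existed: there is no
-- alternative for it, so the case expression is non-exhaustive.
decodeOld :: Opcode -> String
decodeOld op = case op of
  GetElementPtr -> "getelementptr"
  BitCast       -> "bitcast"

-- Decoder with the missing alternative added.
decodeNew :: Opcode -> String
decodeNew op = case op of
  GetElementPtr -> "getelementptr"
  BitCast       -> "bitcast"
  AddrSpaceCast -> "addrspacecast"

main :: IO ()
main = do
  -- Forcing the result of the old decoder throws PatternMatchFail,
  -- which GHC reports as "Non-exhaustive patterns in case".
  r <- try (evaluate (decodeOld AddrSpaceCast))
         :: IO (Either PatternMatchFail String)
  case r of
    Left _  -> putStrLn "old decoder: pattern match failure"
    Right s -> putStrLn s
  -- The new decoder handles the opcode.
  putStrLn (decodeNew AddrSpaceCast)
```

Compiling with `-Wall` (or `-Wincomplete-patterns`) makes GHC flag the missing alternative in `decodeOld` at compile time, although that only helps when the scrutinee is an algebraic data type rather than a raw numeric opcode.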

bscarlet commented 8 years ago

Should be fixed in llvm-general-pure-3.5.1.0 / llvm-general-3.5.1.2. Please re-open if not.

tmcdonell commented 8 years ago

Everything seems to be working in the new version. Thanks!