Subtly incorrect logic in `Text.LLVM.call` ?

alpmestan commented 2 months ago

Hello,

First off, thanks a lot for this package.

I stumbled upon some odd behaviour (possibly related to https://github.com/GaloisInc/llvm-pretty/issues/102#issuecomment-1478073111 ?). When I try to call a function that I either declared or defined in the same LLVM monad computation, using the Text.LLVM.call function, the function type is printed with an excessive pointer, see below.

mod1 = do
  printf <- L.declare (L.iT 32) "printf" [L.ptrT (L.iT 8)] True
  fmtStr <- L.string "d_fmt" "%lf\n\0"
  L.define L.emptyFunAttrs (L.iT 32) "main" () $ do
    _ <- L.call printf [fmtStr, doubleT L.-: L.toValue (pi :: Double)]
    L.ret (L.iT 32 L.-: (0 :: Int32))

This results in:

# test1.ll
target triple = "aarch64-apple-macosx-unknown-"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
@d_fmt = constant [5 x i8] [i8 37, i8 108, i8 102, i8 10, i8 0]
declare i32 @printf(i8*, ...)
define i32 @main() {
  %r0 = call i32(i8*, ...)* @printf([5 x i8]* @d_fmt,
                                    double 3.141592653589793)
  ret i32 0
}

Notice the * at the end of i32(i8*, ...)*, which shouldn't be there. This IR, when executed with lli:

$ lli test1.ll
0.000000

If I edit the IR manually to remove that extra pointer in the type, and try to run again:

# test2.ll
target triple = "aarch64-apple-macosx-unknown-"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
@d_fmt = constant [5 x i8] [i8 37, i8 108, i8 102, i8 10, i8 0]
declare i32 @printf(i8*, ...)
define i32 @main() {
  %r0 = call i32 (i8*, ...) @printf([5 x i8]* @d_fmt,
                                    double 3.141592653589793)
  ret i32 0
}

$ lli test2.ll
3.141593

I tracked down the issue to the definition of call:

call :: IsValue a => Typed a -> [Typed Value] -> BB (Typed Value)
call sym vs = case typedType sym of
  ty@(PtrTo (FunTy rty _ _)) -> observe rty (Call False ty (toValue sym) vs)
  _                          -> error "invalid function type given to call"

If I instead define and use the following function in place of Text.LLVM.call:

kall :: L.IsValue a => L.Typed a -> [L.Typed L.Value] -> L.BB (L.Typed L.Value)
kall sym vs = case L.typedType sym of
  L.PtrTo ty@(L.FunTy rty _ _) -> L.observe rty (L.Call False ty (L.toValue sym) vs)
  _                          -> error "invalid function type given to call"

then I get the correct IR (the one in test2.ll) and some digits of pi get printed, like with test2.ll.

Am I somehow misusing define/declare/call or should call be patched?

Cheers

kquick commented 2 months ago

Thanks for the report. As noted in our README, LLVM instructions and syntax can change with different versions, and indeed this seems to have been a change that started in LLVM 14 and was finalized in LLVM17.

I embedded your sample in the following test harness (targeting my local machine's triple, although that's just for lli needs and should not be relevant to this issue):

{-# LANGUAGE OverloadedStrings #-}

import qualified Text.LLVM as L
import qualified Text.LLVM.Triple.AST as L3
import qualified Text.LLVM.PP as LP
import Data.Int
import Text.Read ( readMaybe )
import System.Environment ( getArgs )

mod1 :: L.LLVM (L.Typed L.Value)
mod1 = do
  printf <- L.declare (L.iT 32) "printf" [L.ptrT (L.iT 8)] True
  fmtStr <- L.string "d_fmt" "%lf\n\0"
  L.define L.emptyFunAttrs (L.iT 32) "main" () $ do
    _ <- L.call printf [fmtStr, doubleT L.-: L.toValue (pi :: Double)]
    L.ret (L.iT 32 L.-: (0 :: Int32))

doubleT :: L.Type
doubleT = L.PrimType $ L.FloatType L.Double

main = do args <- getArgs
          let llvmver = if length args < 1
                        then LP.llvmVlatest
                        else case (readMaybe $ args !! 0) :: Maybe Int of
                               Just n -> n
                               Nothing -> LP.llvmVlatest
          let tgtTriple = L3.TargetTriple
                          {
                            L3.ttArch = L3.X86_64
                          , L3.ttSubArch = mempty
                          , L3.ttVendor = L3.PC
                          , L3.ttOS = L3.Linux
                          , L3.ttEnv = L3.GNU
                          , L3.ttObjFmt = L3.ELF
                          }
          let llvmMod = (snd $ L.runLLVM mod1) { L.modTriple = tgtTriple }
          -- putStrLn $ show llvmMod
          putStrLn $ "---- for LLVM " <> show llvmver
          let ll = show $ LP.ppLLVM llvmver $ LP.llvmPP llvmMod
          -- putStrLn ll
          writeFile "calltest.ll" ll
          putStrLn "wrote calltest.ll"

This matches your first form (no trailing * for the call fnty argument) and I get the following results:

$ for X in $(seq 12 18) ; do echo LLVM$X:: ; runhaskell calltest.hs ; nix shell nixpkgs#llvm_$X -c lli alp.ll ; done
LLVM12::
---- for LLVM 12
wrote calltest.ll
lli: calltest.ll:5:29: error: '@printf' defined with type 'i32 (i8*, ...)*' but expected 'i32 (i8*, ..
.)* ([5 x i8]*, double)*'
  %r0 = call i32(i8*, ...)* @printf([5 x i8]* @d_fmt,
                            ^
LLVM13::
---- for LLVM 13
wrote calltest.ll
lli: lli: calltest.ll:5:29: error: '@printf' defined with type 'i32 (i8*, ...)*' but expected 'i32 (i8
*, ...)* ([5 x i8]*, double)*'
  %r0 = call i32(i8*, ...)* @printf([5 x i8]* @d_fmt,
                            ^

LLVM14::
---- for LLVM 14
wrote calltest.ll
lli: lli: calltest.ll:5:29: error: '@printf' defined with type 'i32 (i8*, ...)*' but expected 'i32 (i8
*, ...)* ([5 x i8]*, double)*'
  %r0 = call i32(i8*, ...)* @printf([5 x i8]* @d_fmt,
                            ^

LLVM15::
---- for LLVM 15
wrote calltest.ll
lli: lli: calltest.ll:5:29: error: '@printf' defined with type 'i32 (i8*, ...)*' but expected 'i32 (i8
*, ...)* ([5 x i8]*, double)*'
  %r0 = call i32(i8*, ...)* @printf([5 x i8]* @d_fmt,
                            ^

LLVM16::
---- for LLVM 16
wrote calltest.ll
3.141593
LLVM17::
---- for LLVM 17
wrote calltest.ll
0.000000
LLVM18::
---- for LLVM 18
wrote calltest.ll
0.000000
$

If I then modify the code to remove the trailing * as you did, I get the following:

$ for X in $(seq 16 18) ; do echo LLVM$X:: ; runhaskell calltest.hs ; nix shell nixpkgs#llvm_$X -c lli calltest.ll ; done
LLVM16::
---- for LLVM 16
wrote calltest.ll
3.141593
LLVM17::
---- for LLVM 17
wrote calltest.ll
3.141593
LLVM18::
---- for LLVM 18
wrote calltest.ll
3.141593
$

And indeed, in https://releases.llvm.org/17.0.1/docs/LangRef.html#call-instruction the examples dropped the *, but there's no corresponding description changes for the call instruction. The * is also not present in LLVM 16 docs (https://releases.llvm.org/16.0.0/docs/LangRef.html#call-instruction), where both forms seem to work, or the LLVM 15 docs, but it is present in LLVM 14 (https://releases.llvm.org/14.0.0/docs/LangRef.html#call-instruction).

I suspect that the removal of the pointer indirection for function types corresponds to the shift to Opaque Pointers in LLVM, where pointers no longer carry type information (https://llvm.org/devmtg/2022-04-03/slides/keynote.Opaque.Pointers.Are.Coming.pdf).

At this point, I think the right path forward is to update the type returned by decFunType (in Text.LLVM.AST) to remove the PtrTo indirection... but under what circumstances? The caller would need to indicate the desired LLVM version at AST construction time rather than pretty-printing time, or else we need a more flexible internal representation of these types that allows for version-specific handling at the right points.

For the moment, if you are targeting LLVM 17 (or potentially LLVM 15+), you can use your kall override or modify the way you declare the "printf" function, although that might have other effects.

alpmestan commented 2 months ago

Yeah for my use case I'm most definitely happy to just run with kall, no worries there, especially given it's (for now) a personal project.

I'm sorry this sent you down some very deep rabbit hole, but am very thankful for the detailed exploration.

At this point, I think the right path forward is to update the type returned by decFunType (in Text.LLVM.AST) to remove the PtrTo indirection... but under what circumstances? The caller would need to indicate the desired LLVM version at AST construction time rather than pretty-printing time, or else we need a more flexible internal representation of these types that allows for version-specific handling at the right points.

Yeah the fact that "the right (LLVM) type" is dependent on the LLVM version is quite messed up. But it also seems tricky to just hand the problem over to all users either (including future me who may have forgotten about this in 6 months), as many will likely not know about those changes.

For what it's worth, my personal preference probably goes to allowing the library be a bit closer to "what the user thinks and means" when it comes to codegen. Most of the time we're just declaring, defining and calling things, so I'd probably be happy if the library could be made a bit more flexible so as to easily encode what I want while retaining all the info it needs to pretty-print the right incantation later, once the LLVM version is known.

(The first solution sounds to me like it may be a bit too confusing for users. "What do you mean I need to give the LLVM version to this function, in the middle of the codegen logic?")

RyanGlScott commented 2 months ago

I disagree that we need to do something differently based on the LLVM version here. Rather, I think using call i32(i8*, ...)* @printf(...) was always an error. As evidence, note that if you compile this program with any version of Clang:

#include <stdio.h>

int main(void) {
  printf("%d\n", 42);
  return 0;
}

Then you'll get something that looks like:

$ ~/Software/clang+llvm-10.0.0/bin/clang -emit-llvm -S test.c -o test.ll
ryanscott at pop-os in ~/Documents/Hacking/C
$ cat test.ll

<snip>

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 42)
  ret i32 0
}

Note that call i32 (i8*, ...) @printf(...) might be call i32 (ptr, ...) @printf(...) if you are using an LLVM version that supports opaque pointers, but that's not particularly important here. What is important is that regardless of what LLVM version you use, the argument type will not be (i8*, ...)* (or ptr). If I try to change call i32 (i8*, ...) @printf(...) to call i32 (i8*, ...)* @printf(...) as re-assemble the program, I get the following error:

$ ~/Software/clang+llvm-10.0.0/bin/llvm-as test.ll -o test.bc
/home/ryanscott/Software/clang+llvm-10.0.0/bin/llvm-as: test.ll:12:29: error: invalid forward reference to function 'printf' with wrong type: expected 'i32 (i8*, ...)*' but was 'i32 (i8*, ...)* (i8*, i32)*'
  %2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 42)
                            ^

Therefore, I think the inclusion of the extra pointer type in the list of examples in https://releases.llvm.org/14.0.0/docs/LangRef.html#call-instruction was simply an oversight (that has since been fixed). As such, I think we should just remove the trailing * from the list of argument types.

kquick commented 2 months ago

The documentation shows the * back through at least v2.6, so it's been there for quite some time, but it's also just the documentation. I was concerned that the pointer version may have been the only acceptable form for an older LLVM, but I also verified that back through at least LLVM 5 it supports the non-pointer form, so I'd agree we don't need to make this version specific and can just update call per the original report.

alpmestan commented 2 months ago

Patch in #145 - review/feedback welcome :)

RyanGlScott commented 2 months ago

Fixed in #145.

GaloisInc / llvm-pretty

Subtly incorrect logic in `Text.LLVM.call` ? #144