Open jfaure opened 3 years ago
This is actually very straightforward to do, because LLVM exposes this through the C API: https://llvm.org/doxygen/group__LLVMCTarget.html
However, I don't quite understand exactly what you mean by "not being able to influence code generation". Do you have an example?
Allocating a tagged union: you want to allocate enough space for the biggest member (this cannot reliably be achieved by inspecting the llvm-hs types, which don't know the pointer size and alignment details).
Ah, I see. I was only thinking about datatypes that LLVM provides, but a tagged union is indeed tricky. You would need to iterate over the union types and statically determine which is largest. Using gep only works if you can define the type using the builtin types already provided by LLVM!
Sounds like the use case is maximumBy (comparing getTypeAllocSize), which we can add to the tests to make sure that that functionality continues to work. I'll see if I can add getPointerSize and getTypeAllocSize to the FFI!
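To make the tagged-union use case concrete, here is a minimal sketch of the maximumBy (comparing getTypeAllocSize) idea. It does not call llvm-hs: Variant, largestVariant and payloadSize are hypothetical names, and the sizes are assumed to have been obtained already from the data layout.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)
import Data.Word (Word64)

-- Hypothetical stand-in for one member of a tagged union. The size is
-- assumed to have been queried from the target's data layout beforehand.
data Variant = Variant
  { variantName      :: String
  , variantAllocSize :: Word64 -- allocation size in bytes
  } deriving (Show, Eq)

-- Pick the member that needs the most storage; Nothing for an empty union.
largestVariant :: [Variant] -> Maybe Variant
largestVariant [] = Nothing
largestVariant vs = Just (maximumBy (comparing variantAllocSize) vs)

-- Number of bytes to reserve for the payload of the tagged union.
payloadSize :: [Variant] -> Word64
payloadSize = maybe 0 variantAllocSize . largestVariant
```

With members of 1, 16 and 8 bytes, payloadSize returns 16, which is the space the union's payload field would need next to the tag.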
A couple other use-cases:
I also just ran into the issue of needing DataLayout functionality.
Here's the C code I'm trying to port:
#define DESIRED_NUM_KEYS \
    (((BLOCK_SIZE > sizeof(struct node_data)) \
        ? BLOCK_SIZE - sizeof(struct node_data) \
        : 0) / sizeof(value))
#define NUM_KEYS (DESIRED_NUM_KEYS > 3 ? DESIRED_NUM_KEYS : 3)

typedef struct node
{
    node_type type;
    struct node_data meta;
    value values[NUM_KEYS];
} node;
Basically: I need "sizeof" so that I can use that result during codegen to determine the length of an array in some other type.
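For illustration, the macro arithmetic above could be mirrored on the Haskell side roughly as follows, assuming the two sizes are already known as plain Word64 values (for example from a DataLayout query). The names desiredNumKeys and numKeys and the parameter names are hypothetical; this is a sketch of the computation only, not of how to obtain the sizes.

```haskell
import Data.Word (Word64)

-- Mirrors DESIRED_NUM_KEYS: how many values fit in a block after the
-- node metadata. Arguments are sizes in bytes; valueSize must be non-zero,
-- exactly as sizeof(value) must be in the C macro.
desiredNumKeys :: Word64 -> Word64 -> Word64 -> Word64
desiredNumKeys blockSize nodeDataSize valueSize =
  available `div` valueSize
  where
    -- Guard against unsigned underflow when the metadata exceeds the block.
    available
      | blockSize > nodeDataSize = blockSize - nodeDataSize
      | otherwise                = 0

-- Mirrors NUM_KEYS: never go below three keys per node.
numKeys :: Word64 -> Word64 -> Word64 -> Word64
numKeys blockSize nodeDataSize valueSize =
  max 3 (desiredNumKeys blockSize nodeDataSize valueSize)
```

For instance, numKeys 256 32 8 evaluates to 28, and a Word64 result of this kind is what the ArrayType constructor expects as its element count.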
I added this as a sort of hack some time back: https://hackage.haskell.org/package/llvm-hs-pure-9.0.0/docs/LLVM-AST-Constant.html#v:sizeof
@jfaure I don't think that works? ArrayType requires a Word64 for size: https://hackage.haskell.org/package/llvm-hs-pure-9.0.0/docs/LLVM-AST-Type.html#t:Type
That's no problem; it wraps the type with some LLVM instructions. The size won't be available to you the way it would be with DataLayout, but you can use it in the emitted LLVM, where it will hopefully be constant-folded.
@jfaure How then? It just doesn't typecheck. Also, there's no function to go from Constant to Word64, and the other functions in that module are partial and would error out if I tried converting that way.
I would prefer defining all my types using the typedef function, which uses the LLVM.AST.Type I mentioned earlier. For this I think the only way is with DataLayout.
BTW: here's what I tried:
experiment :: ModuleBuilder ()
experiment = do
  s <- typedef "struct_t" $ Just $ StructureType False [i8, i64]
  let x = Constant.sizeof s
  let a = ArrayType x i32 -- Couldn't match expected type 'Word64' with 'Constant'
  -- ...
@andrew-wja I'm not familiar with the codebase, but if you give me some high-level pointers on how best to approach this, I can try giving it a shot.
@luc-tielen I understand what you want to do, but I don't think it's possible with llvm-hs right now, so you're correct to post under this issue.
LLVM wants you to pass an integer to the ArrayType constructor, even in C++: https://llvm.org/doxygen/classllvm_1_1ArrayType.html#adf411edc4f135b570ab218079474ce77
So you really do need to ask libLLVM through an IO operation what the size of the laid-out struct type is.
Right now, it isn't possible using llvm-hs to construct any type that depends on IR-level values. It might be possible to work around this in your code generation. For example, you can use alloca to allocate an array with an IR-level Operand element count. In this case that's not very appealing, though.
This got me a little further for my specific case (it looks like some datalayout functionality is exposed in internals?):
experiment :: ModuleBuilderT IO ()
experiment = do
  s <- typedef "struct_t" $ Just $ StructureType False [i8, i64]
  size <- liftIO $ do
    s' <- Context.withContext $ flip runEncodeAST $ encodeM s
    let dl = defaultDataLayout LittleEndian
    DL.withFFIDataLayout dl $ flip DL.getTypeAllocSize s'
  liftIO $ print ("size =", size)
This snippet works if you use i8 or any of the other builtin types instead of s in the encodeM function, but with my custom struct I get EncodeException "reference to undefined type: Name \"struct_t\"".
If I could get an up-to-date DataLayout inside the ModuleBuilder monad (like for example a currentDatalayout helper function), my problem would be fixed?
@luc-tielen mixing and matching between the high-level llvm-hs-pure and low-level llvm-hs FFI interface directly in this way is uncharted territory, but it makes sense that builtin types should always be visible.
I think what is happening is that the explicit runEncodeAST is blowing away the local encode state, so the type definition is no longer visible. If you look at what happens in the EncodeM instance for Type, specifically for NamedTypeReference, we end up calling lookupNamedType. However, if you look at the definition of runEncodeAST, it creates a new, empty encode state.
runEncodeAST is designed to be the top-level entry point to the encoding, but your code snippet is calling it inside a module builder context. I think if you add a runEncodeAST' which takes an existing encode state as a parameter and extends it, rather than running the AST encoding in a fresh encode state, that should solve your problem.
@andrew-wja I tried my hand at it today, but a fix is non-obvious (at least to me).
The IR / Module builder monad keeps the definitions hidden internally. You can extract them if you make a variant of runEncodeAST that basically runs in ModuleBuilderT IO a, but then I tried reusing some other functionality and got stuck with a cycle in my imports.
Did another attempt today:
experiment :: ModuleBuilderT IO ()
experiment = do
  let n = "struct_t"
      ty = StructureType False [i8, i64]
  s <- typedef n $ Just ty
  size <- liftIO $ do
    withHostTargetMachine PIC JITDefault None $ \tm -> do
      dl <- getTargetMachineDataLayout tm
      Context.withContext $ flip runEncodeAST $ do
        -- Register the named struct in this fresh encode state first,
        -- so that encodeM can resolve the reference to "struct_t".
        createType n ty
        s' <- encodeM s
        liftIO $ DL.withFFIDataLayout dl $ flip DL.getTypeAllocSize s'
  liftIO $ print ("size = ", size)

createType :: Name -> Type -> EncodeAST ()
createType n ty = do
  (t', n') <- createNamedType n
  defineType n n' t'
  setNamedType t' ty
This prints out size = 16 for me. Not obvious at all, but it works. Now I need to refactor it and figure out a way to nicely integrate it in my compiler :sweat_smile:.
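As a sanity check on that 16, the usual padding rule for a non-packed struct can be sketched in plain Haskell. This is a simplified model and not what LLVM's DataLayout actually runs: Member, alignTo and structAllocSize are hypothetical names, and the sketch assumes i8 is 1 byte with 1-byte alignment and i64 is 8 bytes with 8-byte alignment, as on a typical 64-bit host.

```haskell
import Data.Word (Word64)

-- (size, ABI alignment) of one struct member, both in bytes. These values
-- are assumptions that would normally come from the target's data layout.
type Member = (Word64, Word64)

-- Round an offset up to the next multiple of the given alignment.
alignTo :: Word64 -> Word64 -> Word64
alignTo align offset = ((offset + align - 1) `div` align) * align

-- Allocation size of a non-packed struct: place each member at the next
-- offset satisfying its alignment, then pad the end to the struct alignment
-- (the largest member alignment).
structAllocSize :: [Member] -> Word64
structAllocSize members = alignTo structAlign end
  where
    structAlign = maximum (1 : map snd members)
    end = foldl place 0 members
    place offset (size, align) = alignTo align offset + size
```

Under those assumptions, structAllocSize [(1, 1), (8, 8)] gives 16: the i8 sits at offset 0, seven bytes of padding follow, and the i64 occupies offsets 8 to 15. The real answer still has to come from getTypeAllocSize, since alignments differ between targets.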
It is realistically necessary to be able to use getPointerSize and getTypeAllocSize (at least) from https://llvm.org/doxygen/classllvm_1_1DataLayout.html
The "sizeof as a gep to the end of a null pointer" hack has the disadvantage of not being able to influence code generation.
getTypeAllocSize in particular requires us to pre-convert llvm-hs types to C++, so this may be tricky.