GaloisInc / llvm-pretty-bc-parser

Parser for the llvm bitcode format

llvm-hs question #160

Open andrew-wja opened 3 years ago

andrew-wja commented 3 years ago

Hi all,

I'm a new maintainer of llvm-hs, the evolution of the old llvm-general project. While discussing switching the GHC LLVM backend to llvm-hs, it was pointed out that llvm-pretty exists, and that while it doesn't provide full coverage of LLVM IR constructs, a big advantage of llvm-pretty is that this project (a pure Haskell bitcode reader) exists, which means bitcode loading doesn't have to require linking with libLLVM.

While llvm-hs has a pure Haskell AST and a pure Haskell LLVM assembly printer, it does not have a pure Haskell method of ingesting LLVM IR in either bitcode or assembly format. For that, we use libLLVM via the FFI, and lift the C++ data structures to the pure Haskell AST with an encoder/decoder monad.

I'm just wondering if there would be any interest in porting this parser so that it parses to the llvm-hs-pure AST. Like llvm-pretty's AST, this AST is pure Haskell. Unlike llvm-pretty's, it maps one-to-one to the LLVM C++ AST and is a complete mapping of the C++ system of datatypes (to the point that every entity in our Haddock has a link to the corresponding entity in the LLVM doxygen). This allows someone who is familiar with LLVM to immediately start working with llvm-hs. We have all the same monadic code generation machinery in llvm-hs-pure that llvm-pretty has, to the point where I've been wondering if the author of llvm-pretty was aware of the existence of llvm-hs-pure!

It might aid the development of this parser if it were trivially easy to compare your parsed AST with the AST produced by LLVM's own bitcode loader. Similarly, it might be useful to lower the parsed AST to bitcode via the native LLVM bitcode writer and check that it round-trips correctly. llvm-hs offers both of those capabilities in one line of code.

It seems like joining forces may be beneficial to both projects!

atomb commented 3 years ago

That's a very intriguing possibility. It does seem as though llvm-hs-pure may have more complete coverage of the LLVM language, and as you mention it lines up directly with the C++ AST.

There are two possible downsides, as I see it, though they may not be showstoppers:

I think the bottom line is that I'm not sure any of the people currently working on llvm-pretty-bc-parser would be likely to have time to port it over to use the llvm-hs AST, but I can imagine we might seriously consider a PR. We'd probably need to assess the difficulty in migrating crucible-llvm before committing, though.