numba / llvmlite

A lightweight LLVM python binding for writing JIT compilers
https://llvmlite.pydata.org/
BSD 2-Clause "Simplified" License

Allow llvmlite to also link object code #311

Open certik opened 6 years ago

certik commented 6 years ago

Currently llvmlite can produce the LLVM IR source code by:

str(module)
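
For illustration, module here could be a minimal llvmlite.ir module along these lines (a hypothetical example, roughly equivalent to int main() { return 0; }):

    from llvmlite import ir

    # Hypothetical minimal module: int main() { return 0; }
    module = ir.Module(name="example")
    fnty = ir.FunctionType(ir.IntType(32), [])
    main_fn = ir.Function(module, fnty, name="main")
    builder = ir.IRBuilder(main_fn.append_basic_block(name="entry"))
    builder.ret(ir.Constant(ir.IntType(32), 0))

    print(str(module))  # textual LLVM IR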

It can also read this IR source back and generate a machine-code object file using:

    import llvmlite.binding as llvm

    llvm.initialize()
    llvm.initialize_native_asmprinter()
    llvm.initialize_native_target()
    target = llvm.Target.from_triple(module.triple)
    target_machine = target.create_target_machine()
    # Round-trip the IR through the binding layer and verify it.
    mod = llvm.parse_assembly(str(module))
    mod.verify()
    # Emit a native object file for the target machine.
    with open("%s.o" % basename, "wb") as o:
        o.write(target_machine.emit_object(mod))

But it seems llvmlite does not have the functionality to actually link these object files into an executable. One has to do it by hand, e.g. using gcc:

gcc -o a.out file.o

or clang:

clang -o a.out file.o

or using ld directly, though the exact invocation of ld is platform-dependent (one has to link the C library manually, as well as crt1.o). One can pass the -v option to either gcc or clang to figure out the platform-dependent ld command line; e.g. on my machine it is:

 "/usr/bin/ld" -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o c /usr/lib/gcc/x86_64-linux-gnu/6.1.1/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/6.1.1/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/6.1.1/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/6.1.1 -L/usr/lib/gcc/x86_64-linux-gnu/6.1.1/../../../x86_64-linux-gnu -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/6.1.1/../../.. -L/home/certik/repos/spack/opt/spack/linux-ubuntu16.04-x86_64/gcc-7.2.0/llvm-5.0.0-vulzawogiwpyst64drjcp5wxgl5inldr/bin/../lib -L/lib -L/usr/lib expr2.o -L/home/certik/repos/spack/opt/spack/linux-ubuntu16.04-x86_64/gcc-7.2.0/llvm-5.0.0-vulzawogiwpyst64drjcp5wxgl5inldr/lib -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/6.1.1/crtend.o /usr/lib/gcc/x86_64-linux-gnu/6.1.1/../../../x86_64-linux-gnu/crtn.o

However, since clang knows how to link object files into an executable, there must be a way to do the linking from C++; it might not be well exposed, but at least the clang driver must know how to do it.

It would be nice to expose this from llvmlite.

Alternatively, if that is out of scope for llvmlite, why does llvmlite have the functionality to produce object files in the first place, if they can't be linked? It seems the object file is not needed for the JIT use case (for Numba). So if the philosophy behind llvmlite is to just expose the JIT, then it doesn't need to emit object files at all.

I personally think that if there were a way to expose the linking from llvmlite, one could write a full compiler using llvmlite only (e.g. conda install llvmlite), which would be very nice. As it is now, one can generate object files, but the final step still requires either gcc or clang to be installed to do the linking.

seibert commented 6 years ago

llvmlite is also used by Numba to produce shared libraries for the ahead-of-time compilation mode. Some of the linking logic is buried inside of Numba in places like this:

https://github.com/numba/numba/blob/a4e6d6689d11ddad4125a01c4e4ad19bc69c5759/numba/pycc/compiler.py

We can certainly look at moving more linking support out of Numba and into llvmlite. We have never tried to make standalone executables with Numba or llvmlite, so there may be some other subtleties to worry about (especially on multiple platforms).

certik commented 6 years ago

@seibert if you link a shared library, I think you have exactly the same problem, don't you? You have to link the object files somehow. How do you currently do it? I wasn't able to find the actual linking in the file you posted.

alendit commented 6 years ago

@certik this is surprisingly tricky using MCJIT, since the API isn't exposed. This becomes straightforward with ORC, which is the new JIT for LLVM >= 6. #245 is relevant here.

matthieugouel commented 6 years ago

Hi! I'm considering building a compiler using llvmlite, but I'm trying not to depend on gcc/ld or clang for the linking step. I don't know whether that's really feasible with llvmlite right now, or whether there is a way to do the linking via another Python package. Do you have an idea how to achieve this without depending on an external executable?

alendit commented 6 years ago

Hi,

Your best options right now are either to compile a shared library or to emit LLVM bitcode files and link them with the IR linker.

Cheers, Dimitri.

matthieugouel commented 6 years ago

Thanks for your response! When you say IR linker, do you mean llc? To my understanding it's not bound by llvmlite, so I would have to install it separately, right?

alendit commented 6 years ago

Hi, I mean this method https://llvmlite.readthedocs.io/en/latest/user-guide/binding/modules.html with which you can link IR modules.

matthieugouel commented 6 years ago

Hi, I think maybe my question wasn't very clear, so I apologize. What I would like is to be independent of an external binary (gcc, clang, ...) for converting an object file into an executable (i.e. doing gcc -o a.out gen_obj_by_llvm.o, but without gcc).

There is a similar question here: https://stackoverflow.com/questions/44709751/produce-binarycode-from-ir-generate-from-llvmlite, but I was wondering whether the answer has changed since then.

If that's not possible with llvmlite or any other Python package, I will have to make gcc, for instance, a requirement for using my compiler, "just" for the last step of converting an object file generated via llvmlite into a usable executable.

Cheers, Matthieu

alendit commented 6 years ago

Hi,

What format are your object files in? If it's LLVM bitcode, then you can load the modules and link them in using llvmlite (parse_bitcode and link_in). If you emit modules in ELF format, you'd need shared libraries, which you can link with load_library_permanently. If you have static ELF object files, you can't link them with llvmlite directly right now.
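
For the bitcode route, a minimal sketch of what that could look like (main_ir and helper_bitcode are hypothetical placeholders for your own IR text and bitcode bytes):

    import llvmlite.binding as llvm

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    # Parse one module from textual IR and one from bitcode.
    main_mod = llvm.parse_assembly(main_ir)
    helper_mod = llvm.parse_bitcode(helper_bitcode)

    # Link the helper module into the main one; the helper module is consumed.
    main_mod.link_in(helper_mod)
    main_mod.verify()

    # External symbols can still be resolved from an existing shared library.
    llvm.load_library_permanently("libm.so.6")  # example name, platform-specific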

Cheers.

matthieugouel commented 6 years ago

OK, I think I get it. At first I have an llvmlite module generated by my compiler. If at this step I have multiple modules, I can use link_in to bundle them into one module.

Then I want to convert that module into an executable, so I have two choices: either turn it into a shared library or into a static ELF object file (with emit_object).

For my case I think unfortunately the best way is to emit an ELF object and then use the target machine's standard linker to turn that object into an executable. I'm dreaming, but it would be great to have an emit_binary that converts a module into an executable using a binding to lld or something; I suspect it's more complicated than I imagine, though :D
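
Roughly, that workflow could look like this (a sketch assuming a C compiler driver such as cc is on the PATH to supply the platform-specific startup files and C library; llvm_ir is a hypothetical placeholder for the IR of a module defining main):

    import subprocess
    import llvmlite.binding as llvm

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    target_machine = llvm.Target.from_default_triple().create_target_machine()

    mod = llvm.parse_assembly(llvm_ir)  # llvm_ir: hypothetical IR defining main()
    mod.verify()

    # Emit a native object file...
    with open("program.o", "wb") as f:
        f.write(target_machine.emit_object(mod))

    # ...and let the system compiler driver do the final, platform-specific link.
    subprocess.check_call(["cc", "-o", "program", "program.o"])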

Thanks for your patience, Matthieu.

certik commented 5 years ago

I just found out that what I want in this issue is already implemented as the LLVM linker lld: "You can embed LLD to your program to eliminate dependency to external linkers. All you have to do is to construct object files and command line arguments just like you would do to invoke an external linker and then call the linker’s main function, lld::elf::link, from your code."

So all that is needed is to expose the function lld::elf::link in llvmlite and that should be it.

Here is an example of how to use lld::elf::link: https://github.com/llvm-mirror/lld/blob/45d5724805314a73fc012377d32865107d2a2abd/tools/lld/lld.cpp#L132

certik commented 5 years ago

It looks like llvmlite only depends on the llvmdev Conda package, which is built from the http://releases.llvm.org/6.0.1/llvm-6.0.1.src.tar.xz tarball. However, the lld library is distributed in the http://releases.llvm.org/6.0.1/lld-6.0.1.src.tar.xz tarball.

@seibert, @alendit, what should be the course of action here? Should lld be wrapped in a separate package from llvmlite, or should llvmlite be made to depend on lld?

seibert commented 5 years ago

I think we would still prefer that llvmlite statically link to lld (a build-time dependency), rather than take a runtime dependency on lld.

certik commented 5 years ago

@seibert yes, I would prefer that too. That way llvmlite is the only dependency. But how do you want to do that technically --- should the llvmdev build-time Conda package dependency be extended to also include lld (i.e., download the separate tarball, unpack it into llvm/tools, etc.)?

seibert commented 5 years ago

If lld can be compiled separately from LLVM, a separate lld conda package makes sense (we can build it and put it in the numba conda channel), and then we can add lld as a build dependency of llvmlite, just like llvmdev.

certik commented 5 years ago

@seibert unfortunately all the documentation as well as practical examples (e.g., Spack) build lld and other packages such as clang in the llvm/tools directory inside the main llvm tree. I think that's the (only) supported way.

As such, should we add this to the llvmdev package?

seibert commented 5 years ago

I think that would be a reasonable thing to consider. How much is this likely to increase the size of llvmlite when everything is linked? We do want to keep an eye on the size of that package.

certik commented 5 years ago

@seibert ok, I'll see if I can get it working. Regarding the size, we have to try it and see.

If it is too big, then we can split it into another package, say llvmlite-lld, that would only contain the Python bindings to lld; people who want it can install it, and people who don't can ignore it.

seibert commented 5 years ago

I think that is a good plan. Thanks for looking into this!

certik commented 5 years ago

Looks like conda-forge figured out how to make lld a separate package: https://github.com/conda-forge/lld-feedstock. So I'll try to use the packages from conda-forge for now.

certik commented 5 years ago

All right, so #419 implements this; here is an example of how to use it: https://github.com/numba/llvmlite/pull/419#issuecomment-439546565.

appcypher commented 4 years ago

What is the update on this? Is there a temporary workaround one can use now? The best solution I can think of right now is to bundle lld as a library and make a subprocess call, or else go the ctypes way.

overdev commented 4 years ago

@appcypher I have the same questions. Apparently it is implemented and merged, but I have yet to find the docs on how things are supposed to be done.

certik commented 4 years ago

#419 is not merged yet. I don't have the time to work on this right now, so if anyone wants to take it over and finish it, that would be great.

gmarkall commented 2 years ago

As per https://github.com/numba/llvmlite/pull/419#issuecomment-998057666 - if anyone would like to take over the completion of #419 (@overdev / @appcypher / @matthieugouel / @alendit perhaps?) - please do let me know (further directions in the linked comment).