rheostat2718 / unladen-swallow

Automatically exported from code.google.com/p/unladen-swallow

Build LLVM JIT as a module #136

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The patch available at

http://codereview.appspot.com/206091

supports building the LLVM JIT as a dynamic extension module, _llvmjit. The
basic idea is that all calls into LLVM get indirected by a pointer. Loading
the extension module sets the pointer; until it is loaded, Python won't do
any JIT.
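
A minimal sketch of that indirection, modelled in Python for brevity (the patch itself works at the C level with a pointer). The names `_jit_hooks`, `register_jit`, and `maybe_compile` are hypothetical and do not come from the patch; they only illustrate the control flow of "no pointer set, no JIT".

```python
# Hypothetical model of the indirection; names are illustrative, not from the patch.
_jit_hooks = None  # stays None until the extension module is loaded


def register_jit(hooks):
    """Called by the (hypothetical) extension module when it is imported."""
    global _jit_hooks
    _jit_hooks = hooks


def maybe_compile(code_obj):
    """Route the call through the hook table; do nothing if the JIT is absent."""
    if _jit_hooks is None:
        return None  # interpreter-only path: nothing is compiled
    return _jit_hooks["compile"](code_obj)
```

In this model the interpreter core never refers to the JIT directly, so it can be built and shipped without it; importing the extension is the only step that makes compilation possible.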

The advantage of this approach is that it allows better unbundling of LLVM
from Python:
- The python interpreter (either python or pythonxy.dll) can be compiled
purely with a C compiler, and doesn't depend on the C++ runtime.
- Linux distributions can provide a "minimal" Python (not including JIT),
and then install JIT as a separate package
- py2exe users can choose not to package LLVM in their distributions.

This patch is a rough proof of concept; I would appreciate it if U-S
developers could enhance it. I can also spend some (limited) time on it
myself.

The biggest flaw in the current implementation is probably that it ignores
position-independent code: if LLVM gets compiled into a shared library,
it should probably also be compiled with --enable-pic. I have tested the
patch only on Linux, where PIC code in shared libraries is not a strict
requirement.

Original issue reported on code.google.com by mar...@v.loewis.de on 13 Feb 2010 at 4:49

GoogleCodeExporter commented 9 years ago
Issue 140 has been merged into this issue.

Original comment by reid.kle...@gmail.com on 20 Feb 2010 at 7:54

GoogleCodeExporter commented 9 years ago
Implemented in r1110. I tested deleting _llvmjit.so and running regrtest.py.
test_llvm.py automatically skips itself because it is unable to import _llvm.
Anyone redistributing Python will no longer need to do a special build to cut
out LLVM and the JIT.
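
The self-skipping behaviour can be sketched with the standard `unittest` module. This is an illustration of the pattern only, not the actual contents of test_llvm.py; the helper `optional_import` and the class `JitSmokeTest` are hypothetical names.

```python
import importlib
import unittest


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# On a build without the JIT support module this is simply None.
_llvm = optional_import("_llvm")


@unittest.skipIf(_llvm is None, "JIT support module is not available")
class JitSmokeTest(unittest.TestCase):
    def test_module_present(self):
        self.assertIsNotNone(_llvm)
```

With this shape the test run reports the JIT tests as skipped rather than failed when the module has been removed.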

Original comment by reid.kle...@gmail.com on 27 Feb 2010 at 6:15

GoogleCodeExporter commented 9 years ago
I want to reopen this issue and discuss this solution further. I'm sorry I
didn't weigh in before it was committed.

After r1110, we now have four different ways of disabling the JIT:

- ./configure --without-llvm
- python -j never ...
- _llvm.set_jit_control("never")
- rm _llvmjit.so

`rm _llvmjit.so` and `--without-llvm` actually disable the JIT in significantly
different ways: `rm _llvmjit.so` leaves a lot of infrastructure in place, such
as dict watchers, that `--without-llvm` removes. That makes me uncomfortable.

Concerning """Linux distributions can provide a "minimal" Python (not including
JIT), and then install JIT as a separate package""", this would seem to imply
that distros will ship a default Python 3 that does not include the JIT. (I
assume that having "sudo apt-get install python3-nojit" actually delete
_llvmjit.so would not be acceptable to downstream packagers; let me know if
that is incorrect.) If the JIT is not on by default for supported platforms, it
will receive far less testing and will be of less benefit to end users.

Have any downstream packagers asked for this (rm _llvmjit.so) as opposed to
doing a new clean build with --without-llvm? From the python-dev thread, David
Malcolm from Red Hat is inclined to turn the JIT on by default in Fedora, which
your patch would seem to preclude.

I'm sympathetic to the case of py2exe, but I'd like to discuss other options
before settling on the approach in r1110. There seem to be two related options:
python.org could do --without-llvm releases; or py2exe could include a
--without-llvm build of Python in the py2exe tool itself (which would save the
python.org release managers from doing twice as many releases).

I'm concerned about the impact on maintainability and feature development that
this patch will have, and indeed is already having. One particular case in
point: I have a patch where I want to use some of LLVM's ADTs
(llvm::SmallPtrSet, specifically). Under the current approach, I would have to
expose every method of every data type I want to use via _PyLlvmFuncs, or else
re-implement the ADT from scratch to avoid pulling in libstdc++. In practice,
this means that the _PyLlvmFuncs struct will grow ever larger as we want to use
more ADTs when implementing optimizations. Having to expose these data
structures via _PyLlvmFuncs seems strictly inferior compared to the status quo
ante.

Given that r1110 continues to break the build, I am going to revert it and its
successor revisions; it is more important that we make forward progress on
other issues and not have all patches blocked on this one change. Let's discuss
this further, and we can resurrect the necessary patches from SVN later.

Original comment by collinw on 3 Mar 2010 at 12:09

GoogleCodeExporter commented 9 years ago
The TLDR version: making the JIT a module inhibits progress today, so we should
punt on it. However, we should keep it in mind as something to think about when
rebasing onto py3k.

In hindsight, I agree, the way that r1110 was implemented introduced many
complications for us. We've gotten a lot of mileage out of being able to stick
hooks into things like dicts and type objects so they can communicate with the
JIT, and the implementation of r1110 made that a pain. So, I think we should
keep it in mind as a possible solution for the future, perhaps when we rebase
onto py3k.

I mainly think that building the JIT as a module that we can load lazily will
be a great way of silencing concerns about startup time and memory usage for
scripts where the JIT is inappropriate. It would be neat if Python just did the
Right Thing: if your code is short-running or IO-bound, none of the code gets
warm, and we never load the JIT or record feedback. Once something gets warm,
we load the JIT and start recording feedback. This also depends on
recompilation, because if we don't record feedback right from the get-go, we're
unlikely to compile everything correctly.
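
A rough sketch of the "load the JIT only once something gets warm" idea. Everything here is an assumption made for illustration: the class `LazyJit`, the `loader` callable standing in for importing the JIT module, and the threshold value are not taken from Unladen Swallow.

```python
HOT_THRESHOLD = 3  # assumed value, chosen only for the example


class LazyJit:
    """Count calls per function and load the JIT backend on first warm code."""

    def __init__(self, loader, threshold=HOT_THRESHOLD):
        self._loader = loader        # stands in for importing the JIT module
        self._threshold = threshold
        self._counts = {}
        self.backend = None          # None means the JIT was never loaded

    def on_call(self, func_name):
        """Record one call; return True once the JIT backend is loaded."""
        count = self._counts.get(func_name, 0) + 1
        self._counts[func_name] = count
        if self.backend is None and count >= self._threshold:
            self.backend = self._loader()  # pay the load cost only now
        return self.backend is not None
```

A short-running script that never reaches the threshold never calls `loader`, which is the startup-time and memory benefit described above. The sketch does not address the recompilation problem: calls made before loading leave no feedback behind.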

I also think moving the JIT into a module makes it easier to ensure that python
built with and without the JIT will remain ABI compatible, because there simply
won't be two versions of the binary to worry about. That just sounds simpler
from a maintenance perspective. I am not a packager or particularly experienced
with maintaining ABI compatibility, though.  :)

Original comment by reid.kle...@gmail.com on 3 Mar 2010 at 5:54

GoogleCodeExporter commented 9 years ago
Re packaging: I can't speak for Debian/Ubuntu, but they do provide a package
called python-minimal, which will get installed on any Debian system and is
intended for use by startup scripts, dpkg pre/post scripts, and the like; see

http://packages.ubuntu.com/de/karmic/python2.6-minimal

It currently is 1.3MB on x86, so I expect Debian maintainers would be unhappy
if it needed to include the JIT. You should ask Matthias Klose if you want to
be certain.

The full python2.6 package currently is (an additional) 2.3MB. Whether or not
the maintainers could agree to include the JIT in that package, I don't know.
It does depend on additional libraries (libdb4.x, libreadline, libsqlite,
libncursesw, libbz2), but I suspect these are all libraries that are present on
a typical Debian installation anyway. Some modules are packaged as separate
packages, though (tk and gdbm, not because of their size, but because of their
dependencies).

Original comment by mar...@v.loewis.de on 3 Mar 2010 at 8:53

GoogleCodeExporter commented 9 years ago
The libllvm2.7 package (on Ubuntu lucid) adds 5.4MB (compressed), 14.7MB
(uncompressed) to the installation. The interpreter is found on the live and
install CDs, so a package of this size would require removing other software
from the CDs. To avoid this on the CD, one could build python3 twice and put
the JIT-enabled python in a python3-jit package, which is not on the CD and
which diverts (renames and replaces) the JIT-less python3 on installation.
Building the JIT as an extension would avoid the trickery with diversions,
reduce the build time (building only once), and leave only one binary installed
on the system.

The JIT should be installed by default with the interpreter, but this can be
handled by dependencies and recommendations as well, without having to put it
into the same package.

I haven't experimented with this, but it might be interesting to disable the
JIT during system start. I'm unsure how to do that, or whether it's worth it.

Original comment by d...@ubuntu.com on 22 Mar 2010 at 5:28