Open tomeichlersmith opened 3 years ago
Since you mention that it is a "drastic change" and since you mention you're still "collecting notes," I'll take the liberty to provide some more background that is hopefully helpful in the decision making. :)
> That way, the python module would be pre-compiled (i.e. faster)
Actually, there is no such thing as pre-compiling python bindings. The only thing that gets compiled is the recipe for constructing the bindings, not the bindings themselves, so there is no (and cannot be any) performance benefit. In fact, it may even be detrimental.
If you care mainly about CPU performance, boost.python is probably also the slowest binder around. The fastest in most cases is swig in "builtin" mode if you are using Python3, as there are optimized paths for it in the CPython interpreter for all simple cases. For most complex cases, cppyy will beat it, assuming at least Python3.8 (which has optimized call paths for closures). The absolute fastest is cppyy on PyPy, but then you have to switch Python interpreters. OTOH, cppyy has the most memory overhead (because of Cling parsing, you have to budget for an extra 100MB of memory over other binders; similarly, PyPy's memory overhead is also higher than CPython's, by about 30MB).
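Since these rankings depend on the kind of calls you make, it may be worth measuring per-call overhead on your own functions. Below is a minimal sketch using only the standard library's `timeit`; `per_call_ns` and `py_add` are hypothetical names, and `py_add` is a pure-Python stand-in that you would replace with the bound C++ function under test. The `lambda` wrapper adds a small constant cost of its own, so the numbers are only meaningful relative to each other.

```python
import timeit


def per_call_ns(func, *args, number=100_000, repeat=5):
    """Return the best-of-`repeat` average time per call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(*args))
    # Taking the minimum reduces noise from other activity on the machine.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


def py_add(a, b):
    # Pure-Python stand-in; swap in the same function exposed by each binder.
    return a + b


if __name__ == "__main__":
    print(f"py_add: {per_call_ns(py_add, 1, 2):.0f} ns/call")
```

Running the same harness against one function exposed through each candidate binder gives a like-for-like comparison for your call signatures.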
Functionality-wise, I recommend pybind11 over boost.python. In style and use, they're pretty much the same, but pybind11 is a lot more advanced and I suspect at this point far more widely used. Also, if you look in their respective repos, you'll see which receives the most developer cycles these days: boost.python is really just in "keep alive" mode with any new project choosing pybind11 over it. Also, pybind11 has no run-time and installs from PyPI and conda, really simplifying life for any cross-platform software stack.
As for ROOT, although cppyy still has some ROOT heritage (and will always use Cling), what ROOT uses internally is a fork of cppyy (and it's quite a bit behind master, e.g. it does not have the optimized paths mentioned above, but it also disables optimizations for the Cling JIT, inserts expensive null-checking, etc.). You can install cppyy directly from the normal channels such as PyPI and conda-forge. If LDMX already uses ROOT for its I/O and/or analysis needs (looks like it), then sure, use the ROOT version of cppyy. But otherwise it can be used independently and it doesn't tie you to ROOT later on. (OTOH, if you are already loading ROOT into the process anyway, cppyy's memory overhead becomes a non-issue.)
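For reference, standalone use of cppyy (no ROOT involved) looks roughly like the sketch below. `cppyy.cppdef` and `cppyy.gbl` are part of cppyy's API; the `sketch` namespace, the `Counter` class, and the `count_with_cppyy` helper are made up for illustration. The import is guarded so the snippet returns `None` instead of failing where cppyy is not installed.

```python
try:
    import cppyy
except ImportError:  # cppyy is optional for this sketch
    cppyy = None

# Hypothetical C++ class, JIT-compiled by Cling when passed to cppdef.
CPP_SOURCE = """
namespace sketch {
class Counter {
 public:
  explicit Counter(int start) : value_(start) {}
  void add(int n) { value_ += n; }
  int value() const { return value_; }
 private:
  int value_;
};
}
"""

_defined = False  # avoid redefining the class if the helper is called twice


def count_with_cppyy():
    """Create a C++ Counter from Python, or return None without cppyy."""
    global _defined
    if cppyy is None:
        return None
    if not _defined:
        cppyy.cppdef(CPP_SOURCE)
        _defined = True
    counter = cppyy.gbl.sketch.Counter(2)  # bindings are built lazily here
    counter.add(3)
    return counter.value()


if __name__ == "__main__":
    print(count_with_cppyy())
```

For an existing code base you would typically point cppyy at headers and shared libraries with `cppyy.include` and `cppyy.load_library` rather than passing source to `cppdef`.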
Thank you for the correction and this extra detail @wlav ! :tada: I appreciate the input :)
@omar-moreno and I have chatted about this idea on-and-off and I am really excited about it. I've done some surface-level research and just wanted to open an issue to keep track of my notes.
Goal
The long-term goal would be to get rid of ConfigurePython and fire altogether. We would instead call Process::run directly from a script run inside python after doing all the necessary configuration, i.e. instead of having a "configuration script", we would have a running script that is really similar.

Another goal that would be awesome if we can get it to work is to have a Python parent class for both the Cpp pythonizations and potentially new Python processors, i.e. something like the following:
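A minimal pure-Python sketch of the shared-parent idea, under stated assumptions: `Processor`, `Process`, `EnergySum`, and `run_example` are hypothetical names and not the Framework's API, events are plain dicts, and `Process.run` only stands in for the C++ `Process::run`. In the real design the C++ processors would be attached through a binder instead of being written in Python.

```python
class Processor:
    """Hypothetical common parent for wrapped C++ and pure-Python processors."""

    def __init__(self, name):
        self.name = name

    def process(self, event):
        # Subclasses (or pythonized C++ classes) override this.
        raise NotImplementedError


class EnergySum(Processor):
    """Example pure-Python processor; 'hits' is an assumed event field."""

    def process(self, event):
        event["energy_sum"] = sum(event["hits"])


class Process:
    """Hypothetical stand-in for the C++ Process that owns the sequence."""

    def __init__(self, sequence=None):
        self.sequence = list(sequence or [])

    def run(self, events):
        # Apply every processor to every event, in order.
        for event in events:
            for processor in self.sequence:
                processor.process(event)
        return events


def run_example():
    """A 'running script' in place of a configuration script."""
    p = Process(sequence=[EnergySum("esum")])
    return p.run([{"hits": [1.0, 2.5]}, {"hits": []}])


if __name__ == "__main__":
    print(run_example())
```

The point of the common parent is that `Process` does not need to know whether an entry in the sequence is implemented in C++ or in Python, as long as it provides `process`.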
Both of these would be run through python instead of fire:
Tools
From my research, Boost.Python and cppyy both have their pros and cons.
Long story short, it seems like Boost.Python would be the way to go if we were rebuilding from the ground up. That way, the python module would be pre-compiled (i.e. faster) and we would have more control over its behavior. However, I am not interested in rebuilding from the ground up, so I am interested in using cppyy to "attach" our C++ objects to pythonic ones, similar to how ROOT does it (versions > 6.18ish).
Plan
This is a drastic change to the Framework codebase, so I think this would necessarily be a long way off. Since this is also a big change in terms of user interaction, we would need to be patient with merging anything like this in, and potentially make a release separating our current method from this more pythonic one. For now, I am just collecting notes and links and maybe dipping my toe into the coding pool.