bluescarni / heyoka.py

Python library for ODE integration via Taylor's method and LLVM
Mozilla Public License 2.0
69 stars 9 forks source link

[FEATURE] Allow users to install from source when binary is unavailable #186

Open agseaton opened 2 weeks ago

agseaton commented 2 weeks ago

Is your feature request related to a problem? Please describe. It would be helpful to allow users to install from source, for example where they are working on a platform where there aren't wheels. Unfortunately this isn't currently possible with pip.

Describe the solution you'd like I think setting up a pyproject.toml file for the project would solve this. I tried to set one up myself using scikit-build-core as the backend, see here: https://github.com/agseaton/heyoka.py/commit/bdd96bb9c214956ad468a2bf4670d2cde99e7d5a

I had to make some changes to the CMake scripts, and I tried to write it in a way which hopefully wouldn't affect your tools for doing the CI etc.

It successfully builds the core module, but unfortunately when I try import heyoka it fails with the following error:

ImportError: /path/to/venv/lib/python3.11/site-packages/heyoka/core.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN6heyoka3v296detail12t_event_implIfLb0EE13finalise_ctorEN6tanuki2v14wrapINS1_14callable_ifaceIbJRNS0_15taylor_adaptiveIfEEiEEEXtlNS5_6configINS1_14empty_callableENS1_18callable_ref_ifaceIbJSA_iEEEEEtlNS5_6detail11config_baseEELm16ELm16ELb0ELb0ELNS5_9wrap_ctorE2ELNS5_14wrap_semanticsE0ELb1ELb1ELb1EEEEEfNS0_15event_directionE

This seems to be an issue with finding a function in the heyoka library. I tried running nm on libheyoka.so, and found a very similar function with a slightly different signature:

_ZN6heyoka3v296detail12t_event_implIfLb0EE13finalise_ctorEN6tanuki2v14wrapINS1_14callable_ifaceIbJRNS0_15taylor_adaptiveIfEEiEEEXtlNS5_6configINS1_14empty_callableENS1_18callable_ref_ifaceIbJSA_iEEEEELm16ELm16ELb0ELb0ELNS5_9wrap_ctorE2ELNS5_14wrap_semanticsE0ELb1ELb1ELb1EEEEEfNS0_15event_directionE

The demangled versions are:

heyoka.v29.detail.t_event_impl<float, false>.finalise_ctor(tanuki.v1.wrap<heyoka.v29.detail.callable_iface<boolean, heyoka.v29.taylor_adaptive<float>&, int>, tanuki.v1.config<heyoka.v29.detail.empty_callable, heyoka.v29.detail.callable_ref_iface<boolean, heyoka.v29.taylor_adaptive<float>&, int> >{16ul, 16ul, false, false, (tanuki.v1.wrap_ctor)2, (tanuki.v1.wrap_semantics)0, true, true, true}>, float, heyoka.v29.event_direction)
heyoka.v29.detail.t_event_impl<float, false>.finalise_ctor(tanuki.v1.wrap<heyoka.v29.detail.callable_iface<boolean, heyoka.v29.taylor_adaptive<float>&, int>, tanuki.v1.config<heyoka.v29.detail.empty_callable, heyoka.v29.detail.callable_ref_iface<boolean, heyoka.v29.taylor_adaptive<float>&, int> >{tanuki.v1.detail.config_base{}, 16ul, 16ul, false, false, (tanuki.v1.wrap_ctor)2, (tanuki.v1.wrap_semantics)0, true, true, true}>, float, heyoka.v29.event_direction)

I'm not sure why this is failing to work! Perhaps something to do with how I am building the main heyoka library? Any thoughts?

As a side-note, if this could be made to work, it would allow heyoka.py to be published on spack.

bluescarni commented 2 weeks ago

Hey @agseaton ! Thanks a lot for looking into this, I have been meaning to try to understand how to integrate scikit-core in the heyoka.py build system but never found the time to learn properly. I will probably have questions at some point :)

Regarding the missing symbol, it seems to me like the only difference between the two demangled names you posted is the additional presence of tanuki.v1.detail.config_base{} in the second name. This is coming from a little library I have written and then included in heyoka, called tanuki.

My first guess is that the config_base struct in tanuki is missing proper symbol visibility attributes. Currently it looks like this:

https://github.com/bluescarni/heyoka/blob/master/include/heyoka/detail/tanuki_impl.hpp#L826

Could you try to change

struct config_base {

into

struct TANUKI_VISIBLE config_base {

Note that this will require recompiling both heyoka and heyoka.py with the new code.

It is not clear to me why this problem appears only when using scikit-core though...

agseaton commented 2 weeks ago

I gave that a try, but I'm getting the same issue.

Incidentally, I don't think this is related to scikit-build-core. If I compile heyoka.py outside of scikit-build-core and then try to import the core module from python it's the same story.

Any other ideas?

agseaton commented 2 weeks ago

Okay, I did more digging and noticed that the symbols in the heyoka library I had built were the same as those bundled with the wheels on PyPI.

So that points to something funny going on with my build of heyoka.py.

I noticed that cmake was choosing clang (18.1.6) as the compiler. When I changed it to gcc (11.4.0) it all just works.

That solves my problem, though it's not great that the build would fail with clang. My guess here is that there's perhaps some compiler flag that needs to be changed?

bluescarni commented 2 weeks ago

I noticed that cmake was choosing clang (18.1.6) as the compiler. When I changed it to gcc (11.4.0) it all just works.

Are you mixing up different compilers when building heyoka and heyoka.py?

It is indeed strange that CMake on Linux would pick clang as default compiler instead of GCC.

In the CI pipeline, we have builds that compile everything with clang and we never had the missing symbol issue you are experiencing. See e.g.:

https://github.com/bluescarni/heyoka.py/actions/runs/10054958430/job/27790582141#step:3:3215

However, all clang builds in the CI are on OSX, so perhaps you discovered a genuine issue that only appears with clang on Linux.

agseaton commented 2 weeks ago

Ah I think I realised what is going on here. I wasn't aware that C++ libraries produced by different compilers are not compatible, and that typically everything is exposed through a C interface via extern "C", which is not what you're doing here.

In my case I'd used spack to build heyoka itself, which used my system's gcc to build it and all of the dependencies (including LLVM and therefore clang). When I then tried to build heyoka.py (outside of spack), cmake saw the LLVM installation and used clang.

To prevent people like myself who aren't well versed in these details from making mistakes, perhaps it would be helpful to put a warning about this in the documentation for heyoka? Perhaps on one (or both) of these pages: https://bluescarni.github.io/heyoka/install.html https://bluescarni.github.io/heyoka/basic_tutorials.html

Also, I see you have a warning that maybe alludes to this issue in the installation instructions in the python documentation under installing with pip, but it would be helpful to make this clearer. I think it would also be good to mention in the section on building from source.

Thinking about how this would impact packaging via pip or spack I see some issues. Fundamentally the idea with spack is that it allows users lots of flexibility in how things are built. So you can build package X with gcc and package Y with clang if you like. I don't know if it has a mechanism for enforcing that dependencies are built with a compiler that has binary compatibility, but I could ask on their issue tracker. I guess with pip it might be reasonable to just hope that anyone who's attempting to build from source uses the same compiler for heyoka and heyoka.py.

bluescarni commented 2 weeks ago

@agseaton this is a hard problem to tackle in general.

For a complex C++ codebase, I do not think that going with the extern "C" route is feasible. In heyoka's case at least, the wrapping code would certainly balloon in size with respect to the current pybind11-based approach.

There are steps that in theory could be taken in heyoka/heyoka.py's build systems to warn the user about the use of different compilers/standard libraries/etc. For instance, one could come up with some unique name mangling scheme that encodes in a string the platform setup used to build heyoka (e.g., compiler, compiler version, standard library, etc.), export this string in the CMake config file and then, when compiling heyoka.py, re-compute the same mangled name and compare it with the one provided in heyoka's CMake config file. If they differ, an error could be raised.

I am not sure this would be worth the effort though.

The fundamental underlying issue is that heyoka.py is a multi-language package and the only way to distribute it without incurring in these sorts of issues is via a multi-language package manager that is able to correctly reason about the C++ parts of heyoka.py (such as, e.g., a Linux distribution package manager like apt/pacman/rpm or something like conda/vcpkg/conan).

I am well aware of the fact that a large part of the Python community does not care about these details and they (understandably) want to be able to just pip install, and I would like to support this usage as much as possible as well (so your work towards making the heyoka.py build system more conformant with modern Python packaging practices is very important and appreciated).

I am not following closely the evolution of the Python packaging ecosystem, but my impression is that is tacitly understood that, when installing packages from source, it is up to the user to ensure that they are consistently using the same compilation toolchain, and I do not think that the Python packaging tools have a way to detect these sorts of binary incompatibilities.

So at this time probably the most realistic step we could take is to, as you suggest, add warnings about these potential issues in the installation from source sections. I will take care of doing this.

agseaton commented 2 weeks ago

Yes I agree, this seems like a tricky one, and I take your point that there's no perfect solution. By the way, I wasn't trying to suggest that you should go down the extern "C" route. I can see how that would be much more cumbersome!

And yes, it does seem that the lowest hanging fruit would just be to add some warnings in the documentation to avoid building heyoka and heyoka.py with different compilers.

Plus, in any case, you've provided pre-built binaries for most major platforms so it shouldn't be necessary in most cases for people to compile from source.

Also, I have set up a draft spack package spec for heyoka.py (see here: https://github.com/agseaton/spack/commit/4bb1a9ead102326f2c9afae2038fcda06d32f57a). It patches heyoka.py with the changes to the build system I linked to above in order to allow spack to compile it. Would you be happy for me to submit that for inclusion in spack once I have finished working on it?

bluescarni commented 2 weeks ago

Hi @agseaton !

Yes I agree, this seems like a tricky one, and I take your point that there's no perfect solution. By the way, I wasn't trying to suggest that you should go down the extern "C" route. I can see how that would be much more cumbersome!

And yes, it does seem that the lowest hanging fruit would just be to add some warnings in the documentation to avoid building heyoka and heyoka.py with different compilers.

:+1:

Plus, in any case, you've provided pre-built binaries for most major platforms so it shouldn't be necessary in most cases for people to compile from source.

Also, I have set up a draft spack package spec for heyoka.py (see here: agseaton/spack@4bb1a9e). It patches heyoka.py with the changes to the build system I linked to above in order to allow spack to compile it. Would you be happy for me to submit that for inclusion in spack once I have finished working on it?

If I have understood correctly, with the pyproject.toml that you wrote and the small modifications to the CMakeLists.txt, you are now able to build heyoka.py from source using the standard Python tooling right?

If that is the case, I would actually ask if you could consider testing out if (in addition to the compilation from source) the production of a binary wheel also works with your modifications.

heyoka.py has a kind of ad-hoc/bespoke/hacky way of producing binary wheels on Linux, which uses CMake to setup some setup.py and wheels.cfg files - I cannot remember exactly right now, but it is sort of the opposite of what scikit-core does (which is, calling CMake from the standard Python tooling).

I would be really happy if we could toss away all this custom machinery and instead rely on scikit-core + your pyproject.toml (+ the small modifications to CMakeLists.txt). I have been thinking for a while that we should offer precompiled wheels on more platform than just Linux and your work in this area would be very valuable.

agseaton commented 1 week ago

Yep, that's the idea. And sure, I tested it just now and I'm able to build a wheel via pip3 wheel <heyoka_source_dir>.

So yes, I think it should be possible to simplify the scripts you have. I think you're referring to this script?

bluescarni commented 1 week ago

Yep, that's the idea. And sure, I tested it just now and I'm able to build a wheel via pip3 wheel <heyoka_source_dir>.

That's great to hear!

So yes, I think it should be possible to simplify the scripts you have. I think you're referring to this script?

That script will certainly have to be adapted.

But mainly what I am referring to is the fact that currently the process to build the wheels starts here:

https://github.com/bluescarni/heyoka.py/blob/main/heyoka/CMakeLists.txt#L4

I.e., from within CMake we are generating setup.py and setup.cfg files, which, as far as I have understood, are now obsolete/deprecated. Then, the process of building the wheels goes through these stages (see the lines starting here: https://github.com/bluescarni/heyoka.py/blob/main/tools/gha_manylinux.sh#L85):

So as you see things are done in a sort-of-backwards fashion, using deprecated python tooling in a way that is very non-standard from the point of view of modern Python packaging.

What I would like to achieve with your modifications would be the following:

Basically, my goal is to have a build system which is friendly both to multi-language software distributions (i.e., conda or a Linux distro package manager) and to Python users who instead prefer to use pip etc. to manage their Python packages.

agseaton commented 1 week ago

That sounds reasonable to me! I think most of what you're suggesting involves removing redundant lines from scripts so I can't imagine it would be too much work.

I'm guessing there might be some pain involved in making sure the CI handles this okay but hopefully that wouldn't be too bad.

And yes, I think the auditwheel repair step is still required.