mabruzzo opened 4 months ago
Upon reflection, I don't think the `dlopen` approach would work in a robust, forwards-compatible manner. The problem is that a number of the macros we need to invoke the HDF5 functions are defined in headers, such as `H5F_ACC_RDONLY`, `H5P_DEFAULT`, `H5S_ALL`, `H5T_C_S1`, `H5T_STD_I32BE`, `H5T_STD_I64BE`, `H5T_IEEE_F32BE`, `H5T_IEEE_F64BE`, `H5T_STD_B8BE`, `H5T_NATIVE_INT`, `H5T_NATIVE_LLONG`, `H5T_NATIVE_FLOAT`, `H5T_NATIVE_DOUBLE`, and `H5T_NATIVE_LDOUBLE`.
These are all API-stable, but not ABI-stable -- the associated underlying values are technically allowed to change from one version to the next.[^1]
Consequently, resolving the hdf5-distribution problem leaves us with the following options:

- Ship an SDist (`pip install pygrackle` will download the sdist and will try to build grackle and pygrackle locally)
- There are some other, crazier solutions too (probably not worth going into)

For now, I say we ship an SDist and punt on the rest of the hdf5 issues.
[^1]: In principle, we could maintain a small database where we record the values of each of these macros for each of hdf5's minor version (e.g. 1.6, 1.8, 1.10, 1.12, 1.14, etc.), but I don't really think we want to do that...
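For concreteness, the footnote's database idea might look something like the following sketch. The macro values shown are illustrative placeholders, not audited against real HDF5 headers:

```python
# Hypothetical sketch of the per-minor-version macro database mentioned in
# the footnote. The values below are ILLUSTRATIVE PLACEHOLDERS -- a real
# table would need to be audited against each HDF5 release's headers.
HDF5_MACRO_VALUES = {
    "1.12": {"H5F_ACC_RDONLY": 0x0000, "H5P_DEFAULT": 0},
    "1.14": {"H5F_ACC_RDONLY": 0x0000, "H5P_DEFAULT": 0},
}

def lookup_macro(hdf5_version, macro_name):
    """Map a full version string (e.g. '1.14.3') to its minor series and
    look up the recorded macro value for that series."""
    minor_series = ".".join(hdf5_version.split(".")[:2])
    try:
        return HDF5_MACRO_VALUES[minor_series][macro_name]
    except KeyError:
        raise ValueError(
            f"no recorded value for {macro_name} in HDF5 {minor_series}"
        )
```

The maintenance burden is exactly what makes this unattractive: every new HDF5 minor series requires a new, manually-verified row.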
Some additional questions:

- Should we ship `grackle_defs.pxd`?
- For now, I'm inclined to leave `grackle_defs.pxd` out (if it's something people want, we can revisit).

As I talk this out, I posit we should ship as little as possible in a normal wheel. To experiment with shipping more, we should distribute Grackle and pygrackle separately on conda-forge.
I'm late to this, and it seems that you have this largely under control. My main comment to make is that I agree with your assertion that `dlopen` is not a workable solution.

I believe that we should ship `grackle_defs.pxd`, but I am willing to concede that this may not be universal. I have in the past found myself wanting `.pxd` files and I think I remember actually using the grackle pxd's in a separate project. However, that is likely to be the extreme end and I have no problem not using grackle wheels in that unlikely event.
(Note: this comment was heavily revised for clarity)
> I believe that we should ship `grackle_defs.pxd`, but I am willing to concede that this may not be universal. I have in the past found myself wanting `.pxd` files and I think I remember actually using the grackle pxd's in a separate project. However, that is likely to be the extreme end and I have no problem not using grackle wheels in that unlikely event.
I'm happy to do that. That makes some other decisions easier.
As I understand it (and please correct me if I'm wrong), the cythonize functionality needs to know where to find `grackle_defs.pxd` and any headers referenced therein. I think we would need to provide a function like numpy.get_include[^1] or mpi4py.get_include.

There is a wrinkle here. There are actually 2(ish) ways to build pygrackle with the new build-system, and the location of the Grackle headers depends on which way is used. The problem is that the Grackle headers will be in different locations in the two cases (we can't just hardcode them into Pygrackle).

Another wrinkle (since `get_include` only returns a string) is that Pygrackle's analogous function would need to provide a list of header directories: the directory containing the `.pxd` and the directory(ies) containing Grackle's public headers.

For concreteness, I'm talking about adding functionality that would allow an Extension Module compiled against Pygrackle to have a `setup.py` file that looks like:
```python
from setuptools import Extension
import pygrackle
# ...
ext = Extension('my-extension-name',  # ...
                include_dirs=pygrackle.get_include_list())
```
At the moment, it's not clear what the best way to do this sort of thing is... If I generated a toml or ini file, do you know if this is generally a use-case for importlib.resources?
EDIT: it turns out scikit-build-core has some trouble right now with editable installs and importlib.resources. To implement this functionality, I would instead make a template-file called `__config__.py.in` and have the build-system inject build-data into this file. numpy and scipy do the exact same thing (they use meson functionality with a direct CMake counterpart).
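As a rough sketch of the `__config__.py.in` idea (the placeholder names and keys here are invented for illustration), the build-system would substitute values into a template, much like CMake's `configure_file()`:

```python
# Sketch of the __config__.py.in approach: the build-system replaces
# @-delimited placeholders at build time. The key names are hypothetical.
TEMPLATE = '''\
# auto-generated by the build-system; do not edit
GRACKLE_INCLUDE_DIR = "@GRACKLE_INCLUDE_DIR@"
PYGRACKLE_PXD_DIR = "@PYGRACKLE_PXD_DIR@"
'''

def render_config(template, build_data):
    """Mimic CMake's configure_file(): replace each @KEY@ placeholder
    with build_data[KEY] and return the rendered module source."""
    out = template
    for key, value in build_data.items():
        out = out.replace(f"@{key}@", value)
    return out
```

The rendered text would be written out as `pygrackle/__config__.py` during the build, so the installed package can report where its headers live.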
[^1]: Aside: that webpage references the fact that they ship a pkg-config file for their numpy headers (i.e. `numpy.pc`). From inspecting the file, it looks like that doesn't provide much info. But maybe we should also provide a `pygrackle.pc` file (especially since #182 exposes machinery for a `grackle.pc` file).
> I'm late to this, and it seems that you have this largely under control. My main comment to make is that I agree with your assertion that `dlopen` is not a workable solution.
>
> I believe that we should ship `grackle_defs.pxd`, but I am willing to concede that this may not be universal. I have in the past found myself wanting `.pxd` files and I think I remember actually using the grackle pxd's in a separate project. However, that is likely to be the extreme end and I have no problem not using grackle wheels in that unlikely event.
@matthewturk I've thought about this some more, and now I think I can articulate my concerns a little better.
Off the top of my head, I can name 3 popular python packages that include `.pxd` files inside their source directory and are linked against standalone software. I think they correspond to 3 representative cases:
First, there is `numpy`:

- It ships `.pxd` files with the goal of supporting numpy's C-API. The pxd files don't reference any of the header files from the external private dependencies. Instead, they only reference the headers associated with numpy's C-API and python's headers.
- `numpy.get_include` only needs to provide the path to the directory with numpy's c-api headers.

Next, there is `mpi4py`:
- It ships `.pxd` and `.pxi` files that definitely reference the `<mpi.h>` header (as well as mpi4py's internal c-headers).
- `mpi4py.get_include` only provides the path to the internal headers. End users trying to compile an extension module are entirely on their own with making sure that the appropriate paths for the mpi-headers are known to the build-system.

Finally, there is `h5py`:
- While the `.pxd` files are internally used to implement `h5py`, they are explicitly excluded from the wheel.

So where does `pygrackle` fit?
If the contents of Pygrackle's `.pxd` file(s) only referenced internal extension types, without mention of any grackle-headers, I would have 0 concern with including the `.pxd` file(s) in the wheel. In that case, we would effectively be in the `numpy` scenario.

In reality, the contents of Pygrackle's `.pxd` directly wrap the grackle headers.
This puts us somewhere between the `mpi4py` and `h5py` cases. If we were firmly in the `mpi4py` case (i.e. the external library wrapped by the `.pxd` files was always externally provided, and consequently the header location was clearly the installer's responsibility), I would be less concerned. However, Pygrackle firmly lies in the `h5py` case. For that reason, we are in this weird in-between:
- When `grackle` is embedded within the `Pygrackle` wheel, we the developers are responsible for maintaining all headers within `Pygrackle`.
- When `Pygrackle` is built against an external `Grackle` installation, we can provide the locations where the headers were located during installation, but it's ultimately the installer's responsibility to make sure that the headers don't get deleted or moved.

I guess I'm just worried about providing an inconsistent experience (when it comes to python-installation stuff I would like things to "just work"). I'm also a little worried that I may be missing other obvious cases (related to python packaging) where things could go wrong.
I would feel a lot more comfortable with all of this if we knew of a similar popular package (in the `h5py` scenario) that does distribute the `.pxd` files in the wheel. Is anyone aware of one?
I'm also willing to accept that these concerns are overblown. One might argue that anyone who is actually linking an Extension Module should be able to understand all of these issues.[^1] Furthermore, unless someone is linking against an external classic-build, they need to go out of their way to delete/move the headers, while ensuring that Pygrackle continues to work.[^2]
A few other thoughts:

- Since `grackle_defs.pxd` ONLY references grackle headers (and doesn't reference any Pygrackle functions/types), what if we ship it as a part of grackle, inside the include directory? I'm not totally sure this really solves our problems... (we might also want to rename it `grackle_defs.pxi` to avoid problems).
- Maybe we provide a `pygrackle.get_include_dirs` function that tries to provide informative errors if things go wrong. Maybe we have the function throw an error by default if `grackle` isn't embedded within the wheel, with an informative error message that explains the potential ways that things can go wrong and how to override this behavior by maybe overriding a default function-argument?[^3] But I might just be over-complicating things.

[^1]: with that said, this doesn't really help an arbitrary end-user of that Extension Module.
[^2]: unless you really know what you're doing (in which case, you probably can understand these challenges), it seems extremely unlikely that you would want to build an external-grackle-copy, link that to pygrackle, and then delete everything other than pygrackle and the libgrackle.so.
[^3]: For any standard package-manager situation, where `pygrackle` and `grackle` are distributed as separate packages (e.g. perhaps the conda packages?), we would probably want ways to avoid raising the error-messages.
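To make the `pygrackle.get_include_dirs` idea concrete, here is a hedged sketch. The module-level constants, the default behavior, and the error text are all hypothetical; in practice the constants would be injected by the build-system:

```python
# Hypothetical sketch of pygrackle.get_include_dirs(). The module-level
# constants would be injected by the build-system; the values here are
# placeholders for illustration.
_GRACKLE_EMBEDDED = False                              # set by the build-system
_PXD_DIR = "/path/to/pygrackle"                        # placeholder
_GRACKLE_INCLUDE_DIRS = ["/path/to/grackle/include"]   # placeholder

def get_include_dirs(allow_external=False):
    """Return the header directories needed to cythonize against pygrackle.

    By default, raise an informative error when grackle was not embedded in
    the wheel (we can't guarantee the external headers still exist); pass
    allow_external=True to override that behavior.
    """
    if not _GRACKLE_EMBEDDED and not allow_external:
        raise RuntimeError(
            "pygrackle was built against an external Grackle installation; "
            "the recorded header locations may have been moved or deleted. "
            "Pass allow_external=True to use the recorded paths anyway."
        )
    return [_PXD_DIR, *_GRACKLE_INCLUDE_DIRS]
```

A conda build (where grackle and pygrackle are separate packages) would presumably flip the default so the error is never raised.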
I think you make a compelling case that it is too complex to ship the `.pxd`s. I am uncertain about the actual difficulties of distributing grackle via pypi, but it is worthwhile to say that linking requires extra effort on the part of the person doing the linking if they want to use the pypi package. Going ahead with the simplest solution with the lowest probability of errors is the appropriate course.
I wanted to record a few thoughts that I've had about this topic over the past few weeks (hopefully this is semi-coherent)
Distributing binary wheels from PyPI may be a lot more doable than I originally thought! My primary concern has always been with cross-compiling hdf5 (I also have a much better understanding of the cross-compilation process than I used to). I was concerned because the scripts used by h5py and pytables for this purpose are fairly complex (things seemed tricky in the julia packaging ecosystem as well).
With that said, it looks like a lot of effort has gone into making hdf5 version `1.14.y.z` easier to cross-compile (e.g. see HDFGroup/hdf5#1203 and other issues linked therein).
In light of the fact that we don't initially plan to ship the cython `.pxd`s, and given the general issues described in the next section related to using a precompiled grackle library for most simulations, I'm inclined to just ship the bare minimum in the wheels (at least to start). I'm definitely happy to revisit this (especially if somebody has a particular application in mind), but I think it makes sense to be conservative on this choice initially.
Obstacles to distributing a precompiled `libgrackle.[so|dylib|a]` for general-use (e.g. for compiling a simulation code)

These obstacles are more obvious in hindsight, and I have already talked about some of this elsewhere in this thread. But I think it would be helpful to restate some of it for concreteness (especially now that I appreciate the full scope of the issue). General-use refers to using the pre-compiled library for compiling a simulation code.
The primary obstacle relates to the fact that a precompiled version of grackle needs to be linked against an ABI-compatible version of its runtime dependency, hdf5 -- if you compile against hdf5 `1.x.y.z`, problems arise if you link against a library where some version-digit (other than `z`) has changed.[^1] (This is also true for the fortran runtime library, used by the internal routines, but that is less of an issue.[^2])
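The compatibility rule just described is simple enough to state in code. This is only an illustration of the rule as stated above, not a full account of HDF5's actual compatibility guarantees:

```python
def hdf5_abi_compatible(built_against, linked_against):
    """Apply the rule described above: a runtime hdf5 '1.x.y.z' is only
    safe when every version-digit other than the final one matches the
    version grackle was compiled against. (A simplification, used here
    purely for illustration.)"""
    built = built_against.split(".")
    linked = linked_against.split(".")
    return built[:-1] == linked[:-1]
```

So swapping `1.14.3` for `1.14.5` would be fine under this rule, while swapping in anything from the `1.12` series would not.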
Let's consider the simplest general-purpose grackle distribution mechanism that would work on the largest number of systems: simply distribute a tarball containing:
In fact, this distribution mechanism would work for developing any pure-C/C++ downstream application that doesn't directly link to hdf5 or its dependencies (or link to dependencies that directly link to hdf5/hdf5's dependencies). It can also be used with a Fortran code that is compiled with a recent-ish gfortran version. In other words, it works with the examples shipped with Grackle or simulation codes that don't use an hdf5 output format.
Problems arise if you try to compile a downstream application that links against the precompiled grackle binary and links against an incompatible hdf5 version. In more detail, these problems arise because Linux and most (all?) mainstream unix-like operating systems[^3] construct a global table of symbol-names, and it's problematic to be linked to two libraries that provide conflicting definitions of a symbol with a given name.
This issue is hardly unique to Grackle. There are 2 options that avoid refactoring (but neither is particularly attractive):

1. Include everything necessary in the tarball to have downstream applications compile against the included version of libhdf5. There are 2 issues:
2. Distribute Grackle via package managers like apt/dpkg, brew, conda[^4], dnf, etc. (i.e. we don't provide a tarball). In this scenario, we precompile grackle against the version of hdf5 provided by the package manager, and we would require that downstream simulations try to compile against that same hdf5 version. Disadvantages include:
If we ultimately decide this is something worth doing, we could refactor the source code to better support this scenario. Currently, Grackle only has 2 functions that directly reference the HDF5 API: `initialize_UVbackground_data` and `initialize_cloudy_data`.[^6] Basically, we could move the definitions to one of two different locations:
1. We could move these 2 function definitions into a separate library -- let's call it `libgrackle-h5tools.so`. We would always ship `libgrackle-h5tools.so` alongside `libgrackle.[so|a]`, and we would have `libgrackle.so` access these 2 functions by using `dlopen`, which lets us avoid any symbol collisions.[^7]
   - Unlike the usage of `dlopen` described earlier in this thread, this usage would be more robust, since we always know the relative path of the shared library it is used to open.
   - (… the `dlopen`ed library)
2. We could move these 2 function definitions into the public headers and make them inline-functions (so that they are defined within the binary of the simulation-code rather than in grackle). To be able to support fortran bindings, we would need to ship a second version of the grackle-library alongside the main one that fortran codes would explicitly link to (since fortran codes can't explicitly define inline functions).
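The real implementation of option 1 would be C code calling `dlopen()`/`dlsym()`; the same pattern can be sketched in Python with `ctypes`. Here the C math library and its `cos` symbol stand in for `libgrackle-h5tools.so` and the two HDF5-touching functions, since the companion library is hypothetical:

```python
import ctypes
import ctypes.util

# Sketch of the dlopen-by-known-path pattern. In the real design, libgrackle
# would dlopen() a companion library at a path known relative to its own
# location; here we open the C math library as a stand-in.
libm_path = ctypes.util.find_library("m")     # in C: a known relative path
handle = ctypes.CDLL(libm_path or None)       # dlopen()
cos = handle.cos                              # dlsym()
cos.argtypes = [ctypes.c_double]
cos.restype = ctypes.c_double
```

Because the symbols in the `dlopen`ed library are resolved through the returned handle rather than the global symbol table, they can't collide with an incompatible hdf5 linked into the downstream application -- the same reason (per the footnote below) that h5py and pygrackle can currently coexist.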
[^1]: This is true whether you pre-compile grackle as a shared or static library.
[^2]: At one time, I was more concerned about linking libgfortran.so. But, I have since learned that libgfortran.so employs symbol versioning, like the standard c runtime library on linux (glibc), in order to mitigate ABI compatibility issues. If a downstream application contains Fortran-code, that Fortran code would need to be compiled with gfortran 8 or newer (released 6 years ago), which is reasonable. It's worth mentioning that openmp runtime libraries pose a larger problem than hdf5, but that is a different topic that we can ignore right now.
[^3]: As I understand it, this isn't actually an issue on Windows
[^4]: In this case, there would be separate grackle and pygrackle conda packages
[^5]: I don't know enough about the internals of hdf5, but since grackle doesn't depend on mpi, it's plausible this isn't a major issue.
[^6]: This pending debug-tool introduces another one.
[^7]: The CPython interpreter uses dlopen to open extension modules. This is why we can currently use versions of `h5py` and `pygrackle` together that were compiled against different (incompatible) hdf5 versions.
Once #182 and #208 are merged, I think we will be in a good position to consider packaging (to let people install things without manually building from source). This issue primarily exists to get the discussion going.
This Issue primarily focuses on packaging/distributing `Pygrackle`. It also touches on the idea of packaging `Grackle` itself. The latter is mostly discussed as a hypothetical -- it is primarily mentioned to highlight that some challenges are common to both scenarios (and that we may want to pick solutions that would help resolve the issue in both scenarios).

The biggest challenge in most scenarios is the fact that Grackle requires the hdf5 library. A lesser, common issue is convenient distribution of datafiles (this was acknowledged in #210).
Packaging/Distributing `Pygrackle`

I think there are 2 primary things we want to do here:

- … would use `dlopen` with this information to open the hdf5 library at runtime and would initialize the function-pointers in that internal struct
- … `dlopen` has some weird platform-dependent behavior
- … `dlopen` …
Packaging/Distributing `Grackle`
This is all a much lower priority than shipping Pygrackle, but it's worth briefly mentioning that we could also distribute precompiled Grackle binaries. I've seen this sort of thing done alongside new software releases for other open-source projects on GitHub.
Since we will already be doing a lot of work to ship Grackle as a precompiled binary alongside Pygrackle, I actually think this wouldn't be hard to do. Moreover, the new CMake build-system should be compatible with the CPack tool.
Again, this is a hypothetical low-priority idea. But, there are 2 reasons I bring this up:
- What if we give the datafile-fetching script an `if __name__ == "__main__"` section that lets the file be installed alongside Grackle as a self-contained, executable script that serves as a command-line tool for fetching data files?[^1]

[^1]: @matthewturk previously suggested using pooch. Using that module probably wouldn't be ideal for helping in the case of distributing Grackle (without Pygrackle). For portability, it might be better to rely upon functionality from the python standard-library, rather than the pooch package. OR, since distributing Grackle itself is mostly a hypothetical, maybe we should use pooch and limit ourselves to a subset of pooch functionality that we could re-implement later in terms of standard-library functionality (if/when it becomes an issue).
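As a sketch of the standard-library-only alternative to pooch mentioned in the footnote (the URLs, registry, and function names here are all hypothetical), a minimal fetcher only needs `urllib` and `hashlib`:

```python
import hashlib
import urllib.request

# Hypothetical sketch of a standard-library-only datafile fetcher. In a
# real tool, the URL and expected checksum would come from a registry
# shipped alongside Grackle.

def sha256_of(data: bytes) -> str:
    """Return the hex sha256 digest used to validate a downloaded file."""
    return hashlib.sha256(data).hexdigest()

def fetch(url, expected_sha256, dest):
    """Download url to dest, refusing to keep the file if its checksum
    doesn't match the registry entry."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    if sha256_of(data) != expected_sha256:
        raise RuntimeError(f"checksum mismatch for {url}")
    with open(dest, "wb") as f:
        f.write(data)

if __name__ == "__main__":
    # The command-line entry point would parse arguments (e.g. with
    # argparse) and call fetch() for each requested datafile.
    pass
```

Something this small could be installed alongside Grackle as the self-contained script described above, with pooch-style features (caching, a registry file) layered on later if needed.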