Open jswhit2 opened 2 years ago
Relevant netcdf-c issue: https://github.com/Unidata/netcdf-c/issues/2294
I tried installing the hdf5plugin module and setting HDF5_PLUGIN_PATH to point to its installation directory. This works for zstd, but not for bzip2 and blosc (nc_inq_var_XXX does not recognize them).
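For reference, a hedged sketch of the approach described above. The plugin directory path is hypothetical, and the key constraint is that HDF5 reads HDF5_PLUGIN_PATH when the library is first loaded, so the variable must be set before netCDF4 is imported:

```python
import os

# Hypothetical directory containing the compiled HDF5 filter plugins
# (e.g. wherever hdf5plugin installed its shared objects).
plugin_dir = os.path.expanduser("~/.local/hdf5/plugin")

# Must be set before netCDF4 (and therefore libhdf5) is loaded.
os.environ["HDF5_PLUGIN_PATH"] = plugin_dir

# import netCDF4  # filters found in plugin_dir should now be usable
```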
setup.py has been modified to install the plugins (location specified by the environment variable NETCDF_PLUGIN_DIR) in the package (using data_files). __init__.py then sets HDF5_PLUGIN_PATH to netCDF4.__path__. With this, the new compression options should 'just work' with the binary wheels, without the need to point to an external directory.
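A minimal sketch of that __init__.py mechanism, under the assumption that the filter shared objects are copied directly into the installed package directory (the real package uses netCDF4.__path__; __file__ plays that role here):

```python
import os

# Directory of the installed package, standing in for netCDF4.__path__[0].
_plugin_dir = os.path.dirname(os.path.abspath(__file__))

# Point HDF5 at the bundled filter plugins before libhdf5 is first
# loaded; setdefault avoids clobbering a user-provided value.
os.environ.setdefault("HDF5_PLUGIN_PATH", _plugin_dir)
```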
auditwheel doesn't deal with the plugins correctly, so the wheels for 1.6.0 do not include the plugins on linux.
It is notoriously difficult to deal with plugin systems involving binary dependencies in wheels. This is because wheels have no way to ensure a consistent environment with regard to the surrounding shared libraries coming from the OS and possibly other sources. In the end, this leads to a lot of static linking. This problem is not unique to netcdf4; another big project facing it is GDAL, which supports many different formats via a plugin system. Their solution is to provide a pretty bare-bones wheel, and to offer more comprehensive installations in binary-aware environments, such as conda environments or operating system package managers.
Could that be a model for netcdf4-python as well? I.e. have basic compression support in the wheel, perhaps only zlib, and include a more comprehensive set of compression options in the conda-forge package?
Right now the wheels have support for extra compression filters, but the compression plugins themselves are not included. If there is a conda-forge netcdf plugin package (or a separate plugin wheel) the plugins should work as long as the plugin path env var is set.
netcdf-c 4.9.0 will have extra compression options based on plugins. The python interface now supports these via a compression kwarg to createVariable. In order to use the extra compression options (beyond zlib), the netcdf-c plugins will need to be installed in HDF5_PLUGIN_PATH. How do we provide this capability in the binary wheels on pypi? These wheels include bundled versions of the C libraries. Some options include:
1) figure out how to include the plugin shared objects in the wheels, and set HDF5_PLUGIN_PATH to point to the directory inside the installation;
2) assume the user installs the plugins separately and sets HDF5_PLUGIN_PATH, and just raise an exception if the plugins are not found;
3) create a separate python package that installs the plugins (similar to what h5py does with hdf5plugin).
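Option 2 could be sketched roughly as follows; the function name and the check are illustrative only, not existing netCDF4 API:

```python
import os

def require_hdf5_plugins(filter_name):
    """Illustrative guard for option 2: fail loudly up front if no
    plugin directory is configured, instead of failing later with an
    opaque error inside HDF5 when the filter is first used."""
    path = os.environ.get("HDF5_PLUGIN_PATH")
    if not path or not os.path.isdir(path):
        raise RuntimeError(
            f"compression filter '{filter_name}' requires external HDF5 "
            "plugins; install them and set HDF5_PLUGIN_PATH"
        )
    return path
```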