Closed dazza-codes closed 3 years ago
To preface this comment - the purpose of this packaging hack is to build an AWS lambda layer and minimize the size of the layer to fit as many sci-libs into it as possible. The hack below will break the python venv site-packages for any normal use and should not be used for common venv purposes - it's only used here for illustrative purposes.
It's a nasty hack, but after the package installations are complete, a project test suite passes when using the results of this bash function:
hack_shared_libs () {
    # $1: the site-packages directory of the venv being repackaged
    site=$1
    mkdir -p "${site}/shared_libs"
    if [ -d "${site}/rasterio" ]; then
        export GDAL_DATA="${site}/shared_libs/gdal_data"
        export PROJ_DATA="${site}/shared_libs/proj_data"
        # start from the data directories bundled with rasterio
        mv "${site}"/rasterio/gdal_data "${site}/shared_libs/"
        mv "${site}"/rasterio/proj_data "${site}/shared_libs/"
        # merge every other gdal_data directory into the shared one, then remove it
        # (the unquoted $(find ...) assumes paths without whitespace)
        for d in $(find "${site}" -type d -name 'gdal_data'); do
            if [ "$d" != "$GDAL_DATA" ]; then
                rsync -auq "$d"/ "$GDAL_DATA"/
                rm -rf "$d"
                #ln -s "$GDAL_DATA" "$d"
            fi
        done
        # same treatment for the proj_data directories
        for d in $(find "${site}" -type d -name 'proj_data'); do
            if [ "$d" != "$PROJ_DATA" ]; then
                rsync -auq "$d"/ "$PROJ_DATA"/
                rm -rf "$d"
                #ln -s "$PROJ_DATA" "$d"
            fi
        done
        # move the bundled .so files into one directory and leave symlinks
        # behind so that each package still finds its own *.libs path
        export SHARED_LIBS="${site}/shared_libs/libs"
        mkdir -p "${SHARED_LIBS}"
        rsync -auq "$site"/rasterio.libs/ "$SHARED_LIBS"/
        rm -rf "$site"/rasterio.libs
        ln -s "$SHARED_LIBS" "$site"/rasterio.libs
        rsync -auq "$site"/Fiona.libs/ "$SHARED_LIBS"/
        rm -rf "$site"/Fiona.libs
        ln -s "$SHARED_LIBS" "$site"/Fiona.libs
    fi
}
It still contains duplicate libs because the libs are almost the same but they have different file names:
$ ls -l /tmp/tmp_venv_3nAFAw/lib/python3.6/site-packages/shared_libs/
total 12
drwxr-xr-x 2 joe joe 4096 Oct 30 18:53 gdal_data
drwxr-xr-x 2 joe joe 4096 Oct 30 18:53 libs
drwxr-xr-x 2 joe joe 4096 Oct 30 18:53 proj_data
$ ls -l /tmp/tmp_venv_3nAFAw/lib/python3.6/site-packages/shared_libs/libs/
total 69660
-rwxr-xr-x 1 joe joe 35656 Oct 30 18:37 libaec-f0d4887b.so.0.0.10
-rwxr-xr-x 1 joe joe 3532904 Oct 30 18:37 libcurl-ea538880.so.4.4.0
-rwxr-xr-x 1 joe joe 3532912 Oct 30 18:37 libcurl-fiona-ea538880.so.4.4.0
-rwxr-xr-x 1 joe joe 222320 Oct 30 18:37 libexpat-09c47d4c.so.1.6.8
-rwxr-xr-x 1 joe joe 172944 Oct 30 18:37 libexpat-fiona-c4a93fc7.so.1.6.8
-rwxr-xr-x 1 joe joe 23787528 Oct 30 18:37 libgdal-044c25e5.so.20.5.4
-rwxr-xr-x 1 joe joe 21884960 Oct 30 18:37 libgdal-fiona-9fe15c06.so.20.5.4
-rwxr-xr-x 1 joe joe 323632 Oct 30 18:37 libgeos_c-a68605fd.so.1.13.1
-rwxr-xr-x 1 joe joe 323640 Oct 30 18:37 libgeos_c-fiona-a68605fd.so.1.13.1
-rwxr-xr-x 1 joe joe 2240704 Oct 30 18:37 libgeos--no-undefined-b94097bf.so
-rwxr-xr-x 1 joe joe 2240712 Oct 30 18:37 libgeos--no-undefined-fiona-b94097bf.so
-rwxr-xr-x 1 joe joe 4236544 Oct 30 18:37 libhdf5-4377e0cf.so.103.1.0
-rwxr-xr-x 1 joe joe 186152 Oct 30 18:37 libhdf5_hl-92c1cdd8.so.100.1.2
-rwxr-xr-x 1 joe joe 342720 Oct 30 18:37 libjpeg-3fe7dfc0.so.9.3.0
-rwxr-xr-x 1 joe joe 342720 Oct 30 18:37 libjpeg-fiona-3fe7dfc0.so.9.3.0
-rwxr-xr-x 1 joe joe 58800 Oct 30 18:37 libjson-c-5f02f62c.so.2.0.2
-rwxr-xr-x 1 joe joe 58808 Oct 30 18:37 libjson-c-fiona-5f02f62c.so.2.0.2
-rwxr-xr-x 1 joe joe 1822440 Oct 30 18:37 libnetcdf-07221d8a.so.13.1.1
-rwxr-xr-x 1 joe joe 205616 Oct 30 18:37 libnghttp2-11cb20b8.so.14.17.1
-rwxr-xr-x 1 joe joe 205624 Oct 30 18:37 libnghttp2-fiona-11cb20b8.so.14.17.1
-rwxr-xr-x 1 joe joe 378776 Oct 30 18:37 libopenjp2-8f6da918.so.2.3.0
-rwxr-xr-x 1 joe joe 281944 Oct 30 18:37 libpng16-898afbbd.so.16.35.0
-rwxr-xr-x 1 joe joe 281952 Oct 30 18:37 libpng16-fiona-898afbbd.so.16.35.0
-rwxr-xr-x 1 joe joe 453488 Oct 30 18:37 libproj-cd06b982.so.12.0.0
-rwxr-xr-x 1 joe joe 453488 Oct 30 18:37 libproj-fiona-cd06b982.so.12.0.0
-rwxr-xr-x 1 joe joe 1421520 Oct 30 18:37 libsqlite3-bc0a2dd7.so.0.8.6
-rwxr-xr-x 1 joe joe 1259400 Oct 30 18:37 libsqlite3-fiona-25a4bc97.so.0.8.6
-rwxr-xr-x 1 joe joe 18760 Oct 30 18:37 libsz-53d02de5.so.2.0.1
-rwxr-xr-x 1 joe joe 783120 Oct 30 18:37 libwebp-fbd93615.so.7.0.5
-rwxr-xr-x 1 joe joe 85656 Oct 30 18:37 libz-a147dcb0.so.1.2.3
-rwxr-xr-x 1 joe joe 85664 Oct 30 18:37 libz-fiona-a147dcb0.so.1.2.3
It might help if the -fiona- infix were dropped from the lib names (although there may be good reasons for it, such as avoiding library version conflicts). When fiona uses essentially the same version of a library that rasterio also uses (e.g. libz), this hacked consolidation into a shared-libs directory might work: if libz-a147dcb0.so.1.2.3 is binary-equivalent to libz-fiona-a147dcb0.so.1.2.3 and both used the same lib name, the shared-libs hack should leave just one of these files in site-packages. This is a nasty hack because it is done after installation. It would be better if some kind of shared-libs dependency were available to the CI systems and packaging of any project that requires it, so that CI and packaging for rasterio and fiona could be tested against it and rely on it for package distributions.
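Whether two such files really are binary-equivalent can be checked before trying to merge them. The following is only a minimal sketch, assuming GNU coreutils sha256sum is available; find_identical_libs is a hypothetical helper name and not part of the hack above. Note that in the listing above most rasterio/fiona pairs differ in size by a few bytes (e.g. 3532904 vs. 3532912 for libcurl), so a byte-level comparison would report those pairs as different; only pairs such as libjpeg and libproj have equal sizes.

```shell
# Hypothetical helper: print groups of byte-identical files found directly
# inside the directory given as $1 (one "<sha256>  <path>" line per file).
find_identical_libs () {
    libs_dir=$1
    # sha256sum prints "<hash>  <path>"; sorting puts equal hashes next to
    # each other, and awk prints only the lines whose hash occurs more than once
    find "$libs_dir" -maxdepth 1 -type f -exec sha256sum {} + \
        | sort \
        | awk '{
            if ($1 == prev_hash) {
                if (!printed) print prev_line   # first member of the group
                print $0
                printed = 1
            } else {
                printed = 0
            }
            prev_hash = $1
            prev_line = $0
        }'
}
```

For example, find_identical_libs "${site}/shared_libs/libs" would print nothing if no two files in that directory have the same content, which is a hint that the renamed copies cannot simply be collapsed into one file.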
Some details depend on how the .so libs are linked (absolute vs. relative paths). The shapely libgeos gets broken by the same hack, e.g.
# this one is from shapely
$ ldd /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shared_libs/libs/libgeos_c-a68605fd.so.1.13.1
linux-vdso.so.1 (0x00007ffc90b6b000)
libgeos--no-undefined-b94097bf.so => not found
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f86577e4000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8657446000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8657055000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f8656e3d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8657dc3000)
# it is OK while still in the shapely package installation
$ ldd /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shapely/.libs/libgeos_c-a68605fd.so.1.13.1
linux-vdso.so.1 (0x00007ffdeb936000)
libgeos--no-undefined-b94097bf.so => /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shapely/.libs/./libgeos--no-undefined-b94097bf.so (0x00007fb2a0abe000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb2a0735000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb2a0397000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb29ffa6000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb29fd8e000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb2a1138000)
The fiona libgeos seems to move OK, with its relative links preserved:
$ ldd /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shared_libs/libs/libgeos_c-fiona-a68605fd.so.1.13.1
linux-vdso.so.1 (0x00007ffc633f1000)
libgeos--no-undefined-fiona-b94097bf.so => /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shared_libs/libs/./libgeos--no-undefined-fiona-b94097bf.so (0x00007fd23c5f1000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fd23c268000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd23beca000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd23bad9000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fd23b8c1000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd23cc6b000)
The failure to resolve some symbols can be hacked by setting LD_LIBRARY_PATH, e.g. the following works OK:
$ export LD_LIBRARY_PATH=/tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shared_libs/libs
$ find /tmp/tmp_venv_Z9YFAp/lib/python3.6/site-packages/shared_libs/libs/ -iname "*.so.*" | while read lib_name; do ldd -r "$lib_name" 2>&1; done
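The ldd loop above prints everything ldd reports for every library. A narrower check, sketched here under the assumption that ldd marks an unresolved dependency with "=> not found" (as in the shapely output shown earlier), could list only the broken ones. Both function names are hypothetical helpers introduced for illustration.

```shell
# Hypothetical helper: read ldd output on stdin and print the name of each
# dependency that ldd reported as "not found".
unresolved_deps () {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical helper: run ldd on every *.so* file directly inside $1 and
# print "<library>: <missing dependency>" for each unresolved dependency.
report_unresolved () {
    libs_dir=$1
    find "$libs_dir" -maxdepth 1 -type f -name '*.so*' | while read -r lib; do
        # stdin is redirected so ldd cannot consume the list of file names
        ldd "$lib" 2>/dev/null </dev/null | unresolved_deps | while read -r dep; do
            printf '%s: %s\n' "$lib" "$dep"
        done
    done
}
```

Running report_unresolved on the shared libs directory with and without LD_LIBRARY_PATH set would show whether the workaround actually resolves every dependency.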
We're not going to do this. I think combining rasterio and fiona is a more practical solution. And I'm not ready to take that on either.
See https://github.com/OSGeo/gdal/issues/3060
Exactly what a shared-libs solution could be is not entirely clear, but it would need to provide the opportunity for rasterio/fiona to use them in CI/CD systems. Ideally, it might provide something like a pip optional extra pattern for installation of rasterio version X along with binary manylinux wheels for gdal version A, B or C (i.e. a complete matrix that would be consistent with CI/CD matrix builds).
e.g. duplicate binary libs when rasterio is installed along with fiona, pyproj, shapely and other sci-libs:
libgeos
libgdal
libcurl
libsqlite3
libz
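A list like the one above could be produced mechanically. The sketch below assumes the naming pattern visible in the earlier listing (a base name, an optional -fiona infix, an 8-character hex hash, then .so and a version); both function names are hypothetical, and libraries bundled under a different naming scheme would not be grouped correctly.

```shell
# Hypothetical helper: reduce a bundled file name such as
# "libcurl-fiona-ea538880.so.4.4.0" to its base library name ("libcurl").
base_lib_name () {
    printf '%s\n' "$1" \
        | sed -E 's/\.so(\..*)?$//; s/-[0-9a-f]{8}$//; s/-fiona$//'
}

# Hypothetical helper: print each base library name that is bundled more
# than once anywhere below the directory given as $1.
list_duplicate_libs () {
    find "$1" -type f -name 'lib*.so*' | while read -r path; do
        base_lib_name "$(basename "$path")"
    done | sort | uniq -d
}
```

Calling list_duplicate_libs on a site-packages directory prints one line per library name that occurs in more than one bundled copy; it says nothing about whether the copies are the same version or binary-compatible.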