conda-forge / cdt-builds


libudev from systemd-libs-*-cos7 is overlinked, results in dynamically loading incompatible libraries from system #48

Closed ryanvolz closed 3 years ago

ryanvolz commented 3 years ago

This is a fun one. It started with this import error after building the gnuradio package for linux-aarch64 and then attempting to do the same with the gnuradio-osmosdr package:

import: 'gnuradio.qtgui'
Traceback (most recent call last):
  File "/drone/src/build_artifacts/gnuradio-osmosdr_1621138893275/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.6/site-packages/gnuradio/qtgui/__init__.py", line 19, in <module>
    from .qtgui_python import *
ImportError: /lib64/libz.so.1: version `ZLIB_1.2.9' not found (required by /drone/src/build_artifacts/gnuradio-osmosdr_1621138893275/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/lib/python3.6/site-packages/gnuradio/qtgui/../../../../././libpng16.so.16)

So importing some Qt stuff was loading libpng16.so.16, which in turn didn't find its required ZLIB_1.2.9 (the version provided by the conda-forge package) because it was loading /lib64/libz.so.1 from the system instead of from the conda environment. Eventually I tried setting LD_DEBUG=libs and saw the following relevant snippets. Early on we see:

      3886: find library=libz.so.1 [0]; searching
      3886:  search path=/usr/lib64/elfutils        (RUNPATH from file /lib64/libdw.so.1)
      3886:   trying file=/usr/lib64/elfutils/libz.so.1
      3886:  search cache=/etc/ld.so.cache
      3886:   trying file=/lib64/libz.so.1

followed later by:

      3886: find library=libpng16.so.16 [0]; searching
      3886:  search path=$PREFIX/lib/python3.8/site-packages/PyQt5/../../../.:$PREFIX/lib/python3.8/site-packages/PyQt5/../../..        (RPATH from file $PREFIX/lib/python3.8/site-packages/PyQt5/../../../libQt5Core.so.5)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/PyQt5/../../.././libpng16.so.16

and then the error:

      3886: /lib64/libz.so.1: error: version lookup error: version `ZLIB_1.2.9' not found (required by $PREFIX/lib/python3.8/site-packages/PyQt5/../../.././libpng16.so.16) (fatal)

So the system path is coming from libdw.so.1. Searching the logs for that exposes the true culprit, libudev.so.1:

      3886: find library=libudev.so.1 [0]; searching
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libsndfile.so.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././././libudev.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././.       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libgnuradio-blocks.so.3.9.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libudev.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../libgnuradio-osmosdr.so.0.2.0)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libudev.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../..       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/osmosdr_python.cpython-38-aarch64-linux-gnu.so)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../libudev.so.1
      3886:  search path=$PREFIX/bin/../lib     (RPATH from file python)
      3886:   trying file=$PREFIX/bin/../lib/libudev.so.1
      3886:  search cache=/etc/ld.so.cache
      3886:   trying file=/lib64/libudev.so.1
      3886: 
      3886: find library=libcap.so.2 [0]; searching
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libsndfile.so.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././././libcap.so.2
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././.       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libgnuradio-blocks.so.3.9.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libcap.so.2
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../libgnuradio-osmosdr.so.0.2.0)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libcap.so.2
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../..       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/osmosdr_python.cpython-38-aarch64-linux-gnu.so)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../libcap.so.2
      3886:  search path=$PREFIX/bin/../lib     (RPATH from file python)
      3886:   trying file=$PREFIX/bin/../lib/libcap.so.2
      3886:  search cache=/etc/ld.so.cache
      3886:   trying file=/lib64/libcap.so.2
      3886: 
      3886: find library=libdw.so.1 [0]; searching
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libsndfile.so.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././././libdw.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././.       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libgnuradio-blocks.so.3.9.1)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../././libdw.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../.     (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/../../../libgnuradio-osmosdr.so.0.2.0)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../.././libdw.so.1
      3886:  search path=$PREFIX/lib/python3.8/site-packages/osmosdr/../../..       (RPATH from file $PREFIX/lib/python3.8/site-packages/osmosdr/osmosdr_python.cpython-38-aarch64-linux-gnu.so)
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/osmosdr/../../../libdw.so.1
      3886:  search path=$PREFIX/bin/../lib     (RPATH from file python)
      3886:   trying file=$PREFIX/bin/../lib/libdw.so.1
      3886:  search cache=/etc/ld.so.cache
      3886:   trying file=/lib64/libdw.so.1

To summarize: libudev.so.1, provided by the systemd-libs CDT for cos7, pulls in both libdw.so.1 and libcap.so.2 from the system. libdw.so.1 in turn causes /lib64/libz.so.1 to be loaded instead of the conda-forge libz, which breaks other libraries that expect the conda-forge version.
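
The same reading of the trace can be done mechanically. Below is a minimal sketch, not part of the original investigation, that scans LD_DEBUG=libs output and reports every library whose last "trying file" line points outside the environment prefix, along with the files whose RPATH/RUNPATH were consulted. The function name `system_resolved` and the `$PREFIX` placeholder are assumptions for illustration, and the sketch assumes the last path tried for a lookup is the one the loader actually opened.

```python
import re

# Patterns for the three kinds of LD_DEBUG=libs lines used in the trace above.
FIND_RE = re.compile(r"find library=(\S+) \[\d+\]; searching")
TRY_RE = re.compile(r"trying file=(\S+)")
FROM_RE = re.compile(r"\((?:RPATH|RUNPATH) from file ([^)]+)\)")


def system_resolved(log_text, prefix="$PREFIX"):
    """Map soname -> (last path tried, files whose RPATH/RUNPATH were searched).

    Only lookups whose last tried path lies outside `prefix` are returned.
    Assumption: the last "trying file" line of a lookup is the file loaded.
    """
    lookups = {}
    current = None
    for line in log_text.splitlines():
        found = FIND_RE.search(line)
        if found:
            current = found.group(1)
            lookups[current] = {"path": None, "from": []}
            continue
        if current is None:
            continue
        origin = FROM_RE.search(line)
        if origin:
            lookups[current]["from"].append(origin.group(1))
        tried = TRY_RE.search(line)
        if tried:
            lookups[current]["path"] = tried.group(1)
    return {
        lib: (info["path"], info["from"])
        for lib, info in lookups.items()
        if info["path"] is not None and not info["path"].startswith(prefix)
    }


# Lines copied from the trace in this issue.
SAMPLE = """\
      3886: find library=libz.so.1 [0]; searching
      3886:  search path=/usr/lib64/elfutils        (RUNPATH from file /lib64/libdw.so.1)
      3886:   trying file=/usr/lib64/elfutils/libz.so.1
      3886:  search cache=/etc/ld.so.cache
      3886:   trying file=/lib64/libz.so.1
      3886: find library=libpng16.so.16 [0]; searching
      3886:   trying file=$PREFIX/lib/python3.8/site-packages/PyQt5/../../.././libpng16.so.16
"""

for lib, (path, origins) in system_resolved(SAMPLE).items():
    print(lib, "->", path, "searched on behalf of", origins)
```

The loader writes LD_DEBUG output to stderr by default, so the log can be captured by redirecting stderr of the failing import to a file and passing its contents to the function.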

That led me to the question of why this happens with the CentOS 7 CDT and did not happen earlier with the CentOS 6 CDT (libudev-devel). It looks like this is a casualty of bringing libudev into the systemd tree and build system: before that (cos6), libudev didn't link to libcap.so.2 and libdw.so.1 at all, and afterwards it did. The most recent systemd versions have corrected this, so libudev no longer includes those links. Judging by systemd's build system source code through this transition, it seems libudev was never intended to link to those libraries, since it doesn't actually use them; they were included by mistake by virtue of being used elsewhere in systemd.

As far as I can tell, none of the newer fixed versions of systemd have made it to CentOS 7. The current systemd-libs-cos7-*-219 CDT provides libudev.so.1.6.2, and the overlinking problem has been fixed from at least libudev.so.1.6.11 (from systemd-libs-239, CentOS 8 rpm). ldd confirms this: the fixed version no longer lists those libraries, while the current version reports them as unused direct dependencies (and more?):

$ ldd -u -r libudev.so.1.6.2
Unused direct dependencies:
        /lib/x86_64-linux-gnu/librt.so.1
        /lib/x86_64-linux-gnu/libcap.so.2
        /lib/x86_64-linux-gnu/libm.so.6
        /usr/lib/x86_64-linux-gnu/libdw.so.1
        /lib/x86_64-linux-gnu/libdl.so.2
        /lib64/ld-linux-x86-64.so.2
$ ldd -u -r libudev.so.1.6.11
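
For checking several libraries this way, the listing can be reduced to a list of sonames. The following is a minimal sketch that assumes `ldd -u` output shaped like the listing above; the function name `unused_direct_deps` is hypothetical, and any extra diagnostics that `-r` might print after the section are not handled.

```python
import os


def unused_direct_deps(ldd_output):
    """Return the paths listed under 'Unused direct dependencies:'."""
    deps = []
    in_section = False
    for line in ldd_output.splitlines():
        stripped = line.strip()
        if stripped == "Unused direct dependencies:":
            in_section = True
        elif in_section and stripped:
            deps.append(stripped)
    return deps


# Output copied from the libudev.so.1.6.2 listing in this issue.
OLD_OUTPUT = """\
Unused direct dependencies:
        /lib/x86_64-linux-gnu/librt.so.1
        /lib/x86_64-linux-gnu/libcap.so.2
        /lib/x86_64-linux-gnu/libm.so.6
        /usr/lib/x86_64-linux-gnu/libdw.so.1
        /lib/x86_64-linux-gnu/libdl.so.2
        /lib64/ld-linux-x86-64.so.2
"""

sonames = [os.path.basename(path) for path in unused_direct_deps(OLD_OUTPUT)]
print(sonames)
```

An empty result corresponds to the libudev.so.1.6.11 case, where ldd prints nothing.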

I managed a temporary workaround for this by adding the elfutils (for libdw.so.1) and libcap packages as host dependencies of gnuradio. In that case, loading libudev.so.1 finds the conda-provided versions of those libraries and so the conda version of libz is also used.

I think it is likely that others will encounter this problem whenever conda-forge transitions to CentOS 7, so I'd like to see if there is a better solution. I don't imagine that using a CentOS 8 RPM that has a fixed libudev would work or be the right solution. Maybe it would be possible to replace the libudev portion of the systemd-libs CDT with a proper conda-forge package. The best solution might be to modify the current libudev.so.1.6.2 during the CDT packaging and strip the unused libraries. Something like:

patchelf --remove-needed libcap.so.2 --remove-needed libdw.so.1 --output libudev.so.1.6.2.fixed libudev.so.1.6.2
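
If this were scripted as part of packaging, the command could be assembled from a list of unused sonames. This is only a sketch of how that might look: the helper name `patchelf_remove_needed` is hypothetical, it only builds the argument list (using the `--remove-needed` and `--output` options from the command above), and it does not run patchelf or touch any file.

```python
import shlex


def patchelf_remove_needed(library, unused, output):
    """Build a patchelf argument list that drops the given DT_NEEDED entries."""
    cmd = ["patchelf"]
    for soname in unused:
        cmd += ["--remove-needed", soname]
    cmd += ["--output", output, library]
    return cmd


cmd = patchelf_remove_needed(
    "libudev.so.1.6.2",
    ["libcap.so.2", "libdw.so.1"],
    "libudev.so.1.6.2.fixed",
)
# Print the command for review; actually running it (for example with
# subprocess.run(cmd, check=True)) is left to the caller.
print(shlex.join(cmd))
```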

I could probably come up with a PR that does that, if that would be acceptable and assuming it would work. Let me know if I've missed anything and what the best path forward might be. Thanks!

beckermr commented 3 years ago

We can add custom code to strip the links in the CDT builds. We'll need to implement support for per-CDT build number bumps, which has been on my list but is not done yet.

However, I'd first like the rest of @conda-forge/core to weigh in on this.

kkraus14 commented 3 years ago

Using patchelf to strip the dependencies sounds like a reasonable fix to me until there are builds we can access.

Defer to others as far as how to handle per CDT bumps as I'm not familiar with the current CDT machinery.

beckermr commented 3 years ago

> Defer to others as far as how to handle per CDT bumps as I'm not familiar with the current CDT machinery.

Yep. This task is on me.

isuruf commented 3 years ago

Changing the CDT will not help as they are used only for building and this issue is about loading libraries.

beckermr commented 3 years ago

Hrrrmmmmmm. Yeah you are right, but I am confused.

> I managed a temporary workaround for this by adding the elfutils (for libdw.so.1) and libcap packages as host dependencies of gnuradio. In that case, loading libudev.so.1 finds the conda-provided versions of those libraries and so the conda version of libz is also used.

Maybe this isn't a workaround but the right solution? We should prefer conda packages to CDTs anyways.

isuruf commented 3 years ago

There are 2 options

  1. use conda packages instead of CDTs
  2. link in libz.so to qtgui_python before libudev so that the conda zlib is loaded first and keep your fingers crossed that conda zlib is newer than system zlib.
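
Option 2 is a link-order change in the compiled module. A rough runtime analogue, which is a different technique and untested for this case, is to load the environment's libz before the import that pulls in libudev, relying on the loader reusing an already loaded library with the same soname. The sketch below assumes the library lives under `<prefix>/lib`; the helper names `conda_lib_path` and `preload` are hypothetical.

```python
import ctypes
import os
import sys


def conda_lib_path(soname, prefix=sys.prefix):
    """Path where the environment is assumed to keep its shared libraries."""
    return os.path.join(prefix, "lib", soname)


def preload(soname, prefix=sys.prefix):
    """Load the environment's copy of `soname` early, if it exists.

    Returns the ctypes handle, or None when the file is not present.
    """
    path = conda_lib_path(soname, prefix)
    if not os.path.exists(path):
        return None
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)


# Intended use (not executed here): call preload("libz.so.1") before
# importing the module that triggers the libudev lookup.
```

As with the link-order approach, this only helps if the environment's zlib is at least as new as what the system libraries expect.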

ryanvolz commented 3 years ago

Thanks everyone! I'll think about making a proper conda package for libudev. It didn't seem like it would be too crazy when I was checking out their build setup...

beckermr commented 3 years ago

I am going to close this issue for now. I have merged changes to the code to allow individual build number bumps in the future.