matthew-brett / delocate

Find and copy needed dynamic libraries into python wheels
BSD 2-Clause "Simplified" License

Exclude argument not excluding library #207

Open Jake-Moss opened 4 months ago

Jake-Moss commented 4 months ago

Describe the bug: When the --exclude argument is given to exclude libarrow and libarrow_python from the wheel, delocate.libsana still reports that it cannot find the libraries instead of ignoring them.

To Reproduce: This fails in our CI using pypa/cibuildwheel@v2.16.5. Our repair-wheel-command is:

[tool.cibuildwheel.macos]
archs = ["x86_64"]
environment = { CC="gcc-12", CXX="g++-12" }
repair-wheel-command = "delocate-wheel -vv --exclude arrow --require-archs {delocate_archs} -w {dest_dir} -v {wheel}"

I've tried plenty of variations, such as --exclude arrow and --exclude libarrow --exclude libarrow_python. The docs and #106 lead me to believe that this should exclude libraries based on substring presence.

Expected behavior: As with auditwheel, libarrow and libarrow_python should be excluded from the repair attempt.

Wheels used: I don't have access to a macOS system to recreate the wheel, but the relevant branch is https://github.com/AequilibraE/aequilibrae/pull/510/. We're attempting to link against the pyarrow Cython and Arrow C++ APIs using Cython.

Platform:

Additional context I've attached the full logs of a failed CI run, the relevant section is Build wheels or the Build wheels on macos-latest/4_Build wheels.txt file. Here's a small section as well

2024-02-27T00:09:05.7946960Z ERROR:delocate.libsana:
2024-02-27T00:09:05.7947870Z @rpath/libarrow_python.dylib not found:
2024-02-27T00:09:05.7950530Z   Needed by: /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp9dd8xlza/wheel/aequilibrae/paths/route_choice.cpython-39-darwin.so
2024-02-27T00:09:05.7951850Z   Search path:
2024-02-27T00:09:05.7952360Z     @loader_path
2024-02-27T00:09:05.7953250Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12/gcc/x86_64-apple-darwin21/12
2024-02-27T00:09:05.7954320Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12/gcc
2024-02-27T00:09:05.7955080Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12
2024-02-27T00:09:05.7957100Z ERROR:delocate.libsana:@rpath/libarrow_python.dylib not found, requested by /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp9dd8xlza/wheel/aequilibrae/paths/route_choice.cpython-39-darwin.so
2024-02-27T00:09:05.7958850Z ERROR:delocate.libsana:
2024-02-27T00:09:05.7959460Z @rpath/libarrow.1500.dylib not found:
2024-02-27T00:09:05.7960930Z   Needed by: /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/tmp9dd8xlza/wheel/aequilibrae/paths/route_choice.cpython-39-darwin.so
2024-02-27T00:09:05.7962320Z   Search path:
2024-02-27T00:09:05.7962830Z     @loader_path
2024-02-27T00:09:05.7963700Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12/gcc/x86_64-apple-darwin21/12
2024-02-27T00:09:05.7964620Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12/gcc
2024-02-27T00:09:05.7965350Z     /usr/local/Cellar/gcc@12/12.3.0/lib/gcc/12

logs_11438.zip

HexDecimal commented 4 months ago

Looks like exclusion is performed too late in the process of finding a library. It's done after rpaths are resolved, which causes the error seen here.

I think copy_filt_func should probably be called in the tree_libs or get_dependencies function. In fact, it should probably be moved there entirely.
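The ordering problem described above can be illustrated with a minimal sketch. This is not delocate's actual code: `make_exclude_filter`, `resolve_rpath`, `find_deps_filter_late`, and `find_deps_filter_early` are hypothetical names, and the filter only assumes that --exclude matches by substring, as the reporter expects.

```python
import os
from typing import Callable, Iterable, List


class DependencyNotFound(Exception):
    """Raised when an @rpath install name cannot be resolved."""


def make_exclude_filter(substrings: Iterable[str]) -> Callable[[str], bool]:
    """Return a predicate that is False for names containing an excluded substring."""
    subs = list(substrings)
    return lambda name: not any(s in name for s in subs)


def resolve_rpath(install_name: str, rpaths: Iterable[str]) -> str:
    """Resolve an @rpath/ install name against candidate directories (hypothetical)."""
    if not install_name.startswith("@rpath/"):
        return install_name
    tail = install_name[len("@rpath/"):]
    for rpath in rpaths:
        candidate = os.path.join(rpath, tail)
        if os.path.exists(candidate):
            return candidate
    raise DependencyNotFound(f"{install_name} not found")


def find_deps_filter_late(
    install_names: Iterable[str], rpaths: List[str], keep: Callable[[str], bool]
) -> List[str]:
    # Resolution runs first, so an excluded but unresolvable library still raises.
    resolved = [resolve_rpath(name, rpaths) for name in install_names]
    return [path for path in resolved if keep(path)]


def find_deps_filter_early(
    install_names: Iterable[str], rpaths: List[str], keep: Callable[[str], bool]
) -> List[str]:
    # Filtering on the raw install name means excluded libraries are never resolved.
    return [resolve_rpath(name, rpaths) for name in install_names if keep(name)]
```

With the install names from the log above and an "arrow" filter, the late variant raises on @rpath/libarrow_python.dylib, while the early variant skips both Arrow libraries without trying to locate them.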

mattip commented 3 weeks ago

We may have come across this in scipy. We have this situation:

> I think copy_filt_func should probably be called in the tree_libs or get_dependencies function. In fact, it should probably be moved there entirely.

I don't know the code base. Is this something I could easily try?

HexDecimal commented 3 weeks ago

> • A "bare" call of delocate complains that these shared objects use two libgfortran shared objects: the one from the gfortran compiler and the one from openblas.
>
> • Using --exclude "solves" the complaint, but does not properly -change the dylib loader command in arpack: it leaves the full build path of libgfortran in place. While delocate does pack a libgfortran into the .dylibs directory of the wheel, the arpack shared object does not see that one.

This sounds like a complex case slightly unrelated to this issue. It sounds correct that --exclude would skip the libgfortran dylib and then not know how to deal with the path to it afterwards. What is your expectation for this situation?

If the extra paths you're referring to are rpaths then --sanitize-rpaths will remove any absolute and relative paths from arpack and any other bundled dylibs, leaving only the special @loader_path paths. This might resolve the issue when combined with --exclude, but I'm not completely sure.

Otherwise the dylibs need to link to the same libgfortran before they are delocated, either by compiling them to point to a single library in the first place or by modifying the dylibs with install_name_tool before running Delocate.
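The two options above can be sketched roughly as follows. Both helpers are hypothetical illustrations, not part of Delocate: `sanitize_rpaths` only models the behavior as described in this comment (keeping @loader_path entries), and `change_install_name_cmd` builds an `install_name_tool -change` invocation. The paths in the usage comment are made-up placeholders.

```python
import subprocess
from typing import Iterable, List


def sanitize_rpaths(rpaths: Iterable[str]) -> List[str]:
    """Keep only rpaths relative to the loading binary (assumed behavior)."""
    return [path for path in rpaths if path.startswith("@loader_path")]


def change_install_name_cmd(binary: str, old: str, new: str) -> List[str]:
    """Build the install_name_tool command that rewrites one dependency path."""
    return ["install_name_tool", "-change", old, new, binary]


def change_install_name(binary: str, old: str, new: str) -> None:
    """Run the command; this only works on macOS, where install_name_tool exists."""
    subprocess.run(change_install_name_cmd(binary, old, new), check=True)


# Hypothetical usage, pointing a shared object at a single libgfortran
# before running Delocate:
# change_install_name(
#     "build/_arpack.so",
#     "/path/to/compiler/libgfortran.5.dylib",
#     "/path/to/openblas/libgfortran.5.dylib",
# )
```

Building the command separately from running it keeps the sketch checkable on any platform; only `change_install_name` needs a macOS toolchain.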

mattip commented 3 weeks ago

> What is your expectation for this situation?

Right. When trying to write out the algorithm in words, I came to the conclusion that you are correct: --exclude is not the right option for the scipy use case.

> modifying the dylibs with install_name_tool before running Delocate

This was the solution we chose.

Sorry for hijacking this issue.

mattip commented 3 weeks ago

I am still willing to try to make a PR to solve this issue, if help is needed.