scikit-build / scikit-build-core

A next generation Python CMake adaptor and Python API for plugins
https://scikit-build-core.readthedocs.io
Apache License 2.0

Q: is wheel repairing in the scope of this project? #624

Open wojdyr opened 8 months ago

wojdyr commented 8 months ago

I just spent some time trying to correctly set relative RPATH from CMake, which is non-trivial. This made me wonder: is repairing of issues with dynamic library dependencies (which can be done by tools such as auditwheel, delocate, delvewheel) potentially in the scope of scikit-build-core?

(apologies if this is the wrong place for a general question)

LecrisUT commented 8 months ago

Do you mean setting or deleting RPATH? The latter is already done automatically by CMake, but the former is quite tricky because you don't know the installation path at build-time. You can use BUILD_RPATH_USE_ORIGIN and define the RPATHs relative to the wheel directories, but I am not sure how that works on windows systems.

wojdyr commented 8 months ago

I meant setting. Yes, it is tricky and differs between systems. CMake has separate rules for build-time and install-time rpaths; BUILD_RPATH_USE_ORIGIN applies only to the build-time RPATH. On Windows there is no equivalent of rpath, so dll files need to be copied, or perhaps os.add_dll_directory() can be used. Dealing with such things might not be in the scope of this project. That said, I've seen that maturin, which is another build backend, contains a re-implementation of auditwheel, and repairing macOS and Windows wheels is also planned (issues 1133 and 1134 there).
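A minimal sketch of the os.add_dll_directory() idea mentioned above, for illustration only. It assumes a hypothetical layout in which the package ships its DLLs in a `lib` subdirectory; the helper name `register_dll_dir` and the `lib` directory are assumptions, not something scikit-build-core provides.

```python
import os
import sys
from pathlib import Path


def register_dll_dir(package_dir, subdir="lib"):
    """Add <package_dir>/<subdir> to the Windows DLL search path.

    Returns the handle from os.add_dll_directory() (call .close() on it
    to undo the registration), or None when there is nothing to do.
    The "lib" subdirectory is a hypothetical layout.
    """
    # os.add_dll_directory() requires an absolute path.
    dll_dir = (Path(package_dir) / subdir).resolve()
    if sys.platform == "win32" and dll_dir.is_dir():
        return os.add_dll_directory(str(dll_dir))
    # On Linux and macOS the loader relies on RPATH/RUNPATH instead.
    return None
```

Such a helper would typically be called from the package's `__init__.py`, before the extension module is imported, for example `register_dll_dir(Path(__file__).parent)`.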

LecrisUT commented 8 months ago

Oh you're right, actually install rpath is even easier: https://stackoverflow.com/a/58495612. But I don't know how it copes if each target needs to point to a different folder.

From what I heard about Windows, it is sufficient if the executable (and probably any consuming library) is in the same directory as the .dll libraries that it links to. So probably only Unix systems need these fixes.

But what about doing the audit/repair in cibuildwheel? It's been a while, so I don't remember whether auditwheel sets or resets rpaths.

Edit: Although now that I think about it, it would make sense to have the rpath setting either in the CMake or skbuild-core in order to support building from sdist

wojdyr commented 8 months ago

Yes, if shared libraries are in the same directory, it works fine. The answer that you linked would work for Linux. On Mac the equivalent of $ORIGIN would be something like @loader_path or @rpath; on Windows it should just work. The problem is if the files are in different directories. For example, if I have a binary program in bin/, python module in site-packages, and a shared library that's used by both. Or if external libraries are used (I use zlib or zlib-ng). Various workarounds are possible: using static libraries or placing everything in the same directory. Or using a relative rpath such as $ORIGIN/../lib on Linux (@loader_path/../lib on Mac) if the relative path can be predicted. cibuildwheel sorts it out using a configurable repair-wheel-command, which is probably the most robust approach.
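To make the "relative rpath if the relative path can be predicted" case concrete, here is a small sketch that derives the rpath entry from two install directories. The function name `relative_rpath` and the example directories are hypothetical; the resulting string is what one would pass to CMake's INSTALL_RPATH target property.

```python
import posixpath


def relative_rpath(binary_dir, lib_dir, platform):
    """Return an rpath entry pointing from binary_dir to lib_dir.

    Both directories must be expressed relative to the same install
    prefix (or both absolute). `platform` follows sys.platform naming.
    """
    # Path from the directory holding the binary to the library directory.
    rel = posixpath.relpath(lib_dir, start=binary_dir)
    # macOS uses @loader_path where Linux uses $ORIGIN.
    prefix = "@loader_path" if platform == "darwin" else "$ORIGIN"
    return prefix if rel == "." else f"{prefix}/{rel}"


print(relative_rpath("bin", "lib", "linux"))   # $ORIGIN/../lib
print(relative_rpath("bin", "lib", "darwin"))  # @loader_path/../lib
```

This only helps when the relative layout is known at build time, which is exactly the limitation described above.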

LecrisUT commented 8 months ago

> For example, if I have a binary program in bin/, python module in site-packages, and a shared library that's used by both.

Where would the pip install paths for bin and lib64 be in these cases? Outside of site-packages? Do we already have the ability to do that? Maybe for a venv it is not problematic, but when installing outside of one, I can imagine conflicts.

> The problem is if the files are in different directories

That can be handled within CMake by setting target-level INSTALL_RPATH + $ORIGIN with relative location. For the windows one, I guess you always have the .dll next to the binary whenever you install. I'm not sure how it works w.r.t. python execution though :shrug:

> Or if external libraries are used

Shouldn't these be loaded without RPATH? Probably it would be problematic on spack environments or equivalent where they have to have RPATH

> cibuildwheel sorts it out using configurable repair-wheel-command, which is probably most robust.

Yeah, but there are situations where you would also want to have it without rpath, e.g. if you want the python binding to first try to use system installed library before defaulting to bundled one. This could happen on an HPC where the system/module library is optimized but it does not provide bindings.
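For bindings that load their library at runtime through ctypes, the "system library first, bundled one as fallback" order could look roughly like the sketch below. This is an illustration under assumptions: the helper names are hypothetical, and compiled extension modules that link the library directly are resolved by the dynamic loader instead, so this does not apply to them.

```python
import ctypes
import ctypes.util


def library_candidates(name, bundled_path, find=ctypes.util.find_library):
    """Return library locations to try, system copy first.

    `find` is injectable so the lookup order can be checked without
    depending on what is installed on the machine.
    """
    system = find(name)
    candidates = [system] if system else []
    candidates.append(str(bundled_path))
    return candidates


def load_first(candidates):
    """Load the first candidate that the dynamic loader accepts."""
    last_error = None
    for candidate in candidates:
        try:
            return ctypes.CDLL(candidate)
        except OSError as exc:
            last_error = exc
    raise OSError(f"none of {candidates} could be loaded") from last_error
```

On an HPC system where a module provides an optimized build, `find` would locate that copy and the bundled one would never be opened.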

henryiii commented 8 months ago

You should use cibuildwheel (technically, the contents of repair-wheel-command) to make a redistributable wheel. While you can do some of it in CMake, the repair tools do other things too: if you need to bundle in external libs, you would still need auditwheel/delvewheel/delocate. They can also verify that you don't use disallowed libs; we shouldn't be messing with that or maintaining the list. Auditwheel also produces the correct manylinux tag, while we always produce an unredistributable linux tag. Etc. These are separate tools, each doing a specific job, and we shouldn't be trying to absorb their job. (Of course, if for a certain case it's shown that specific packages can avoid this, happy to link to the example and even help make it work better. The zig compiler, for example, might allow one to skip manylinux images and cross compile, and I would be happy to have examples showing that!)
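As a rough illustration of what a repair step amounts to, the sketch below maps each platform to the command line of the corresponding repair tool. The function name `repair_command` is hypothetical, and the commands are simplified compared with what cibuildwheel actually runs by default; treat it as a sketch, not as cibuildwheel's implementation.

```python
def repair_command(platform, wheel, dest_dir):
    """Return the argv list for repairing `wheel` into `dest_dir`.

    `platform` follows sys.platform naming ("linux", "darwin", "win32").
    """
    if platform.startswith("linux"):
        # Bundles external shared libraries and sets the manylinux tag.
        return ["auditwheel", "repair", "-w", dest_dir, wheel]
    if platform == "darwin":
        # Copies dependent dylibs into the wheel and rewrites install names.
        return ["delocate-wheel", "-w", dest_dir, "-v", wheel]
    if platform == "win32":
        # Copies dependent DLLs into the wheel.
        return ["delvewheel", "repair", "-w", dest_dir, wheel]
    raise ValueError(f"no repair tool known for platform {platform!r}")
```

The returned list could be passed to `subprocess.run(..., check=True)` after building the wheel, provided the relevant tool is installed.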

henryiii commented 8 months ago

> Yeah, but there are situations where you would also want to have it without rpath

Ah, yes, that too - scikit-build-core can't know if you are interested in redistribution, while the repair tools know you are preparing for redistribution.

LecrisUT commented 8 months ago

@henryiii, but what about outside wheel distribution, e.g. running pip install when there is no compatible wheel on PyPI? That will use the sdist and rebuild from scratch, right? In that case there should be some rpath correction when specified, but should it be within CMake or outside?