pypa / auditwheel

Auditing and relabeling cross-distribution Linux wheels.

musllinux mess potential solution #349

Open henryiii opened 2 years ago

henryiii commented 2 years ago

See https://github.com/pypa/cibuildwheel/issues/934, https://github.com/pypa/manylinux/pull/1225, and https://discuss.python.org/t/a-mess-with-soabi-tags-on-musllinux/11688: all musllinux wheels are broken. The version of Python used in musllinux_1_1 produces and reads the wrong binary names inside the wheels (ending in -gnu instead of -musl). Alpine 3.14 ships a fixed version of Python that produces and only reads the new -musl ending, so wheels produced on musllinux_1_1 do not work on Alpine 3.14 and later.

Current binaries produced:

psycopg_binary-3.0.4-cp310-cp310-musllinux_1_1_x86_64.whl
 - _psycopg.cpython-310-x86_64-linux-gnu.so

What it should be (Alpine 3.14+ has the patched CPython):

psycopg_binary-3.0.4-cp310-cp310-musllinux_1_1_x86_64.whl
 - _psycopg.cpython-310-x86_64-linux-musl.so

I think auditwheel could produce the following workaround:

psycopg_binary-3.0.4-cp310-cp310-musllinux_1_1_x86_64.whl
 - _psycopg.cpython-310-x86_64-linux-musl.so
 - _psycopg.cpython-310-x86_64-linux-gnu.so -> _psycopg.cpython-310-x86_64-linux-musl.so

The solution I'm proposing is: if a wheel is musllinux_1_1, then see if the extension is -gnu. If so, move it to -musl then add a symlink with the old name. For maximum compatibility, these could be split up so that adding a -gnu symlink always happens, regardless of whether this gets fixed in Alpine 3.12 and therefore musllinux_1_1 or not.
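The rename-plus-symlink step can be sketched on an already unpacked wheel tree. This is only an illustration of the proposal, not auditwheel code: the helper name add_musl_names and the suffix constants are hypothetical, and how the symlink would be stored when the wheel is repacked is left open.

```python
import os
from pathlib import Path

GNU_SUFFIX = "-linux-gnu.so"    # name written by the unpatched CPython
MUSL_SUFFIX = "-linux-musl.so"  # name expected by the patched CPython


def add_musl_names(unpacked_wheel_dir):
    """Rename *-linux-gnu.so files to *-linux-musl.so and leave a relative
    symlink under the old name. Returns the new paths (hypothetical helper)."""
    renamed = []
    # sorted() materializes the matches so the tree is not mutated mid-walk.
    for old in sorted(Path(unpacked_wheel_dir).rglob("*" + GNU_SUFFIX)):
        if old.is_symlink():
            continue  # already converted on a previous run
        new = old.with_name(old.name[: -len(GNU_SUFFIX)] + MUSL_SUFFIX)
        old.rename(new)
        # Relative target, so the link stays valid if the tree is moved.
        os.symlink(new.name, old)
        renamed.append(new)
    return renamed
```

Running it twice on the same tree is a no-op the second time, because the old names are symlinks by then.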

Thoughts?

dvarrazzo commented 2 years ago

For me the solution of providing a compatibility symlink is the most lightweight:

I agree with you that releasing a 1_2 standard for this issue is the wrong thing to do.

One issue might be that storing symlinks in zip archives is not supported by the zipfile module (https://bugs.python.org/issue18595, https://bugs.python.org/issue27318). It is supported by the zip -y command line (both on Alpine and Debian), but the zipfile module will unpack the link as a regular file, with the target path as its content. Which makes me think that maybe it's better after all to fix pip instead and make it create the link on install.
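To make the zip detail concrete: a symlink in a zip archive is an ordinary entry whose Unix mode bits say "symlink" and whose payload is the target path. The sketch below builds such an entry in memory and reads it back; the helper names write_symlink_entry and is_symlink_entry are made up for illustration, and the file contents are placeholders.

```python
import io
import stat
import zipfile


def write_symlink_entry(zf, link_name, target):
    """Store link_name as a Unix symlink entry pointing at target."""
    info = zipfile.ZipInfo(link_name)
    info.create_system = 3  # 3 = Unix, so the mode bits below are meaningful
    info.external_attr = (stat.S_IFLNK | 0o777) << 16  # mode lives in the high 16 bits
    zf.writestr(info, target)


def is_symlink_entry(info):
    """True if the entry's Unix mode bits mark it as a symlink."""
    return stat.S_ISLNK(info.external_attr >> 16)


MUSL_NAME = "_psycopg.cpython-310-x86_64-linux-musl.so"
GNU_NAME = "_psycopg.cpython-310-x86_64-linux-gnu.so"

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(MUSL_NAME, b"placeholder for the real extension module")
    write_symlink_entry(zf, GNU_NAME, MUSL_NAME)

with zipfile.ZipFile(buf) as zf:
    link = zf.getinfo(GNU_NAME)
    link_is_symlink = is_symlink_entry(link)
    # The stored payload is just the target path as text.
    link_payload = zf.read(link).decode()
```

An extractor that ignores the mode bits writes that payload out as a regular file, which is the failure described above; one that honors them would have to call os.symlink itself.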

Why zip was chosen as the format for the whl archive, instead of tar.gz, is beyond my understanding. It must be for compatibility with MS-DOS.

henryiii commented 2 years ago

Ouch. Adding a copy of the library would likely not affect the zip size much (duplicated data - at least ideally it wouldn't), but would double the unpacked size.

Can pip do processing like this on install? A wheel install is famously just a copy operation (though pip does process it for creating the pyc files).

Manylinux isn't a "build tool", it's just an environment - producing -musl could possibly be fixed, but not adding symlinks.

Now that SDists are always tar.gz, I don't know why wheels are zips, but yes, probably historical. :)

dvarrazzo commented 2 years ago

Can pip do processing like this on install?

Attacking the problem at pip level seems very simple:

henryiii commented 2 years ago

Hmm, pip install psycopg would then depend on the patch version of CPython? I think pip should still likely make the symlinks too, because the site packages are only gated by Python minor version, not patch version. So upgrading Python patch version to one that fixed this bug would cause the previous install to break.

dvarrazzo commented 2 years ago

I see. However, pip is probably still the best tool to create the symlink. The minimal amount of work would be to create a -musl link on condition that tag = musllinux_1_1, file = *-gnu.so. I don't know whether creating the link in the other direction is useful; it's cheap anyway.
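The condition can be written as a pure name mapping, independent of any installer. This is a sketch only: compat_link_name is a hypothetical helper, not part of pip, and it assumes the tag meant here is musllinux_1_1, the tag under discussion in this thread.

```python
import fnmatch


def compat_link_name(platform_tag, filename):
    """Return the -musl name to link to filename, or None if no link applies.

    platform_tag is the last component of the wheel tag,
    e.g. "musllinux_1_1_x86_64" (hypothetical helper, not a pip API).
    """
    if not platform_tag.startswith("musllinux_1_1_"):
        return None
    # fnmatchcase keeps the match case-sensitive on every platform.
    if not fnmatch.fnmatchcase(filename, "*-gnu.so"):
        return None
    return filename[: -len("-gnu.so")] + "-musl.so"


# A musllinux_1_1 extension built by the unpatched CPython gets a -musl twin.
example = compat_link_name(
    "musllinux_1_1_x86_64", "_psycopg.cpython-310-x86_64-linux-gnu.so"
)
```

Files that do not end in -gnu.so, such as abi3 modules, are left alone by this rule.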

dvarrazzo commented 2 years ago

Auditwheel seems to have no power to solve the problem, unless it can inject a post-install script for pip to handle? If not it would be better to close this and think about the pip solution (https://github.com/pypa/pip/issues/10678).

henryiii commented 2 years ago

There is no "post-install" for wheels, for security and reproducibility reasons. (There's a massive thread on this somewhere).

Auditwheel could still always correct this to -musl. Moving to the correct solution then working on providing a way to keep old software working would be best? However, this can also be fixed in manylinux. I'd wait and let a few experts weigh in.

mayeut commented 2 years ago

From the latest updates in pypa/cibuildwheel#934

  • ALPINE: Add a patch on top of the current patch to make CPython look for -gnu on top of -musl for Alpine 3.15 and 3.14. Reverting the patch would break every Alpine wheel previously locally compiled (like NumPy) and would require rebuilding all shipped packages that depend on Python.
  • CPYTHON: Take the existing patch (bpo-43112: detect musl as a separate SOABI, python/cpython#24502) targeting upstream CPython 3.11 and change search to include abi3-gnu on musl after looking for abi3-musl. The ability to install both binaries into a single folder would be a new "feature" of CPython 3.11.
  • AUDITWHEEL: Optionally this could be checked and normalized by auditwheel (like changing -musl to -gnu on 3.9) if desired. ABI3 wheels targeting <3.11 could be normalized to -gnu.

Now that Alpine is patched, and while waiting for CPython 3.11, I think the most pressing issue for auditwheel is to enforce the -gnu SOABI and simply fail on Python 3.11+ as long as the upstream patch is not accepted (as a reminder that something will have to be done, even though we don't know what yet).

abi3 wheels were not impacted by the patch in Alpine, and with python/cpython#24502 there's still no way to tell whether an abi3 module was meant to run on musl or glibc; I'll comment on that over there. abi3 modules are only searched for as {module}.abi3.so or {module}.so.

All modules are ultimately searched for as {module}.so, and that might have been a workaround before Alpine's patch. However, I'm now wondering if this should be checked by auditwheel to respect the "play well with others" clause in both manylinux & musllinux.
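The search order described here can be inspected on any given interpreter. The sketch below uses importlib.machinery.EXTENSION_SUFFIXES and the EXT_SUFFIX config variable, both part of the standard library; candidate_filenames is a hypothetical helper name, and the actual values depend on the interpreter build and any distribution patches.

```python
import importlib.machinery
import sysconfig


def candidate_filenames(module):
    """File names the running interpreter would try for an extension
    module, in the order given by EXTENSION_SUFFIXES."""
    return [module + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES]


# Suffix this interpreter uses when it builds an extension module.
print("built as:  ", sysconfig.get_config_var("EXT_SUFFIX"))
# Suffixes this interpreter accepts when it imports one.
print("looked for:", candidate_filenames("_psycopg"))
```

Comparing the two lines on a musllinux_1_1 image and on a patched Alpine shows directly whether the names a wheel ships are among those the target interpreter looks for.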

LecrisUT commented 1 month ago

I've encountered a possibly similar issue in https://github.com/Blosc/python-blosc2/actions/runs/10525809286/job/29165344152

      + sh -c 'auditwheel repair -w /tmp/cibuildwheel/repaired_wheel /tmp/cibuildwheel/built_wheel/blosc2-3.0.0b2.dev0-cp310-cp310-linux_x86_64.whl'
  INFO:auditwheel.main_repair:Repairing blosc2-3.0.0b2.dev0-cp310-cp310-linux_x86_64.whl
  Traceback (most recent call last):
    File "/usr/local/bin/auditwheel", line 8, in <module>
      sys.exit(main())
    File "/opt/_internal/pipx/venvs/auditwheel/lib/python3.10/site-packages/auditwheel/main.py", line 54, in main
      rval = args.func(args, p)
    File "/opt/_internal/pipx/venvs/auditwheel/lib/python3.10/site-packages/auditwheel/main_repair.py", line 173, in execute
      out_wheel = repair_wheel(
    File "/opt/_internal/pipx/venvs/auditwheel/lib/python3.10/site-packages/auditwheel/repair.py", line 78, in repair_wheel
      raise ValueError(
  ValueError: Cannot repair wheel, because required library "libc.so.6" could not be located

I thought that musllinux didn't have glibc, but it still has a libc library? I haven't tested if set(CMAKE_C_EXTENSIONS OFF) is relevant there as well.
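For context on the name in that error: libc.so.6 is the soname of glibc's C library, whereas musl's C library uses sonames of the form libc.musl-x86_64.so.1. A requirement on libc.so.6 therefore suggests that something in the wheel was linked against glibc. A minimal sketch of that distinction, with libc_flavor as a hypothetical helper and the pattern for the musl architecture suffix being an assumption:

```python
import re

# Assumed shape of musl sonames, e.g. libc.musl-x86_64.so.1
MUSL_SONAME = re.compile(r"libc\.musl-[A-Za-z0-9_]+\.so\.1")


def libc_flavor(soname):
    """Classify a required library name as 'glibc', 'musl', or None."""
    if soname == "libc.so.6":
        return "glibc"
    if MUSL_SONAME.fullmatch(soname):
        return "musl"
    return None  # not a C library soname this sketch recognizes


print(libc_flavor("libc.so.6"))
print(libc_flavor("libc.musl-x86_64.so.1"))
```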