Closed mattip closed 4 years ago
I encountered the same page alignment issues in my project at https://github.com/intersystems/iknow. For me, forcing auditwheel to use patchelf --page-size=65536
was enough to fix the problem and resulted in binaries that I could load on RHEL7, which uses 64KiB pages. Currently, as a workaround, I am executing the following before invoking auditwheel in a manylinux2014_aarch64 container.
mv /usr/local/bin/patchelf /usr/local/bin/_patchelf
printf '#!/usr/bin/env bash\n\n_patchelf --page-size 65536 "$@"\n' >> /usr/local/bin/patchelf
chmod +x /usr/local/bin/patchelf
This workaround inserts the --page-size=65536
argument whenever auditwheel invokes patchelf
.
https://github.com/numpy/numpy/issues/16677#issuecomment-649809124 suggests that patchelf cannot fully fix this. (And from what I can tell, the --page-size
argument is not intended for attempting to change the page size). However this change should still be made so that patchelf does not break correctly-built binaries.
Well. https://github.com/pyca/bcrypt/commit/8d35d8acb82b4600a367e692d1231a705552900c is what we did for bcrypt. It sounds like auditwheel can't solve this, and maybe we need to throw this into the CFLAGS off the docker image?
we need to throw this into the CFLAGS of the docker image
This sounds right to me. So this should be an issue/PR against the manylinux2014 ARM image to add export CFLAGS="$CFLAGS -Wl,-z,max-page-size=0x10000"
, correct? Can we move this issue to that repo?
That seems correct to me, but I'm far from an expert.
On Wed, Aug 19, 2020 at 11:06 AM Matti Picus notifications@github.com wrote:
we need to throw this into the CFLAGS of the docker image
This sounds right to me. So this should be an issue/PR against the manylinux2014 ARM image to add export CFLAGS="$CFLAGS -Wl,-z,max-page-size=0x10000", correct? Can we move this issue to that repo?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pypa/auditwheel/issues/251#issuecomment-676483763, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAAGBHGFI6SFZVUABBUVOLSBPS7VANCNFSM4OHDNAMA .
-- All that is necessary for evil to succeed is for good people to do nothing.
auditwheel would ideally still check and complain if the alignment is wrong.
Throwing it in the CFLAGS of the docker image is a fragile fix since many projects may unknowingly clobber existing CFLAGS with their own. Python appends the flags it was built with to module compilation in addition to whatever CFLAGS are passed, so if we compile Python itself on the manylinux2014 images with -Wl,-z,max-page-size=0x10000
then it will reliably be passed for everything...unless I'm missing something 😄
I think https://github.com/pypa/manylinux/blob/manylinux2014/docker/build_scripts/build_utils.sh#L31 is roughly where that change would happen.
You could also potentially hack the compiler on the manylinux2014 images to enforce that (maybe by modifying the linker scripts?).
Rather than hacking the compiler, would it be possible to write a script to validate that produced binaries have the expected minimum alignment?
So there are (at least) two issues: creation and validation. Both are important.
Copy of https://github.com/numpy/numpy/issues/16677#issuecomment-676693957 comment:
So the issue is coming from Manylinux2014_aarch64 and its version of patchelf being slightly too old.
https://github.com/NixOS/patchelf/commit/0470d6921b5a3fe8e92e356c8e11d120dbbb06c0 This patch was submitted June 20th, and fixes the issues with AArch64. ManyLinux2014 is pulling the 0.11 tag release which is 3 weeks older. Updating ManyLinux2014 should fix this.
Originally posted by @geoffreyblake in https://github.com/numpy/numpy/issues/16677#issuecomment-676693957
and @mattip reply https://github.com/numpy/numpy/issues/16677#issuecomment-676735511 :
Thanks. Let's continue the discussion on the manylinux issue, since I think we have localized the problem to patchelf breaking binaries that are properly compiled with 64k pages, and patchelf is part of manylinux.
I've tested the latest numpy wheels for Python 3.6 and 3.8 on aarch64 machine and everything seems to work well.
LOAD sections also seem to be aligned correctly.
# readelf -l venv/lib/python3.8/site-packages/numpy/core/_multiarray_umath.cpython-38-aarch64-linux-gnu.so
Elf file type is DYN (Shared object file)
Entry point 0x298a0
There are 10 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000038ab08 0x000000000038ab08 R E 0x10000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_EH_FRAME 0x0000000000330830 0x0000000000330830 0x0000000000330830
0x000000000000bd44 0x000000000000bd44 R 0x4
LOAD 0x000000000038e600 0x000000000039e600 0x000000000039e600
0x000000000001e194 0x000000000003e7b8 RW 0x10000
TLS 0x000000000038e600 0x000000000039e600 0x000000000039e600
0x0000000000000000 0x0000000000000140 R 0x8
GNU_RELRO 0x000000000038e600 0x000000000039e600 0x000000000039e600
0x0000000000001a00 0x0000000000001a00 R 0x1
LOAD 0x0000000000400000 0x00000000003e0000 0x00000000003e0000
0x00000000000039e8 0x00000000000039e8 RW 0x10000
NOTE 0x00000000004039c0 0x00000000003e39c0 0x00000000003e39c0
0x0000000000000024 0x0000000000000024 R 0x4
DYNAMIC 0x0000000000410000 0x00000000003f0000 0x00000000003f0000
0x0000000000000220 0x0000000000000220 RW 0x8
LOAD 0x0000000000410000 0x00000000003f0000 0x00000000003f0000
0x0000000000002e70 0x0000000000002e70 RW 0x10000
I think we can close this now.
Closing. Please reopen or open a new issue as needed.
Aarch64 systems are configured with either 4k or 64k pages. A system with 64k pages cannot load a shared object that is built with 4k pages. It seems the manylinux arm64 docker images use 4k pages, while RHEL8/AArch6 use 64k pages, so manylinux2014 arm64 wheels build in the docker image cannot be used across systems. In numpy/numpy#16677, @vstinner suggested using the
patchelf --page-size=65536
argument to change the section alignment. Would this be enough to fix the problem, or do we need to change the docker images to add-Wl,-z,max-page-size=0x10000
by default to the compiler?