gohugoio / hugoreleaser

Build, archive and release.
Apache License 2.0

A question about manylinux compliance for Hugo binaries #41

Open agriyakhetarpal opened 10 months ago

agriyakhetarpal commented 10 months ago

@bep

Hi! Thanks for setting this tool up. I am trying to build a Python (pip-installable) binary distribution for Hugo at https://github.com/agriyakhetarpal/hugo-python-distributions/ that embeds binaries for the extended version of Hugo for various platforms and architectures, which will be a subset of Hugo's own platform set owing to the lack of tooling for platform tags for BSD-like platforms for Python.

I am using cibuildwheel to build and package these wheels in CI and plan to upload these to PyPI (edit: have done so for Hugo 0.121.1 and 0.121.2), so that one can run the command pip install python-hugo and run commands like hugo version and hugo server --disableFastRender just like they would normally. I know that Python is not the first choice for most packagers and users of Hugo, but with this distribution I wanted to learn if it's possible to embed Go binaries in Python packages, and I have succeeded so far :)
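For readers unfamiliar with the idea, a wrapper of this kind can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actual runner in hugo-python-distributions: the `binaries` directory name and the `bundled_binary` and `main` helpers are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def bundled_binary(package_dir, name="hugo"):
    """Return the path where a bundled executable is assumed to live.

    The "binaries" subdirectory is a hypothetical layout chosen for this
    sketch; a real package may store the executable elsewhere.
    """
    suffix = ".exe" if sys.platform == "win32" else ""
    return Path(package_dir) / "binaries" / f"{name}{suffix}"


def main(argv=None):
    """Forward command-line arguments to the bundled executable."""
    args = sys.argv[1:] if argv is None else list(argv)
    binary = bundled_binary(Path(__file__).parent)
    # Run the Go binary as a child process and propagate its exit code.
    completed = subprocess.run([str(binary), *args])
    return completed.returncode
```

A console-script entry point named `hugo` that points at `main` would then make `hugo version` behave as if the binary had been called directly.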

While building Linux amd64 wheels (and arm64/aarch64 wheels under QEMU emulation), I am able to obtain the correct specification for the wheel, but the platform tags for the wheel are constrained by manylinux policies (see below). I was wondering if some information can be shared about the version of GLIBC being used to build Hugo from source. GLIBC is backward compatible but not forward compatible, so I could explore using an older manylinux2014 base Docker image with an older GCC to compile. I am also not sure whether this is coming from the Go toolchain (I am using 1.21.5, since the minimum version was bumped to 1.20 recently).

This is what I get when I run auditwheel, a tool for Linux wheel repair for Python wheels, locally:

Output

```
python_hugo-0.120.4-cp310-cp310-linux_x86_64.whl is consistent with the following platform tag: "linux_aarch64".

The wheel references external versioned symbols in these system-provided shared libraries: libgcc_s.so.1 with versions {'GCC_3.0'}, libresolv.so.2 with versions {'GLIBC_2.2.5'}, libdl.so.2 with versions {'GLIBC_2.2.5'}, libm.so.6 with versions {'GLIBC_2.2.5'}, libc.so.6 with versions {'GLIBC_2.4', 'GLIBC_2.2.5', 'GLIBC_2.14'}, libpthread.so.0 with versions {'GLIBC_2.2.5', 'GLIBC_2.3.2'}, libstdc++.so.6 with versions {'CXXABI_1.3', 'GLIBCXX_3.4.18', 'CXXABI_1.3.5', 'GLIBCXX_3.4.11', 'GLIBCXX_3.4.9', 'GLIBCXX_3.4'}

This constrains the platform tag to "linux_aarch64". In order to achieve a more compatible tag, you would need to recompile a new wheel from source on a system with earlier versions of these libraries, such as a recent manylinux image.
```

(This is coming from the Hugo unix executable file in the wheel).
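To make the output above easier to read, here is a minimal sketch that reduces the listed symbol versions to the newest one required per family (GLIBC, GLIBCXX, and so on). It is not how auditwheel itself is implemented; the `REFERENCED` dictionary is simply copied by hand from the output, and `highest_versions` is a hypothetical helper.

```python
# Versioned symbols copied by hand from the auditwheel output above.
REFERENCED = {
    "libgcc_s.so.1": {"GCC_3.0"},
    "libresolv.so.2": {"GLIBC_2.2.5"},
    "libdl.so.2": {"GLIBC_2.2.5"},
    "libm.so.6": {"GLIBC_2.2.5"},
    "libc.so.6": {"GLIBC_2.4", "GLIBC_2.2.5", "GLIBC_2.14"},
    "libpthread.so.0": {"GLIBC_2.2.5", "GLIBC_2.3.2"},
    "libstdc++.so.6": {
        "CXXABI_1.3", "GLIBCXX_3.4.18", "CXXABI_1.3.5",
        "GLIBCXX_3.4.11", "GLIBCXX_3.4.9", "GLIBCXX_3.4",
    },
}


def highest_versions(referenced):
    """Return the newest version referenced per symbol family."""
    newest = {}
    for versions in referenced.values():
        for tag in versions:
            # "GLIBC_2.2.5" -> family "GLIBC", number "2.2.5"
            family, _, number = tag.rpartition("_")
            # Compare as integer tuples so that 2.14 sorts above 2.4.
            parsed = tuple(int(part) for part in number.split("."))
            if parsed > newest.get(family, ()):
                newest[family] = parsed
    return {family: ".".join(map(str, v)) for family, v in newest.items()}


result = highest_versions(REFERENCED)
print(result)
```

The newest GLIBC version in the result is the number to compare against the GLIBC shipped in a given manylinux image.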

On a macOS arm64 machine, this is what I get from otool when I run it on the executable:

Output

```
otool -L hugo-0.120.4-darwin-arm64
hugo-0.120.4-darwin-arm64:
	/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2048.1.255)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 61040.1.3)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1600.151.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)
```

which doesn't help a lot (note that I am mangling the executable name to conform to the platform and architecture rather than lipo-ing it). I tried on a Docker image and received these symbols:

Output from ldd

```
ldd hugo-0.120.4-linux-arm64
	linux-vdso.so.1 (0x0000ffff84bc2000)
	libresolv.so.2 => /lib/aarch64-linux-gnu/libresolv.so.2 (0x0000ffff84b6b000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff84b3a000)
	libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff84962000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff848b7000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff848a3000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff8487f000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8470b000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff84b92000)
```

I can manually modify the platform tag in the filename(s) to claim such compliance, given that the Hugo binary and its commands work without any issues. However, this approach carries risks: it makes the wheels non-reproducible and potentially non-compliant on older Linux platforms, due to issues such as old system-provided libraries, missing or unresolved symbols, and the like. If the newer versions of the Hugo binaries are only supposed to work on modern systems, then there shouldn't be a problem with manually overwriting these tags, but I thought I should ask around first. Another option I am considering is to use a very old Docker image that comes with a very old GLIBC, but building Hugo from source might break in that case. I'm not as familiar with Golang and building packages with it as I am with Python tooling, so this is a new area for me, and I would appreciate a response :))


(P.S. This is a problem on just the Linux wheels I have been trying to compile. I have packaging infrastructure for Windows amd64 wheels and macOS arm64 + amd64 wheels set up—with universal2 wheels too for the latter—I tested the Hugo binary for 0.120.4 manually and it works on Intel MacBooks as old as those released in 2015!)

agriyakhetarpal commented 10 months ago

I hope I have posted this in the right forum for release-related discussion. If it belongs somewhere more appropriate or visible, please feel free to close this issue and direct me to a different location for posting this query. Thanks!

bep commented 9 months ago

@agriyakhetarpal sorry for the late reply, I have too many repositories on my watch list.

What you ask about is a little bit technical even for me. But if I understand correctly, the core of your problem is GLIBC version compatibility.

I have struggled with this myself, with build hosts (Vercel, Netlify) that have relatively old Linux build setups.

There are some insights in this thread: https://github.com/gohugoio/hugo/issues/11414#issuecomment-1721455798

Which led me to maintain my own Docker image for the latest Go version:

https://github.com/bep/dockerfiles/blob/master/ci-go/Dockerfile

It uses buster-scm, which is compatible with Amazon Linux 2.

agriyakhetarpal commented 9 months ago

Hello @bep, I thank you for your response and I do not mind the delay in receiving it. Thanks for the thread you linked as well, could not have found it myself with much ease.

I think Vercel and Netlify have Linux builds that, despite being older, are still newer than what the manylinux2014 specification defines. So one solution to my compliance issue would be to drop manylinux2014 and choose something newer like manylinux_2_28 for my wheel tags. This is something I will consider in the longer term, since the current implementation does not seem to have issues, and I don't think anyone would be running Hugo on a Linux distribution with a too-old GLIBC (Amazon Linux is more recent than manylinux2014, IIUC). Unfortunately, I cannot use the Dockerfile you provided, since it won't contain the versions of Python I use for building the wheels with cibuildwheel. I do install Golang with a small helper bash script and keep its version in sync with the version upstream Hugo is using (the CHANGELOG and the release notes help with this, thanks for keeping them apposite!).

In the time between opening this issue and your response, I was able to retrieve the hugo name on PyPI, add more reliable build procedures and various improvements to the package itself, and ensure Hugo's Apache license compliance for all build artifacts (both source and binary distributions). I published these pip-installable wheels for Hugo 0.121.2 and the recently released Hugo 0.122.0. I am continuing to build and release wheels for 0.123.X releases (the latest up to 0.123.3, at the time of editing).

I am open to the inclusion of this currently unofficial PyPI package hugo as another suitable and official distribution channel for Hugo via pip, alongside other package managers such as Chocolatey and Homebrew, and I would keep maintaining it after every release. I am happy to liaise with you and other Hugo developers in this regard. If this is a possibility that interests you, please let me know: I will be willing to improve the packaging infrastructure, the documentation I wrote for it, the executable runner, and other repository areas of interest as per your requirements. This will not only enhance the visibility of the package I wrote but also provide Hugo users with an easy-to-install, isolated build of it through popular use cases in Python tooling.


Here is a link to the GitHub repository, whose README has documentation on how isolation works: https://github.com/agriyakhetarpal/hugo-python-distributions/ – essentially, it comes from virtual environments in Python and tools like pipx that can install CLI binaries from the web without the need to keep them permanently (such as pipx run hugo==0.123.2 new site mysite with Hugo 0.123.2, or with pipx run hugo==0.122.0 server for Hugo 0.122.0, or virtually any command and version starting with 0.121.2).

agriyakhetarpal commented 7 months ago

Update: I have started to use the manylinux_2_28 Docker images, which correspond to GLIBC version 2.28, for fixing the Hugo wheels, since that is what the Python community is going to migrate to anyway after June 2024, when manylinux2014/manylinux_2_17 will get deprecated and reach EOL status.

Doing so revealed that an additional manylinux_2_24 wheel tag is applicable, i.e., the Hugo binaries have been linked against GLIBC 2.24 or earlier on the latest Go version. This has completely fixed the issue at hand here, and I can now remove any workarounds that I was previously using, while also ensuring proper, robust compliance with the most recent (and older) Linux distributions. In the meantime, I also vastly improved the packaging infrastructure to establish a cleaner install via pip and pipx through both source and binaries. Thank you so much for the help you offered here, @bep! I am more than happy to keep publishing and maintaining these wheels, and I am still open to including them as an official distribution channel, should you and the other Hugo developers be open to it, as discussed in the comment above.
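For anyone landing here later, the relationship between these tags and GLIBC versions can be sketched as follows. Under PEP 600, a `manylinux_X_Y` tag means the wheel expects GLIBC X.Y or newer, and the legacy names are aliases (manylinux1 for 2.5, manylinux2010 for 2.12, manylinux2014 for 2.17). The helper names below are hypothetical, and the check ignores architecture and everything else an installer considers.

```python
import re

# Legacy manylinux names and the GLIBC versions they alias (PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def required_glibc(platform_tag):
    """Return the minimum GLIBC (major, minor) implied by a manylinux tag."""
    for alias, version in LEGACY_ALIASES.items():
        if platform_tag.startswith(alias + "_"):
            return version
    match = re.match(r"manylinux_(\d+)_(\d+)_", platform_tag)
    if match is None:
        raise ValueError(f"not a manylinux platform tag: {platform_tag!r}")
    return int(match.group(1)), int(match.group(2))


def glibc_is_new_enough(platform_tag, system_glibc):
    """Check only the GLIBC part of compatibility for a given system."""
    return tuple(system_glibc) >= required_glibc(platform_tag)


print(required_glibc("manylinux_2_24_x86_64"))
print(glibc_is_new_enough("manylinux_2_24_x86_64", (2, 28)))
```

With this reading, a wheel tagged manylinux_2_24 covers systems with GLIBC 2.24 and later, which is why building in a manylinux_2_28 image can still yield the older, more widely compatible tag.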

Please feel free to close this issue, since my initial query has been answered and resolved.