python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org
Other
12.32k stars 2.23k forks source link

Isolate macOS wheel builds from Homebrew #8497

Closed freakboy3742 closed 4 days ago

freakboy3742 commented 4 weeks ago

The existing macOS cibuildwheel configuration relies on Homebrew to provide some dependencies. This clearly works in practice, but there are some issues associated with this choice.

Firstly, it is destructive when used on a local build machine. The wheel dependency script invokes brew remove --ignore-dependencies (which can leave the host machine in a broken, and potentially difficult to restore state); and requires the use of sudo to make changes in the /usr/local tree. This means it is difficult to test cibuildwheel changes locally, or to recommend using cibuildwheel as a local build solution.

Secondly, Homebrew builds could be incompatible with Pillow builds. The MACOSX_DEPLOYMENT_TARGET for Homebrew dependencies is fixed by the Homebrew build chain, rather than the Pillow build tools, so a change in Pillow's configuration may not be satisfied by Homebrew's build configuration. This isn't an issue at present, but it could easily become one in future.

Related to this, there is also the potential that a change in a Homebrew recipe could lead to the inadvertent introduction of additional binary libraries into the delocated macOS binaries. This is especially problematic with the fribidi, raqm and freetype dependedencies which are included as "vendor" sources.

However, the real motivation for proposing this change is that it is a precursor for PEP 730-compliant iOS builds. If Homebrew is on the build path when compiling iOS binaries, compiler tooling will often find an ARM64 Homebrew binary and attempt link it into an iOS library - which then (predictably) breaks. So - Homebrew isolation is essential for iOS compilation. Since the build processes for iOS and macOS are very similar, and Homebrew isolation was necessary for iOS, adding Homebrew isolation for macOS will simplify any future iOS patch. It also provides a way to submit iOS support in a series of smaller pieces, rather than a single "monster" patch. Plus, it provides all the side benefits of isolation described above.

An overview of the changes:

freakboy3742 commented 4 weeks ago

The manylinux failures look like they'll be fixed by this PR to multi build.

The x86_64 macOS failures are probably a leakage from /usr/local that I haven't accounted for. I'm looking into that one now.

I'm not sure how to interpret the coverage drop, though...

freakboy3742 commented 4 weeks ago

@radarhere I've updated the pin for multibuild to include the (just merged) PR reverting the libjpeg-turbo CMAKE_INSTALL_LIBDIR change; and I've addressed some additional locations where Homebrew can leak in on x86_64.

nulano commented 4 weeks ago

Related to this, there is also the potential that a change in a Homebrew recipe could lead to the inadvertent introduction of additional binary libraries into the delocated macOS binaries.

This has already happened many times in the past, so thanks for working on this.

Adds builds of fribidi and raqm, rather than using the Homebrew versions as "vendored" versions. The setup.py configuration passed in by cibuildwheel has been modified from the default on other platforms to not use the vendor version.

I'm not sure if I understand what you mean here. We need to use a vendored version of raqm in wheels for license reasons. When using raqm=vendor, we build our own version of raqm from https://github.com/python-pillow/Pillow/tree/main/src/thirdparty/raqm. Similarly, using fribidi=vendor does not actually build fribidi, but instead custom code that can load fribidi at runtime, see https://github.com/python-pillow/Pillow/tree/main/src/thirdparty/fribidi-shim. The Homebrew version of fribidi is only used for testing the built wheels.

freakboy3742 commented 3 weeks ago

Adds builds of fribidi and raqm, rather than using the Homebrew versions as "vendored" versions. The setup.py configuration passed in by cibuildwheel has been modified from the default on other platforms to not use the vendor version.

I'm not sure if I understand what you mean here. We need to use a vendored version of raqm in wheels for license reasons. When using raqm=vendor, we build our own version of raqm from https://github.com/python-pillow/Pillow/tree/main/src/thirdparty/raqm. Similarly, using fribidi=vendor does not actually build fribidi, but instead custom code that can load fribidi at runtime, see https://github.com/python-pillow/Pillow/tree/main/src/thirdparty/fribidi-shim. The Homebrew version of fribidi is only used for testing the built wheels.

Thanks - that's some helpful context. I was seeing the missing fribidi detection during the build and interpreting that as something that was part of the move away from homebrew; but I see now that Pillow is loading from Homebrew by design in this instance. That also might explain the coverage drop (as supplying fribidi means any runtime loading code won't be in use). I'll take another swing at that part (and resolve the linux build issues as well).

Out of interest - what's the license issue with raqm? The source code indicates It looks to be MIT licensed... did it used to be GPL (or something else problematic?)

hugovk commented 3 weeks ago

Out of interest - what's the license issue with raqm? The source code indicates It looks to be MIT licensed... did it used to be GPL (or something else problematic?)

From https://github.com/python-pillow/Pillow/pull/2753 in 2017:

We can't currently distribute a binary of Pillow with support for raqm due to license incompatibility, because while libraqm is MIT licensed, it links to GPL libraries.

FriBiDi is LGPL.

Since then, Raqm added support for Apache-licensed SheenBiDi in 2021: https://github.com/HOST-Oman/libraqm/issues/138, https://github.com/HOST-Oman/libraqm/pull/139.

Perhaps we could switch.

freakboy3742 commented 3 weeks ago

@radarhere After a lot of false starts (apologies for the noise, BTW...), I think I've finally got it all sorted.

  1. macOS is isolated from Homebrew on both x86_64 and ARM64
  2. The shim fribidi builds is used in the wheel, with Homebrew providing the fribidi binary on macOS in the test environment
  3. Linux builds are now correctly using BUILD_PREFIX, and don't require the lib64 copying step.
freakboy3742 commented 3 weeks ago

I'm not sure I understand how coverage can go down when you don't touch any code or tests... and the codecov report isn't really helping explain what is going on.

hugovk commented 3 weeks ago

Hmm, let's see... (Short version: I think it's some problem with Codecov and not this PR.)

Codecov details Looking at the coverage report at https://github.com/python-pillow/Pillow/pull/8497/checks?check_run_id=32242849880 it says "90.36% (-3.98%) compared to 2d1d801" which is already suspicious: our coverage is around 90%, not 94%. That 2d1d801 is the last commit on `main`: "Update CHANGES.rst [ci skip]" That "[ci skip]" did in fact skip the CI, but I see @radarhere later triggered a manual build of that commit at https://github.com/python-pillow/Pillow/actions/runs/11573601676 for only Windows, presumably checking the PyPy failure was also happening on `main` and not just on (another) failing PR. Anyway, this Windows-only CI resulted in a skewed 94% coverage report at https://app.codecov.io/github/python-pillow/Pillow/commit/2d1d801ec032bfb760bede88da6dcaab5ff3c75e (see the flags on the right only show Windows and not the others). All of which suggests this is nothing to worry about. Having said that, these jobs ran about 9 hours ago and @freakboy3742's comment is about 14 hours ago, so this digging doesn't reflect the situation 14 hours ago! I will say that we've been seeing drops in other PRs where it doesn't quite make sense. I've not had time to investigate, I wonder if we're not setting tokens properly for some builds after Codecov changed their v4 action to be more strict.
freakboy3742 commented 3 weeks ago

@hugovk /me casually drops this article into the chat :-)

FWIW - BeeWare ditched Codecov a couple of years ago specifically because of weird inconsistencies like this one; and although the output of a raw coverage report is marginally less pretty (i.e., no annotated code listings), you can get the same report locally that you get in CI, and you get none of the weird outages and reporting issues that seem to plague Coverage. YMMV; but if there's interest, it's a relatively straightforward switch, especially if coverage reporting is already in place.

freakboy3742 commented 3 weeks ago

The (seemingly inconsistent) Windows/pypy3.10 test failure also fascinates me... is this the same cache issue mentioned by #8513?

hugovk commented 3 weeks ago

The (seemingly inconsistent) Windows/pypy3.10 test failure also fascinates me... is this the same cache issue mentioned by #8513?

Yep, we added a new licence (well, old: MIT CMU) to OSI and Trove classifiers, so that test needs the new trove-classifiers. https://github.com/python-pillow/Pillow/issues/7942 has the details.

I've opened https://github.com/python-pillow/Pillow/pull/8514 as an alternative to https://github.com/python-pillow/Pillow/pull/8513.

nulano commented 2 weeks ago
  • The shim fribidi builds is used in the wheel, with Homebrew providing the fribidi binary on macOS in the test environment

Thanks, that looks better.


There still seems to be something wrong with freetype on arm64, https://github.com/python-pillow/Pillow/actions/runs/11604707482/job/32314276332?pr=8497#step:5:14308

  SKIPPED [2] Tests/test_imagefont.py:1081: FreeType compiled without brotli or WOFF2 support

But it was built with brotli, https://github.com/python-pillow/Pillow/actions/runs/11604707482/job/32314276332?pr=8497#step:5:7768:

    brotli:        yes (pkg-config)

We need to support this for #6554.


We could also consider compiling libwebp before libtiff so that WEBP-compressed tiff files can be read (AFAIK webp only looks for libtiff so that it can compile companion tools, but does not actually use it in the library - as I mentioned in #6562). But we don't need to do it now since the test was already being skipped on macOS and Linux before this PR. https://github.com/python-pillow/Pillow/actions/runs/11604707482/job/32314276332?pr=8497#step:5:15489:

  SKIPPED [1] Tests/test_file_libtiff.py:905: WEBP compression support is not configured
freakboy3742 commented 2 weeks ago

There still seems to be something wrong with freetype on arm64, https://github.com/python-pillow/Pillow/actions/runs/11604707482/job/32314276332?pr=8497#step:5:14308

Weird. My immediate guess is that it might have something to do with the fact that a vendored version of fribidi is being used, which might not be triggering brotli support (or at least not triggering the test). I'll take a look and see what is going on.

We could also consider compiling libwebp before libtiff so that WEBP-compressed tiff files can be read (AFAIK webp only looks for libtiff so that it can compile companion tools, but does not actually use it in the library - as I mentioned in #6562). But we don't need to do it now since the test was already being skipped on macOS and Linux before this PR. https://github.com/python-pillow/Pillow/actions/runs/11604707482/job/32314276332?pr=8497#step:5:15489:

I saw that, and ultimately gave up trying to fix it - firstly because I was trying to limit the scope of the changes, but also because I'm not sure it can be fixed. libwebp has a dependency on tiff... but then lcms2 needs tiff; and openjpeg needs libpng, tiff, and lcms2. So - I'm not sure this can be fixed (or, at least, it can't be fixed without some sort of 2-pass compile and the use of dynamic linking, which iOS can't use anyway).

nulano commented 2 weeks ago

Weird. My immediate guess is that it might have something to do with the fact that a vendored version of fribidi is being used, which might not be triggering brotli support (or at least not triggering the test). I'll take a look and see what is going on.

The test skip is triggered by a specific error raised by the freetype library which itself does not use fribidi.

libwebp has a dependency on tiff... but then lcms2 needs tiff; and openjpeg needs libpng, tiff, and lcms2

AFAIK all of these dependencies are just for the binary utilities that we don't need so we can disable them with relevant compile flags; the libraries themselves do not have dependencies between them (except for libtiff optionally using libwebp). But I agree that it might be best to leave that for a separate PR.

freakboy3742 commented 2 weeks ago

@nulano I've found the culprit - Homebrew's copy of freetype was leaking into the test environment.

By using DYLD_LIBRARY_PATH pointed at Homebrew's lib foldler, all of Homebrew's libraries become available, and take take precedence over the ones provided by the wheel. As a result, Homebrew's copy of freetype is used, which doesn't include Brotli support. I presume this wasn't an issue previously because an explicit DYLD_LIBRARY_PATH comes earlier in library resolution precedence; when Homebrew was being used implicitly, Homebrew's library resolve after Pillow's bundled dependencies.

This can be fixed by pointing the DYLD_LIBRARY_PATH at the cellar version of the vendored fribidi library (which is the only library we actually need from Homebrew). I've pushed that update (update: I'm looking at the linting error now...)

The update includes 2 other small changes:

  1. The location of the "homebrew isolated" macOS build is slightly altered - the extra darwin folder became necessary when I started working on the iOS build, and the build folders for macOS started colliding with iOS.
  2. I've added some safety checks against some uses that are valid cibuildwheel invocations, but will break if you try to use them. Specifically, CIBW_ARCHS="arm64 x86_64", and not specifying CIBW_ARCHS at all, are both valid cibuildwheel setup, but Pillow doesn't allow it because the before_all step requires a single explicit architecture. This bit me a couple of times before I worked out what was happening; I figured it couldn't hurt to raise an explicit error when we know it won't work.
nulano commented 2 weeks ago

Thanks, looks good to me now.

radarhere commented 6 days ago

Thanks for your patience. This is getting close - aside from my latest comments, I'm happy with the current state.

freakboy3742 commented 4 days ago

@radarhere No problem at all - thanks for all the reviews. I've just pushed an update that I think addresses all your comments.

aclark4life commented 4 days ago

Thanks @freakboy3742 @radarhere @nulano !

hugovk commented 4 days ago

Thank you!