python / cpython

The Python programming language
https://www.python.org
Other
63.36k stars 30.34k forks source link

Unvendor `libmpdec` sources #115119

Open zware opened 9 months ago

zware commented 9 months ago

To facilitate cleaner updates of the externally-maintained libmpdec required by the _decimal module, we should migrate away from the bundled copy in Modules/_decimal/libmpdec and towards an "external" in cpython-source-deps for Windows and --with-system-libmpdec by default elsewhere.

My tentative plan is as follows:

Linked PRs

erlend-aasland commented 9 months ago

[...] with a fallback to the bundled copy (maybe with a warning?)

+1 for an autoconf warning.

ned-deily commented 9 months ago

Adding an external libmpdec to the macOS installer build shouldn't be a big issue. Assign it to me if you proceed with this.

erlend-aasland commented 9 months ago

FTR: the source deps repo is now updated.

erlend-aasland commented 9 months ago

@ned-deily: for build-installer.py, is there more to do than adding a configuration for mpdecimal? I've used the following configure options: ['--disable-cxx', 'MACHINE=universal']

ned-deily commented 9 months ago

Likely yes, I'll take care of it. Thanks.

ned-deily commented 8 months ago

I took a quick look at this and be aware that this change proposal will likely impact everyone building Python 3.13 on macOS. macOS does not ship a copy of libmpdec so, if the bundled version is removed, anyone building Python on macOS will now need to provide a local copy, either building from source or using a version from one of the widely-used third-party package managers, like Homebrew or MacPorts. Both currently have mpdecimal packages at the 4.0.0 level. (Recall that, unlike Windows builds, we do not currently provide non-vendored third-party library binaries for macOS users, other than those built for and linked into the Pythons provided by python.org installers for macOS.) In a quick test, it appeared that with ./configure --with-system-libmpdec today, pkg-config info for libmpdec was not being used. If so, that should be corrected. And, in any case, the devguide instructions for installing dependencies will need to be updated to cover libmpdec. We should add all this and get this working before moving on to changing the installer.

erlend-aasland commented 8 months ago

In a quick test, it appeared that with ./configure --with-system-libmpdec today, pkg-config info for libmpdec was not being used. If so, that should be corrected.

Last time I checked, libmpdec did not provide pkgconfig files:

$ ls $(brew --prefix libmpdec)/lib/pkgconfig
ls: /opt/homebrew/opt/mpdecimal/lib/pkgconfig: No such file or directory
$ cd mpdecimal-2.5.1; find . | grep pkg
$
ned-deily commented 8 months ago

With the current 4.0.0 version, it appears both Homebrew and MacPorts ship them:

$ pkg-config libmpdec --cflags
-I/opt/homebrew/Cellar/mpdecimal/4.0.0/include
$ pkg-config libmpdec --libs
-L/opt/homebrew/Cellar/mpdecimal/4.0.0/lib -lmpdec -lm

...

$ pkg-config libmpdec --libs
-L/opt/macports/lib -lmpdec -lm
erlend-aasland commented 8 months ago

Yes, the 4.0.0 version does indeed provide pkg-config configuration files. I'll work on an Autoconf patch.

erlend-aasland commented 6 months ago

@zware are we still aiming for 3.13 for the first batch of items?

erlend-aasland commented 6 months ago

In a quick test, it appeared that with ./configure --with-system-libmpdec today, pkg-config info for libmpdec was not being used. If so, that should be corrected.

Resolved by:

zware commented 6 months ago

@zware are we still aiming for 3.13 for the first batch of items?

I would like to, but I don't know when I'm going to have a chance to get back to this.

ned-deily commented 6 months ago

There is an issue with unvendoring on macOS that will affect some users (and this may be something you ran into when testing, @erlend-aasland). For some reason, mpdecimal, for both 2.5.1 and 4.0.0, chooses to define @rpath-prefixed names for the macOS dynamic library install name for the shared libraries it builds. This can cause builds on macOS to fail in some cases unless additional action is taken.

By default, the libmpdec build produces both shared and a static libraries. On macOS, the shared library install name looks like this:

$ otool -L libmpdec.dylib
libmpdec.dylib:
    @rpath/libmpdec.4.dylib (compatibility version 4.0.0, current version 4.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

Note the @rpath. Typically, shared libraries built on macOS have an absolute path by default when installed, for example:

$ otool -L libsqlite3.dylib
libsqlite3.dylib:
    /opt/local/lib/libsqlite3.0.dylib (compatibility version 9.0.0, current version 9.6.0)
    /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.12)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

An @rpath install name can be useful if you are building libraries that can be installed in arbitrary locations or are being embedded in something else but using an @rpath-based name requires additional configuration when building, like adding an rpath parameter to the linking of main executables.

To further cloud the issue, it appears that Homebrew, when building and installing its mpdecimal package, converts the @rpath/libmpdec.4.dylib install name to a conventional absolute path name:

$ otool -L /usr/local/Cellar/mpdecimal/4.0.0/lib/libmpdec.dylib
/usr/local/Cellar/mpdecimal/4.0.0/lib/libmpdec.dylib:
    /usr/local/opt/mpdecimal/lib/libmpdec.4.dylib (compatibility version 4.0.0, current version 4.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)

and, at the moment, MacPorts does not:

$ otool -L /opt/macports/lib/libmpdec.dylib
/opt/macports/lib/libmpdec.dylib:
    @rpath/libmpdec.4.dylib (compatibility version 4.0.0, current version 4.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)

Also, like most similar packages providing libraries, mpdecimal by default installs both a shared library and a static (.a) library but the linker will normally always prefer a shared library when both are present in the same directory. Both Homebrew and MacPorts install both.

AFAICT, this leaves the following scenarios as of now when building 3.13 with --with-system-libmpdec enabled:

  1. If no third-party or private copy of libmpdec is available, the Python build fails with:
./Modules/_decimal/_decimal.c:38:10: fatal error: 'mpdecimal.h' file not found
#include <mpdecimal.h>
         ^~~~~~~~~~~~~
1 error generated.
  1. If building with a current MacPorts copy of libmpdec or if building with a private copy that was built without explicitly disabling its shared library option (--disable-shared), the Python build will fail with something like:
[ERROR] _decimal failed to import: dlopen(/path/to/build/lib.macosx-14.4-x86_64-3.13/_decimal.cpython-313-darwin.so, 0x0002): Library not loaded: @rpath/libmpdec.4.dylib
  Referenced from: <A8A8EE60-CE66-309E-A108-472100134C51> /path/to/source/Modules/_decimal.cpython-313-darwin.so
  Reason: no LC_RPATH's found

Following modules built successfully but were removed because they could not be imported:
_decimal
  1. If building with a current Homebrew copy of libmpdec, the Python build succeeds (because Homebrew removed the @rpath in the library install name) and links with the Homebrew installed shared libmpdec, not the static one.

Now there are various workarounds that I can think of to get around the situations where you are using an environment that contains a pre-built libmpdec with the default @rpath-named shared library, like if you are using MacPorts to supply all of the necessary third-party libs. But I can't think of any that doesn't involve some sort of kludge that requires more configuration options and/or creating alternate library directories. While probably most core developers building on macOS today use Homebrew, I believe there are some using MacPorts (besides me) and there are certainly downstream users in this situation as well as the MacPorts project itself. This segment of users could be addressed if the MacPorts project would opt to do what Homebrew is doing and remove the @rpath install name. That still leaves any other users building Python and its third-party libraries from scratch on macOS; they will be affected initially one way or another although they can adopt strategies like ensuring no libmpdec shared library is available to their builds.

The big question I have is how best to document this to minimize the impact on the various categories of users. At this point, I don't have good answers and I feel I can't spend any more time on this prior to feature code cutoff. But we shouldn't impose this change on users until we can fully explain its impact and how to workaround it.

ned-deily commented 6 months ago

(Opened MacPorts issue: https://trac.macports.org/ticket/69870)

erlend-aasland commented 6 months ago

Thank you for chiming in, Ned! I'll look into how to improve the docs here.

Even with the extra week before the feature freeze, it's going to be hard to land this nicely. I may find some time for this later this week, but I can't (and won't, and should not) promise anything.

There is an issue with unvendoring on macOS that will affect some users (and this may be something you ran into when testing, @erlend-aasland)

Yes, this may explain my issues from some weeks ago.

If no third-party or private copy of libmpdec is available, the Python build fails with [...]

For this case, we should consider marking _decimal as a missing module instead of failing. After all, this is what we do with similar extension modules.

ryandesign commented 6 months ago

(Opened MacPorts issue: https://trac.macports.org/ticket/69870)

I've merged the fix in MacPorts and reported the problem to the developer of mpdecimal by email; maybe they will fix it in an upcoming release and then we won't have to patch it anymore.

ned-deily commented 6 months ago

I've merged the fix in MacPorts and reported the problem to the developer of mpdecimal by email; maybe they will fix it in an upcoming release and then we won't have to patch it anymore.

Thanks, @ryandesign! I've verified that the patched MacPorts library now works as expected.

ned-deily commented 6 months ago

BTW, unvendoring will also affect other platforms builds, in particular, the iOS build would now need to supply a libmpdec (FYI @freakboy3742).

freakboy3742 commented 6 months ago

@ned-deily Thanks for the heads up. Reading back on the history - is the intention here that libmpdec will be shipped as a dynamic library, or statically linked into _decimal (or that will be left to platforms to decide)?

Also - tagging @mhsmith for the implications on Android. A quick search of the Android SDK didn't find libmpdec, but I might have missed something.

mhsmith commented 6 months ago

Yes, the Android build would need to supply its own copy as well, which should be easy enough.

Our approach on Android has been to use dynamic linking for libraries which are large, or which third-party packages are likely to link against (SQLite and OpenSSL), and static linking for everything else. libmpdec would probably be in the static group.

ned-deily commented 6 months ago

The current vendored version is linked statically with _decimal but dynamic linking works too if the @rpath issue is dealt with. I don't think there's any particular preference. I'd probably stick with static.

erlend-aasland commented 6 months ago

@freakboy3742, @mhsmith: can you use the copy we provide in the cpython-sources-deps repo?

erlend-aasland commented 6 months ago

The big question I have is how best to document this to minimize the impact on the various categories of users.

For most users, the devguide instructions will suffice. There is already a PR up for this. For users who build their own mpdecimal library, we could ...:

  1. add an item to the "Building" section of the What's New
  2. add some text to Mac/README.rst and/or Mac/BuildScript/README.rst; the item of 1) could point to these resources
mhsmith commented 6 months ago

can you use the copy we provide in the cpython-sources-deps repo?

Yes, probably.

ned-deily commented 6 months ago

can you use the copy we provide in the cpython-sources-deps repo?

Yes, probably.

Why would we want to do that? AFAIK, the cpython-sources-deps repo was created for and has only ever been used for Windows builds and may contain Windows-related patches. It's likely easier to use the original sources anyway.

zware commented 6 months ago

Currently, we're not carrying any patches on anything in cpython-source-deps (that I can remember :)). When we do, we tag a special version with those patches, keeping a vanilla tag in place as well.

ned-deily commented 6 months ago

Regardless, why create a new dependency with no apparent benefit? We've never used the cypthon-source-deps for anything other than Windows builds, say, for example, for macOS installer builds. Why start now?

zware commented 6 months ago

It's just an available option which is under our control and not subject to the whims of external hosts other than GitHub.

freakboy3742 commented 6 months ago

can you use the copy we provide in the cpython-sources-deps repo?

For iOS: almost certainly not without some patching. It's a configure-based project, so at a minimum it will need a patch to support the iOS and iOS-simulator host tags. That might be as simple as updating config.sub (the current version from CPython would do the job), but I'd need to dig in to be certain.

I've tried multiple times to get the iOS tags (especially the simulator ones) upstream into autoconf; I was partially successful with the 2024-01-01 release, but it's still missing the simulator tags. The autoconf contribution process appears to be a black hole... you email them a patch, and maybe they apply that patch at some point in the future, so you watch the repo to see if your patch has been applied. I haven't even received acknowledgement that I've sent a patch, let alone feedback on the patch.

erlend-aasland commented 6 months ago

[...] The autoconf contribution process appears to be a black hole... you email them a patch, and maybe they apply that patch at some point in the future, so you watch the repo to see if your patch has been applied. I haven't even received acknowledgement that I've sent a patch, let alone feedback on the patch.

We can solve this with https://discuss.python.org/t/pinning-gnu-autotools-using-the-github-container-registry/47437 I'm aiming to complete that build container during the PyCon sprints.

freakboy3742 commented 6 months ago

[...] The autoconf contribution process appears to be a black hole... you email them a patch, and maybe they apply that patch at some point in the future, so you watch the repo to see if your patch has been applied. I haven't even received acknowledgement that I've sent a patch, let alone feedback on the patch.

We can solve this with https://discuss.python.org/t/pinning-gnu-autotools-using-the-github-container-registry/47437 I'm aiming to complete that build container during the PyCon sprints.

While I agree the devcontainer approach is worth pursing, unless I'm missing something, devcontainers won't address what I'm talking about here.

The underlying issue is that third party libraries that have autoconf configuration won't support iOS out of the box, because autoconf doesn't have full support for iOS. As a result, any autoconf library that you want to use under iOS will require a patch of the config.sub file in that library's sources to add that support. We're talking about libmpdec here, but I currently have to patch xz, freetype, libpng, and libjpeg in order to support CPython builds and the limited set of binary wheels that I'm currently maintaining. Every binary library using autoconf will have the same problem.

This is a problem that needs autoconf to accept a patch upstream to add iOS simulator support, so that third-party packages can update their config.sub to an official version that has simulator support, rather than needing to apply the same custom patch to every config.sub that has been vendored into a library's sources.

erlend-aasland commented 6 months ago

[...] Every binary library using autoconf will have the same problem.

Got it; thanks for the clarification! Well, let's hope your upstream patch makes it to the next GNU Autoconf release 🤞

skirpichev commented 5 months ago

It seems, that due to https://bugs.debian.org/1056785 - more projects now use _pydecimal implementation, that gets battle-tested. Here is an example of a failure: https://github.com/aleaxit/gmpy/actions/runs/9153743194/job/25163188162 due to difference in the tp_name: "decimal.Decimal" vs "Decimal".

we-are-all-individuals commented 5 months ago

Reaching out to the Debian CPython maintainer @doko42. It would be very nice to make mpdecimal available again in Debian. According to https://www.bytereef.org/mpdecimal/changelog.html the Sphinx issues no longer exist in version 4.0.0.