macOS universal builds are a special form of cross-compile builds that take advantage of the built-in support in Apple's Xcode or Command Line Tools for Xcode for automatic multi-architecture building and linking into multi-arch fat binaries. One potential gotcha of this approach is when there are architecture-dependent configure tests (in configure) that depend on the execution of test code on the build machine. Of course, standard cross-compile builds on macOS and other platforms have the same gotcha. For macOS universal builds, this kind of problem has occasionally arisen in the past, and one solution that was adopted was to provide an additional header file, Include/pymacconfig.h, whose purpose is to move "some of the autoconf magic to compile-time when building on macOS", thereby overriding problematic configure-time build tests with conditional code. If it turns out that this is the case here (and it seems likely that it is), it may be possible to replace a problematic autoconf test with code in this include file.
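To make the idea concrete, here is a hedged sketch of what a pymacconfig.h-style compile-time override could look like. This is not the actual contents of Include/pymacconfig.h; the HACL macros are simply the ones at issue in this thread.

/* Sketch only: undo a configure-time result that is wrong for the
   architecture slice currently being compiled in a universal build. */
#if defined(__APPLE__) && defined(__arm64__)
#  undef HACL_CAN_COMPILE_SIMD128
#  undef HACL_CAN_COMPILE_SIMD256
#endif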
Note that this is a potential release blocker issue, as the macOS installer binaries we provide with each release are built as a universal2 build. The first 3.14 release, 3.14.0a1, is currently scheduled for 2024-10-15.
Right, except that here it's made even more complicated by the fact that some files should be included in the Intel builds but not the ARM builds. I'm not sure how to achieve that except to special-case the list of files that go into the build with an Apple-specific bit of logic...?
What kind of files are they, and at what point are they used: configuring, building, installing, or executing?
The files Modules/_hacl/Hacl_Hash_Blake2s_Simd128.c and Modules/_hacl/Hacl_Hash_Blake2b_Simd256.c should only be compiled on x64/x86. They are then linked into the python executable. They may be used at runtime if the CPU that python is then executed on has the right support.
So, in more detail: they are compiled and linked at build time, and only exercised at runtime when the host CPU supports the relevant SIMD instructions.
Let me know if I can provide more details? Thanks!
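For illustration only, a minimal hedged sketch of the pattern being described (the function names are hypothetical placeholders, not CPython's actual blake2 code): the SIMD variant is compiled only for x86_64 and selected at runtime with a CPU-feature check.

/* Hypothetical sketch, not code from this thread: an AVX2 variant compiled
   only for x86_64 and chosen at runtime via a CPU-feature check. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static void hash_portable(const uint8_t *data, size_t len)
{
    (void)data; (void)len;
    puts("portable code path");
}

#if defined(__x86_64__)
static void hash_avx2(const uint8_t *data, size_t len)
{
    (void)data; (void)len;
    puts("AVX2 code path");
}
#endif

static void hash_dispatch(const uint8_t *data, size_t len)
{
#if defined(__x86_64__)
    /* GCC/Clang builtins for x86 CPU feature detection. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        hash_avx2(data, len);
        return;
    }
#endif
    hash_portable(data, len);
}

int main(void)
{
    uint8_t buf[4] = {0};
    hash_dispatch(buf, sizeof buf);
    return 0;
}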
The files Modules/_hacl/Hacl_Hash_Blake2s_Simd128.c and Modules/_hacl/Hacl_Hash_Blake2b_Simd256.c should only be compiled on x64/x86. They are then linked into the python executable. They may be used at runtime if the CPU that python is then executed on has the right support.
Based on this, it sounds like there's no reasonable prospect of getting the Simd128 and Simd256 files to compile on ARM64. AFAIK it's not possible to compile a file for one architecture and link it into a universal build - I might be wrong on that, but if I am, I'm going to guess the configure script is going to be messy.
On that basis, it sounds like universal builds won't be able to support those options. That's easy enough to override in the configure process - an extra if block checking for universal on Darwin can disable the option.
I agree it's a little weird that a pure x86-64 build will have support when ARM and universal builds won't, but given that the new platform the majority of macOS users are on won't support it, I don't think many users will notice the discrepancy.
How about putting an #ifdef in those files that makes them empty when on arm64? Would that help? Like:
#if defined(HACL_CAN_COMPILE_SIMD128)
... previous contents of the file ...
#endif
knowing that HACL_CAN_COMPILE_SIMD128 is not defined on ARM64.
@gpshead might have further thoughts
How about putting an #ifdef in those files that makes them empty when on arm64? Would that help? Like:
#if defined(HACL_CAN_COMPILE_SIMD128) ... previous contents of the file ... #endif
knowing that HACL_CAN_COMPILE_SIMD128 is not defined on ARM64.
That's the thing though - universal builds are implemented as a single compilation pass, so HACL_CAN_COMPILE_SIMD128/256 is defined. These are turned on because the compiler flags that enable them are legal compiler options when x86_64 is one of the compiler targets, so the constant is defined by the single-pass configure script.
If we were going down the #define path, it would need to be based on #if defined(__APPLE__) && defined(__arm64__), or something like that. However, I think catching this at the configure level makes more sense, even though it does mean the blake2b simd128/256 implementations won't be available for universal builds running on x86_64, where they otherwise would be.
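For concreteness, a hedged sketch of the in-file guard shape described above (__APPLE__ and __arm64__ are standard Apple clang predefines; the exact combination CPython would want is not settled here):

/* Sketch: keep the AVX2 body out of the arm64 slice of a macOS universal
   build, even though the single-pass configure defined the HACL macro. */
#if defined(HACL_CAN_COMPILE_SIMD256) && !(defined(__APPLE__) && defined(__arm64__))
/* ... previous contents of Hacl_Hash_Blake2b_Simd256.c ... */
#endif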
A potential fix based on an improved autoconf check: #123927
Does macOS really require all files in a universal2 build to be the same? That seems silly; it is common to separate arch-specific code into its own files.
Editorial comments about their toolchain choices aside (clearly they channeled practicality vs purity there), if that is true, just making the simd files have C preprocessor checks that effectively make them empty when the aarch64 side of compilation is running makes sense. The x86_64 side of the compilation (surely there are two independent compiler passes running behind the scenes of their cc -arch x -arch y command line) will still be happy and compile useful code instead of an empty object file for those.
If we were going down the #define path, it would need to be based on #if defined(__APPLE__) && defined(__arm64__), or something like that. However, I think catching this at the configure level makes more sense, even though it does mean the blake2b simd128/256 implementations won't be available for universal builds running on x86_64, where they otherwise would be.
I'd actually prefer the #define path. A configure test cannot understand the dual compilation and would unnecessarily leave Intel performance on the table. There are a ton of Intel Macs out there, and I assume they'll probably be supported until 2030. It becomes more important in the future if/when we get arch-specific accelerated HACL* SHA implementations so that hashlib doesn't need OpenSSL to be performant. (blake2 is less important vs those "Standard"s)
Does macOS really require all files in a universal2 build to be the same? That seems silly; it is common to separate arch-specific code into its own files.
Not unless you're compiling multiple architectures in a single pass, as CPython does. If we compiled for x86_64, then compiled for ARM64, then merged the two binaries into a universal binary, the problem wouldn't exist. However, the single-pass autoconf-based build determines the modules to be compiled, and the flags to be passed to that compile, based on a single-pass compiler check.
I'd actually prefer the #define path. A configure test cannot understand the dual compilation and would unnecessarily leave Intel performance on the table. There are a ton of Intel Macs out there, and I assume they'll probably be supported until 2030. It becomes more important in the future if/when we get arch-specific accelerated HACL* SHA implementations so that hashlib doesn't need OpenSSL to be performant. (blake2 is less important vs those "Standard"s)
Ok - I'll take a look and see what I can make work. My first attempt at doing this failed, but I didn't look too closely at why - I'm probably missing something obvious.
Thanks, I agree that the path with the #defines sounds better. This can probably be done with a stub file Hacl_Hash_Blake2b_Simd256_Universal.c that does

#if defined (...)
#include "Hacl_Hash_Blake2b_Simd256.c"
#endif

so as to leave the ingestion of upstream HACL* unchanged, without having to hack _hacl/refresh.sh in cpython.
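A hedged sketch of what such a stub could contain (the guard condition shown is one possibility combining the ideas discussed above, not a settled choice):

/* Hypothetical Hacl_Hash_Blake2b_Simd256_Universal.c: pull in the real SIMD
   source except when compiling the arm64 slice of a macOS universal build,
   leaving the upstream file untouched. */
#if defined(HACL_CAN_COMPILE_SIMD256) && !(defined(__APPLE__) && defined(__arm64__))
#include "Hacl_Hash_Blake2b_Simd256.c"
#endif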
Q: do we have a macOS universal2 buildbot?
Q: do we have a macOS universal2 buildbot?
Not that I'm aware of. It's easy enough to add the appropriate options to CPython configure to trigger a universal2 build. What currently isn't so straightforward is the availability of universal builds of the external third-party libraries that are needed by the standard library and that are not already supplied by macOS: mainly libssl and libcrypto from OpenSSL, liblzma, libmpdec, Tk, gdbm (if GPL-licensing isn't an issue), and potentially newer versions of a few others (sqlite3, ncurses, etc.). Homebrew does not provide universal builds (and its decision to use totally different install prefixes for Intel and Apple Silicon builds complicates this). MacPorts does support universal builds of most/all of these but does not provide pre-built universal binaries. At some point soon, we may be able to leverage builds needed for the macOS installer and, possibly, for iOS binaries, as well.
Bug report
Bug description:
#99108 tracks the addition of a native HACL* implementation to CPython. #119316 added an implementation of Blake2 to hashlib. This compiles fine on single-architecture macOS builds (as verified by CI), but universal2 builds running on an ARM64 laptop generate a compilation error.
To reproduce the problem: on a macOS machine, configure the build with universal2 support enabled. This will eventually yield a compilation error.
From what I can make out, the error comes from the detection of -mavx2 support. On a bare configure on an ARM64 machine, -mavx2 support is reported as unsupported, and as a result the Hacl_Hash_Blake2b_Simd256.c module isn't compiled. However, when universal support is enabled, -mavx2 is reported as supported, and the module is included. Based on recent configure logs for x86_64 macOS builds, it appears that -mavx2 is supported on x86_64.
I'm not sufficiently familiar with the subject matter to comment on whether the fix here is to fix the autoconf detection to disable the problematic module on universal builds, or to correct the implementation so that it can compile for universal builds.
Tagging @msprotz @R1kM as the authors of the recent HACL* changes.
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS