scikit-hep / awkward

Manipulate JSON-like data with NumPy-like idioms.
https://awkward-array.org
BSD 3-Clause "New" or "Revised" License

chore: update matrix to macOS-latest #3164

Open ianna opened 6 days ago

ianna commented 6 days ago

The macOS-11 environment is deprecated and will be removed on June 28th, 2024. Currently, all tests on MacOS-11 are cancelled:

Run Tests (macos-11, 3.8, x64, full)
This is a scheduled macOS-11 brownout. The macOS-11 environment is deprecated and will be removed on June 28th, 2024.
Run Tests (macos-11, 3.11, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.9, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.10, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.12, x64, full)
This is a scheduled macOS-11 brownout. The macOS-11 environment is deprecated and will be removed on June 28th, 2024.
Run Tests (macos-11, 3.12, x64, full)
GitHub Actions has encountered an internal error when running your job.

ianna commented 6 days ago

Updating to macOS-latest:

File "/Users/runner/work/awkward/awkward/tests/test_1345_avro_reader.py", line 19 in test_int


Installed versions
  Version ~3.8.0-0 was not found in the local cache
  Version ~3.8.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.8.18-9599280229/python-3.8.18-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/50616bfe-b731-46a1-a80b-aa4f625178eb -f /Users/runner/work/_temp/3c64bc7c-fd38-489e-9e10-f25640cb8b27
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.8.18 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[2322]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <76EC6AAE-B1A7-382D-B14F-55446445181E> /Users/runner/hostedtoolcache/Python/3.8.18/x64/bin/python3.8
    Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  Error: ./setup.sh: line 54:  2322 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134

Installed versions
  Version ~3.9.0-0 was not found in the local cache
  Version ~3.9.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.9.19-9599861319/python-3.9.19-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/18f35be1-b42d-4514-a112-6f7bc53cc3f0 -f /Users/runner/work/_temp/d68c0f92-cb89-47e6-a09f-06a471be2d67
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.9.19 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[2312]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <64474517-EFC0-32F5-93D6-1C4BAE8783F9> /Users/runner/hostedtoolcache/Python/3.9.19/x64/bin/python3.9
    Reason:
  Error: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  ./setup.sh: line 54:  2312 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134

Installed versions
  Version ~3.10.0-0 was not found in the local cache
  Version ~3.10.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.10.14-9599980810/python-3.10.14-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/83746de3-e758-41ea-b109-892583cdb238 -f /Users/runner/work/_temp/4b092a0a-dd2e-410e-95ca-72eb7a5eb343
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.10.14 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[1955]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <09857011-94D0-3FBA-9F9D-9FCE0E7366FF> /Users/
  Error: runner/hostedtoolcache/Python/3.10.14/x64/bin/python3.10
    Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  ./setup.sh: line 54:  1955 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134

ianna commented 5 days ago

[159/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthInputBuffer.cpp.o
[160/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/builder/UnionBuilder.cpp.o
[161/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/util.cpp.o
[162/172] Building CXX object CMakeFiles/awkward-cpu-kernels.dir/src/cpu-kernels/awkward_sort.cpp.o
[163/172] Linking CXX shared library libawkward-cpu-kernels.dylib
[164/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthOutputBuffer.cpp.o
...
[168/172] Building CXX object CMakeFiles/_ext.dir/src/python/io.cpp.o
[169/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthMachine.cpp.o
[170/172] Linking CXX shared library libawkward.dylib
[171/172] Building CXX object CMakeFiles/_ext.dir/src/python/forth.cpp.o
[172/172] Linking CXX shared module _ext.cpython-38-darwin.so
ld: warning: -undefined dynamic_lookup may not work with chained fixups

jpivarski commented 5 days ago

By updating from macos-11, you're seeing the issue that Angus punted on in https://github.com/scikit-hep/awkward/pull/2869/commits/0d83c86d0c9b3bc414e4b35ffb2758691154dffa.

@henryiii, there was a time several months ago when GitHub Actions made a big jump in MacOS version, and some things were broken as a result. I think this was one of them: the linker wasn't right, and therefore libawkward.dylib symbols didn't get properly linked into _ext.*.dylib. (The Avro failure is just the canary in the coalmine: it's the first use of the _ext extension module, through AwkwardForth.) Is this familiar? Do you know what's happening here?

ianna commented 5 days ago

> By updating from macos-11, you're seeing the issue that Angus punted on in 0d83c86.
>
> @henryiii, there was a time several months ago when GitHub Actions made a big jump in MacOS version, and some things were broken as a result. I think this was one of them: the linker wasn't right, and therefore libawkward.dylib symbols didn't get properly linked into _ext.*.dylib. (The Avro failure is just the canary in the coalmine: it's the first use of the _ext extension module, through AwkwardForth.) Is this familiar? Do you know what's happening here?

Thanks! I thought it looked familiar :-)

Yes, it's a linker issue. The -undefined dynamic_lookup flag allows the linker to defer symbol resolution until runtime, which can be problematic if the required symbols are not found. This is often used in Python extensions to allow them to be dynamically loaded. Chained fixups are a newer feature in macOS that optimize the way dynamic libraries are loaded. However, they may not always work correctly with -undefined dynamic_lookup.

Unfortunately, my laptop runs macOS 11.6. I'm planning to upgrade it to 12 this weekend so that I can test whether we need to explicitly export the symbols, and check whether CMakeLists.txt is set up correctly to handle symbol visibility and dynamic linking.
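
Because `-undefined dynamic_lookup` defers symbol resolution, a problem of this kind only shows up when the library is loaded or used. A minimal sketch of an early load check follows, assuming Python's standard `ctypes` module. The function name `try_load` and the example path are hypothetical and are not part of Awkward Array.

```python
import ctypes


def try_load(path):
    """Try to load a shared library and report whether loading succeeded.

    Returns (True, "") on success, or (False, message) if the dynamic
    loader rejects the library (missing file, missing dependency, ...).
    """
    try:
        ctypes.CDLL(path)
    except OSError as err:
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    # Hypothetical path: substitute the library produced by your own build.
    ok, message = try_load("./libawkward.dylib")
    print("loaded" if ok else f"failed to load: {message}")
```

A check like this only confirms that the loader accepts the library. It does not replace running the test suite, since the Avro test is what actually exercises the `_ext` module.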

jpivarski commented 5 days ago

My MacOS is 14.5 and I just ran another installation: no compilation issues, linker issues, or dynamic loading issues. All of the tests pass.

(Major versions 11, 12, and 14 seem pretty far apart. I checked on endoflife.date/macos and MacOS 11 was dropped by Apple last September.)

I think we were using an old macOS version because a build works on all versions from the one used for compilation onward, so we were covering all versions that are still in service. I don't know why it would fail to compile on GitHub's MacOS 11 and not on my MacOS 14.

henryiii commented 5 days ago

Apple was trying to move away from -undefined dynamic_lookup, but it was integral to how modules for languages like Python worked, so it's still valid, AFAIK, and I believe it just disables chained fixups. Older compilers may throw warnings and not work as well. There is a way to do the chained fixups for Python extensions, but it's involved (you have to process the Python binary and build a file with a symbol table, I think) and isn't something we've ever added to pybind11's CMake infrastructure. Wenzel does have it in nanobind's CMake code, so it is possible.

On GHA, macos-latest (and macos-14) are ARM, while macos-13 and earlier are Intel.

I think there are some issues around AVX instructions when building on newer macOS versions for older ones, but that's not a linker issue. I think it was just a bug in the newest compilers at one point.
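
The runner-to-architecture relationship described above can be sketched as a small lookup. This is only an illustration: the table reflects what is stated in this thread at the time of writing, and the names `RUNNER_ARCH` and `is_compatible` are hypothetical helpers, not part of any GitHub Actions API.

```python
# Architecture of GitHub-hosted macOS runners as described in this thread.
# This table is an assumption that reflects the discussion at the time;
# check GitHub's runner documentation before relying on it.
RUNNER_ARCH = {
    "macos-11": "x64",
    "macos-12": "x64",
    "macos-13": "x64",
    "macos-14": "arm64",
    "macos-latest": "arm64",
}


def is_compatible(runner, requested_arch):
    """Return True if the requested Python architecture matches the runner."""
    return RUNNER_ARCH[runner] == requested_arch


# Example matrix entries: (runner label, requested Python architecture).
matrix = [("macos-11", "x64"), ("macos-latest", "x64"), ("macos-latest", "arm64")]
for runner, arch in matrix:
    status = "ok" if is_compatible(runner, arch) else "MISMATCH"
    print(f"{runner:13s} {arch:6s} {status}")
```

The second entry is the combination that matters here: an x64 Python requested on a runner that is ARM.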

henryiii commented 5 days ago

Have you tried macos-13?

ianna commented 5 days ago

> Have you tried macos-13?

I tried macos-latest, which was defaulting to 13.

henryiii commented 5 days ago

latest is 14 now, has been for a few weeks. That would be much faster, but is also a bigger change (Apple Silicon). 13 seems to segfault.

ianna commented 5 days ago

> latest is 14 now, has been for a few weeks. That would be much faster, but is also a bigger change (Apple Silicon). 13 seems to segfault.

See my comments above. I think it was 13 😀

ianna commented 4 days ago

@jpivarski and @henryiii: the problem was an architecture mismatch. We requested x86, but the macos-latest nodes are arm64, so the actions were downloading incompatible Python libraries. It was masked by the fact that the gettext location (as installed by Homebrew) is different on the newer architecture. It looks like the actions did not use the environment variables I tried to define. The architecture error became apparent only after a link to the expected location was added.

The remaining problem is the Avro test segfault, and that may also be related to the wrong architecture (because all runs well on Jim's laptop ;-)

The bottom line is that we should go for macos-latest. As I understand from this GitHub blog, macos-14 becomes macos-latest at the same time as macos-11 is retired, which is expected to be complete by June 2024.
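
A mismatch like the one described above can be caught at the start of a job by asking the interpreter which architecture it is running on. The following is a minimal sketch using Python's standard `platform` module. The helper names `normalize_arch`, `interpreter_arch`, and `check_arch` are hypothetical, and the "x64"/"arm64" labels are assumed to follow the matrix naming used in the job titles.

```python
import platform

# Map the names reported by platform.machine() onto the labels used in
# the CI matrix ("x64", "arm64"). Unknown names pass through lowercased.
_ALIASES = {"x86_64": "x64", "amd64": "x64", "arm64": "arm64", "aarch64": "arm64"}


def normalize_arch(machine):
    """Convert a platform.machine() string into a matrix-style label."""
    name = machine.lower()
    return _ALIASES.get(name, name)


def interpreter_arch():
    """Architecture label of the machine as seen by the running interpreter."""
    return normalize_arch(platform.machine())


def check_arch(expected):
    """Raise RuntimeError if the interpreter's architecture differs from `expected`."""
    actual = interpreter_arch()
    if actual != expected:
        raise RuntimeError(f"expected {expected} Python, but running on {actual}")


if __name__ == "__main__":
    print(interpreter_arch())
```

Calling `check_arch("x64")` in a job that was meant to run an x64 Python would then fail with a clear message instead of a missing-library error.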

henryiii commented 4 days ago

macos-12 and macos-13 are Intel. And the transition of macos-latest to macos-14 (Apple Silicon) was completed a couple of weeks ago.

jpivarski commented 4 days ago

Let's go with macos-latest.

Does this mean that we're not building awkward-cpp for Intel Macs anymore? That would only be okay if they're not supported by Apple. (People with Intel Macs would have to use old versions of Awkward Array, but they're on unsupported hardware, so what can we do?)

henryiii commented 4 days ago

We should still be building them with cibuildwheel. I think this is only the native testing. And they are still supported by Apple: the latest operating systems are being released for Intel, minus a few features.