google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0

Added support for dynamic dispatch for macOS/iOS/iPadOS on AArch64 #2152

Closed: johnplatts closed this 4 months ago

johnplatts commented 4 months ago

Added support for dynamic dispatch on macOS/iOS/iPadOS on AArch64, as some Apple Silicon CPUs support the ARM BF16 extension.
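For readers unfamiliar with runtime feature detection on Apple platforms, here is a minimal sketch of one way to query BF16 support. It is not taken from this PR and is not Highway's API: the helper names `SysctlFlagIsSet` and `HasBf16OnApple` are hypothetical, and the sketch assumes the `hw.optional.arm.FEAT_BF16` sysctl key, which Apple exposes for instruction-set queries. On non-Apple platforms it simply reports the feature as absent.

```cpp
#include <stddef.h>
#include <stdint.h>
#if defined(__APPLE__)
#include <sys/sysctl.h>
#endif

// Hypothetical helper (not part of Highway): returns true if the named
// sysctl key exists and holds a nonzero integer value.
inline bool SysctlFlagIsSet(const char* name) {
#if defined(__APPLE__)
  // Zero-initialized buffer large enough for a 32- or 64-bit integer.
  int64_t value = 0;
  size_t size = sizeof(value);
  // A nonzero return means the key is unknown or the query failed.
  if (sysctlbyname(name, &value, &size, nullptr, 0) != 0) return false;
  return value != 0;
#else
  (void)name;
  return false;  // The key is Apple-specific; treat as unsupported elsewhere.
#endif
}

// Hypothetical wrapper: true if the OS reports the ARM BF16 extension.
inline bool HasBf16OnApple() {
  return SysctlFlagIsSet("hw.optional.arm.FEAT_BF16");
}
```

A dispatcher could call `HasBf16OnApple()` once at startup and only select a BF16-specific code path when it returns true, falling back to a baseline NEON path otherwise.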

jan-wassenberg commented 4 months ago

FYI internal iOS tests are failing due to svqxtnb_u16 apparently lacking an sve2 attribute. I am investigating.

jan-wassenberg commented 4 months ago

@johnplatts would you please rebase this one?

jan-wassenberg commented 4 months ago

Thanks for rebasing! Unfortunately, internal CI is failing with a compiler crash for convert_test, and possibly for all other targets as well.

clang: error: clang frontend command failed due to signal (use -v to see invocation)
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-ios17.4

I think that corresponds to what we call HWY_COMPILER_CLANG=1600. Are you able to repro this? Should we disable dynamic dispatch, or more likely NEON_BF16, until even more recent compilers are available? FYI #2170 raises the minimum to 1600.

jan-wassenberg commented 4 months ago

Indeed still breaking. I am patching the PR to disable runtime dispatch for Clang < 17, and to fix the detection: Apple Clang 15.3 was mistakenly detected as 17 when it should be 16.
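The two parts of that fix can be illustrated with a small sketch. It is not the actual patch: the function names are hypothetical, and the mapping contains only the single data point mentioned in this thread (Apple Clang 15.3 corresponds to upstream 1600). Any other Apple version is reported as unknown (0) rather than guessed.

```cpp
// Hypothetical sketch, not Highway's detection code.
// Maps an Apple Clang version to an upstream-style version number
// (major * 100). Only the pair stated in this thread is encoded;
// everything else returns 0, meaning "unknown".
constexpr int UpstreamClangFromApple(int apple_major, int apple_minor) {
  return (apple_major == 15 && apple_minor == 3) ? 1600 : 0;
}

// Runtime dispatch is only allowed for Clang 17 or newer, per the comment
// above. An unknown version (0) is conservatively treated as too old.
constexpr bool AllowRuntimeDispatch(int upstream_clang) {
  return upstream_clang >= 1700;
}
```

With the corrected mapping, `AllowRuntimeDispatch(UpstreamClangFromApple(15, 3))` evaluates to false, so the crashing compiler would no longer take the runtime-dispatch path. Under the earlier misdetection (17 instead of 16), the same check would have passed.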

jan-wassenberg commented 4 months ago

I have patched this PR as mentioned :)