google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0

add NEON_BF16 target #2148

Closed. copybara-service[bot] closed this 2 months ago

copybara-service[bot] commented 2 months ago

add NEON_BF16 target

johnplatts commented 2 months ago

I have made changes to hwy/detect_targets.h and hwy/targets.cc, in a separate branch at https://github.com/johnplatts/jep_google_highway/commit/53044f34dab33822e28d06893a50f3f233b6ec8a, to enable runtime dispatch on macOS/iOS/iPadOS on AArch64.

I also confirmed that these changes compile successfully on macOS on AArch64, using a custom GitHub workflow that builds Google Highway on an Apple Silicon Mac with an Apple M1 CPU.

The changes to hwy/detect_targets.h and hwy/targets.cc that enable runtime dispatch on macOS/iOS/iPadOS on AArch64 depend on the changes made in this pull request.
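
For readers unfamiliar with how a CPU feature such as BF16 can be detected at runtime on Apple platforms, here is a minimal sketch. It is not the code from the commit linked above. It assumes the macOS sysctl key hw.optional.arm.FEAT_BF16 is available (it is only present on newer OS versions), and the helper names HasSysctlFeature and AppleHasBF16 are hypothetical, chosen for illustration. On non-Apple platforms the sketch simply reports that the feature is absent.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper (not part of Highway's API): returns true only if the
// named sysctl key exists and holds a nonzero integer value.
bool HasSysctlFeature(const char* name) {
  assert(name != nullptr);
#if defined(__APPLE__)
  int value = 0;
  size_t size = sizeof(value);
  // sysctlbyname returns 0 on success; a missing key yields a nonzero result.
  if (sysctlbyname(name, &value, &size, nullptr, 0) != 0) return false;
  return value != 0;
#else
  // Assumption for this sketch: no sysctl-based detection outside Apple OSes.
  (void)name;
  return false;
#endif
}

// Hypothetical wrapper: queries the key that (by assumption) reports BF16
// support on Apple Silicon.
bool AppleHasBF16() {
  return HasSysctlFeature("hw.optional.arm.FEAT_BF16");
}
```

A dispatcher could call AppleHasBF16() once at startup and only then add a BF16-capable target to its set of supported targets. Treating a missing key as "unsupported" is the conservative choice, since older OS versions do not expose it.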

jan-wassenberg commented 2 months ago

Thanks, that sounds great. I believe M2 would support NEON_BF16. Your changes look good. We will land this PR after thorough internal testing: it's a risky change because bf16 has triggered several compiler bugs, so we are running all tests/configs.