DLTcollab / sse2neon

A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
MIT License

Split off arm64 from ARMv7 code. #649

Closed by alecazam 1 month ago

alecazam commented 1 month ago

The current source file is harder to follow than it should be. Splitting the ARMv7 code off into its own header would help, and then all of those fallbacks could be jettisoned. Apple Silicon, every iPhone since the 5S, and all of our Android and console devices are arm64. This would pave the way toward SVE adoption too.

jserv commented 1 month ago

unifdef appears to be the tool you are looking for. It is a utility designed specifically for removing and simplifying #ifdef directives in source code. It is used in maintaining the Linux kernel codebase, which speaks to its reliability. See scripts/headers_install.sh in the kernel tree.

According to the manual:

The unifdef utility selectively processes conditional cpp(1) directives. It removes from a file both the directives and any additional text that they specify should be removed, while otherwise leaving the file alone.

To generate an Arm64-specific header derived from SSE2NEON, you can run the following command:

unifdef -D__aarch64__=1 -D__arm64__=1 sse2neon.h

This command processes sse2neon.h and keeps only the code sections relevant to the Arm64 architecture by treating both __aarch64__ and __arm64__ as defined with the value 1. The result is written to standard output, so redirect it to a file to save it.
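
To make the effect concrete, here is a minimal sketch of the kind of guarded region unifdef resolves. The function name selected_path and the returned strings are hypothetical and are not taken from sse2neon.h; only the macro names come from the command above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an architecture-guarded region in a header.
 * Compiled normally, the preprocessor picks one branch per target. */
const char *selected_path(void)
{
#if defined(__aarch64__) || defined(__arm64__)
    return "arm64";     /* the branch an Arm64 build compiles */
#else
    return "fallback";  /* the branch every other target compiles */
#endif
}
```

Run through unifdef with -D__aarch64__=1 -D__arm64__=1, a region like this should collapse to just the `return "arm64";` line, with the #if, #else, and #endif directives and the fallback branch removed. Conditionals that depend on symbols not named on the command line are left untouched.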

alecazam commented 1 month ago

That worked beautifully! I did have to comment out the block below, though I probably could have just set that macro from the script instead. This is the command I ended up with:

//#if TARGET_OS_MACCATALYST
//#warning - this code won't compile for iOS MacCatalyst, switch target.
//#endif

unifdef -D__aarch64__=1 -D__arm64__=1 -D__clang__=1 sse2neon.h > sse2neon-arm64.h