intel / ARM_NEON_2_x86_SSE

A platform-independent header that allows compiling any C/C++ code containing ARM NEON intrinsic functions for x86 target systems, using SIMD intrinsic functions up to AVX2

Add some __aarch64__ functions #10

Closed MaskRay closed 6 years ago

Zvictoria commented 6 years ago

Hi, MaskRay. Thanks a lot for your input. My question is: how have you tested your changes? :) I don't have the corresponding tests, and while I could add this patch easily, I do need some evaluation.

MaskRay commented 6 years ago

These functions are used to validate the correctness of dimsum (https://github.com/google/dimsum). We use LLVM libFuzzer (https://github.com/google/dimsum/blob/master/dimsum_fuzz.cc) for tests.
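For readers unfamiliar with this style of testing, here is a minimal sketch of a differential libFuzzer harness. It is not dimsum's actual harness: `ref_widen_sub` and `impl_widen_sub` are hypothetical stand-ins for a reference and an implementation under test. Only the `LLVMFuzzerTestOneInput` entry point is libFuzzer's real interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference: widening subtract, computed in 64 bits. */
static int64_t ref_widen_sub(int32_t a, int32_t b)
{
    return (int64_t)a - (int64_t)b;
}

/* Hypothetical implementation under test: same result, different route. */
static int64_t impl_widen_sub(int32_t a, int32_t b)
{
    return (int64_t)a + (-(int64_t)b);
}

/* libFuzzer calls this entry point repeatedly with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int32_t a, b;

    /* Ignore inputs too short to hold two 32-bit operands. */
    if (size < 2 * sizeof(int32_t))
        return 0;

    memcpy(&a, data, sizeof a);
    memcpy(&b, data + sizeof a, sizeof b);

    /* Any disagreement aborts, which the fuzzer reports as a crash. */
    if (ref_widen_sub(a, b) != impl_widen_sub(a, b))
        abort();

    return 0;
}
```

A harness like this is typically built with clang's `-fsanitize=fuzzer` option, which supplies the `main` function and the input generator.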

MaskRay commented 6 years ago

Also, these declarations look verbose to me:

int64x2_t vsubl_s32(int32x2_t a, int32x2_t b); // VSUBL.S32 q0,d0,d0
_NEON2SSE_INLINE int64x2_t vsubl_s32(int32x2_t a, int32x2_t b) // VSUBL.S32 q0,d0,d0

Is the first line really needed?

Zvictoria commented 6 years ago

Many thanks! I will merge your commit for sure. As for the declarations: some functions in this header are called by other functions in it, so they must be declared in advance, and that holds for all compilers. For all other functions, not every compiler requires them. But I do remember that one of the compilers I tested expects them to be there and gives compilation errors otherwise. It happens in either C or C++ mode, and I don't remember which compiler it is... but the declarations should stay there for it :)
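The first case can be sketched as follows. This is an illustration only, not code from the header: `MY_INLINE`, `widen_sub`, and `widen_sub_twice` are hypothetical names, with `MY_INLINE` standing in for a macro such as `_NEON2SSE_INLINE`. Here `widen_sub_twice` calls `widen_sub`, which is defined further down, so a declaration has to be visible first. C++ rejects a call to an undeclared function, and C99 onward no longer permits implicit declarations.

```c
#include <stdint.h>

/* Hypothetical stand-in for the header's inline macro. */
#define MY_INLINE static inline

/* Declarations up front: every function is known before any body is parsed. */
MY_INLINE int64_t widen_sub(int32_t a, int32_t b);
MY_INLINE int64_t widen_sub_twice(int32_t a, int32_t b);

/* Calls widen_sub, whose definition only appears later in the file. */
MY_INLINE int64_t widen_sub_twice(int32_t a, int32_t b)
{
    return 2 * widen_sub(a, b);
}

/* Widening subtract: operands are extended to 64 bits before subtracting. */
MY_INLINE int64_t widen_sub(int32_t a, int32_t b)
{
    return (int64_t)a - (int64_t)b;
}
```

Declaring everything up front also means the order of the definitions stops mattering, which is convenient in a header with a very large number of functions.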

caand commented 6 years ago

Late reply, but:

I think all these additions should not be conditional on _NEON2SSE_64BIT.

This #define is for x86_64, while the changes translate ARMv8 intrinsics for any x86 target (32-bit included). It's the user's responsibility to avoid these intrinsics on ARMv7, but they can be used when targeting ARMv8, x86, or x86_64 (the last two by including this header).
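A rough sketch of the distinction, under stated assumptions: `MY_X86_64BIT`, `my_int32x4_t`, `my_addv_s32`, and `my_is_64bit_build` are all hypothetical names, and the vector type is modelled as a plain struct rather than an SSE register. The point is only that an emulated ARMv8-style operation with no dependency on 64-bit x86 can sit outside the guard and so remain available in 32-bit builds.

```c
#include <stdint.h>

/* Hypothetical stand-in for a flag that is set only on 64-bit x86 builds. */
#if defined(__x86_64__) || defined(_M_X64)
#define MY_X86_64BIT
#endif

/* Hypothetical 4-lane vector type, modelled as a plain struct. */
typedef struct { int32_t lane[4]; } my_int32x4_t;

/* ARMv8-style horizontal add, emulated in portable C. Nothing here needs
   64-bit x86, so it is deliberately left outside the guard below. */
static inline int32_t my_addv_s32(my_int32x4_t v)
{
    return v.lane[0] + v.lane[1] + v.lane[2] + v.lane[3];
}

/* Only code that truly depends on x86_64 belongs behind the guard. */
#ifdef MY_X86_64BIT
static inline int my_is_64bit_build(void) { return 1; }
#else
static inline int my_is_64bit_build(void) { return 0; }
#endif
```

With this layout, `my_addv_s32` compiles for both 32-bit and 64-bit x86 targets, and only `my_is_64bit_build` changes with the guard.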

Best regards, Calin

On Tuesday, 5 December 2017 at 13:33, Victoria wrote:

Merged #10 (https://github.com/intel/ARM_NEON_2_x86_SSE/pull/10).
