Open gahoo opened 3 years ago
Thanks for the reference! Most of the Faiss code is optimized for AVX rather than SSE (256-bit registers rather than 128-bit). This can probably be emulated with NEON as well, but I'm not sure. The main parts of Faiss that are vector optimized are:
I found a newer vector instruction set for ARM, called the Scalable Vector Extension (SVE). Would this be more promising?
And there is another project called SIMDe, which could translate SSE/AVX code to NEON.
Summary
I found a project that converts Intel SSE intrinsics to Arm/AArch64 NEON intrinsics (sse2neon). Would Faiss be faster on Arm if SSE support were added via sse2neon?
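To make the question concrete, here is a rough sketch of the kind of one-to-one mapping between SSE and NEON intrinsics that such a translation layer relies on. It is neither Faiss code nor sse2neon code: the function name `l2sqr` and the `HAVE_SSE` / `HAVE_NEON` macros are made up for illustration, and the kernel (a squared L2 distance over 128-bit lanes) is only assumed to be representative of the loops that would benefit.

```cpp
#include <cstddef>

// Pick a 128-bit SIMD backend at compile time (hypothetical macro names).
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#endif

// Squared L2 distance between two float arrays of length n.
// Illustrative only; not taken from Faiss.
float l2sqr(const float* x, const float* y, std::size_t n) {
    std::size_t i = 0;
    float acc = 0.0f;
#if defined(HAVE_SSE)
    // SSE path: 4 floats per 128-bit register.
    __m128 sum = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_sub_ps(_mm_loadu_ps(x + i), _mm_loadu_ps(y + i));
        sum = _mm_add_ps(sum, _mm_mul_ps(d, d));
    }
    float lanes[4];
    _mm_storeu_ps(lanes, sum);
    acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#elif defined(HAVE_NEON)
    // NEON path: the same operations, one NEON intrinsic per SSE intrinsic.
    float32x4_t sum = vdupq_n_f32(0.0f);
    for (; i + 4 <= n; i += 4) {
        float32x4_t d = vsubq_f32(vld1q_f32(x + i), vld1q_f32(y + i));
        sum = vaddq_f32(sum, vmulq_f32(d, d));
    }
    float lanes[4];
    vst1q_f32(lanes, sum);
    acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    // Scalar tail (and full fallback when neither backend is available).
    for (; i < n; ++i) {
        float d = x[i] - y[i];
        acc += d * d;
    }
    return acc;
}
```

Calling `l2sqr(x, y, n)` gives the same result on both backends for inputs like these. Whether a mechanical translation of this kind would actually make Faiss faster on aarch64 is the open question here, and it would need benchmarking.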
Platform
CPU: aarch64