Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2 📐
As mentioned by @cbornet in #153, the precision of some of the kernels may be lower than expected. This mostly affects the cosine distance, because of low-precision aggregations and, more importantly, rsqrt approximations.
This PR refactors a huge part of dot.h and spatial.h to introduce new helper functions for NEON, Haswell, and Skylake, to mitigate those issues.
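To make the two error sources concrete, here is a minimal scalar sketch in C. It is not the code from this PR. `approx_rsqrt` is a hypothetical stand-in for a hardware reciprocal-square-root estimate (it uses the well-known bit-trick approximation so that it runs on any platform). `refine_rsqrt` and `cosine_distance` are illustrative names as well. The sketch assumes that wider accumulators and Newton-Raphson refinement are the mitigations, which is a common approach but not something the description above spells out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for a hardware rsqrt estimate: the classic bit-trick
// approximation of 1/sqrt(x), accurate to only a few percent for positive x.
static float approx_rsqrt(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step for f(y) = 1/y^2 - x, which roughly squares the
// relative error of the current estimate y.
static float refine_rsqrt(float x, float y) {
    return y * (1.5f - 0.5f * x * y * y);
}

// Cosine distance of two f32 vectors. Sums are accumulated in double to limit
// aggregation error; `refine_steps` controls how many Newton-Raphson steps
// are applied to each rsqrt estimate. Zero-vector handling is an assumption.
static float cosine_distance(const float *a, const float *b, size_t n, int refine_steps) {
    double ab = 0.0, a2 = 0.0, b2 = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double ai = a[i], bi = b[i];
        ab += ai * bi;
        a2 += ai * ai;
        b2 += bi * bi;
    }
    if (a2 == 0.0 && b2 == 0.0) return 0.0f; // two zero vectors: treated as identical
    if (ab == 0.0) return 1.0f;              // orthogonal, or exactly one zero vector

    float ra = approx_rsqrt((float)a2);
    float rb = approx_rsqrt((float)b2);
    for (int s = 0; s < refine_steps; ++s) {
        ra = refine_rsqrt((float)a2, ra);
        rb = refine_rsqrt((float)b2, rb);
    }
    double d = 1.0 - ab * (double)ra * (double)rb;
    return d > 0.0 ? (float)d : 0.0f; // clamp small negative rounding residue
}
```

Calling `cosine_distance(v, v, n, 0)` on a non-zero vector `v` returns a visibly non-zero distance, because the raw estimate's error is applied twice. With `refine_steps` set to 2, the result drops to around the level of single-precision rounding.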