Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐
The new f64 kernels don't benefit much from NEON, since a 128-bit register holds only two f64 values. Instead, they leverage the rsqrt approximations already used in SimSIMD for lower-precision inputs.
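
To illustrate the general idea behind an rsqrt approximation, here is a minimal portable sketch, not SimSIMD's actual code. It starts from a coarse reciprocal square root estimate and sharpens it with Newton-Raphson steps, then uses the result to normalize a cosine distance. The function names `rsqrt_f64_sketch` and `cosine_distance_f64_sketch` are hypothetical, as is the use of rsqrt for cosine normalization. The bit-level initial guess merely stands in for a hardware estimate instruction (on AArch64, something like `vrsqrteq_f64` followed by `vrsqrtsq_f64` refinement), and the step count of five is an assumption chosen for this particular starting guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: approximate 1/sqrt(x) for a positive, normal double.
static double rsqrt_f64_sketch(double x) {
    assert(x > 0.0);
    // Coarse initial guess from the exponent bits (the classic "magic constant"
    // trick), standing in for a hardware rsqrt estimate instruction.
    uint64_t bits;
    memcpy(&bits, &x, sizeof(bits));
    bits = 0x5FE6EB50C7B537A9ull - (bits >> 1);
    double y;
    memcpy(&y, &bits, sizeof(y));
    // Each Newton-Raphson step roughly doubles the number of correct bits.
    for (int step = 0; step < 5; ++step)
        y = y * (1.5 - 0.5 * x * y * y);
    return y;
}

// Hypothetical kernel: cosine distance normalized with two rsqrt calls
// instead of a square root and a division.
static double cosine_distance_f64_sketch(double const *a, double const *b, size_t n) {
    double ab = 0.0, a2 = 0.0, b2 = 0.0;
    for (size_t i = 0; i != n; ++i) {
        ab += a[i] * b[i];
        a2 += a[i] * a[i];
        b2 += b[i] * b[i];
    }
    // Zero vectors have no direction; treat two of them as identical.
    if (a2 == 0.0 || b2 == 0.0) return (a2 == 0.0 && b2 == 0.0) ? 0.0 : 1.0;
    return 1.0 - ab * rsqrt_f64_sketch(a2) * rsqrt_f64_sketch(b2);
}
```

A vectorized version would swap the bit-level guess for the hardware estimate and run the same refinement on every lane at once. With only two f64 lanes per 128-bit NEON register, the savings would come mostly from avoiding the square root and division rather than from lane-level parallelism.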