Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2
@rschu1ze added a very handy feature-detection function, simsimd_uses_dynamic_dispatch(). Do we need to expose it to Rust, Python, and other bindings?
@MarkReedZ added missing kernels to the benchmark utility.
Performance Improvements
The older bf16 Euclidean distance kernel handled mixed-precision vector subtraction inefficiently. The new one is very similar, but avoids a couple of serial operations and doubles the throughput:
Old Implementation
New Implementation
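As a point of reference for what these kernels compute, here is a minimal scalar sketch of a bf16 squared Euclidean distance. It is not the SIMD code from either listing and not SimSIMD's API: the function names are hypothetical, and it assumes bf16 values are stored as raw 16-bit integers. Each input is widened to f32 before the subtraction, which is the mixed-precision step the vectorized kernels have to do cheaply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen one bf16 value (raw 16 bits) to f32.
 * bf16 is the upper half of an IEEE-754 binary32, so shifting the
 * bits left by 16 reconstructs the float exactly. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit-cast without aliasing issues */
    return f;
}

/* Hypothetical scalar reference: squared Euclidean distance between
 * two bf16 vectors, with subtraction and accumulation done in f32. */
static float l2sq_bf16_scalar(const uint16_t *a, const uint16_t *b, size_t n) {
    assert(n == 0 || (a != NULL && b != NULL));
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = bf16_to_f32(a[i]) - bf16_to_f32(b[i]);
        sum += d * d;
    }
    return sum;
}
```

A scalar loop like this is mainly useful as a correctness baseline when comparing SIMD variants; taking the square root of the result gives the Euclidean distance itself.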
Other Changes