ashvardanian / SimSIMD

Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
https://ashvardanian.com/posts/simsimd-faster-scipy/
Apache License 2.0
794 stars, 42 forks

Dev/eknag/bf16 cossim matmul #146

Closed (eknag closed this 3 days ago)

eknag commented 3 days ago

Add benchmark-only bf16 cos implementations that leverage `vbfmmlaq_f32` instead of 3 separate dot products.
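For readers unfamiliar with the trick, here is a portable scalar sketch of the idea, not the code from this PR. `vbfmmlaq_f32` wraps the Arm BFMMLA instruction, which multiplies a 2x4 bf16 matrix by a 4x2 bf16 matrix and accumulates into a 2x2 f32 matrix. If the two rows of each tile are 4 elements of `a` and 4 elements of `b`, multiplying the tile by its own transpose yields `a·a`, `a·b`, and `b·b` in one step. The names `mmla_2x4_model` and `cos_reductions_model` are hypothetical, plain `float` stands in for bf16, and the operand layout is an assumption about how such a kernel could be arranged.

```c
#include <stddef.h>

/* Scalar model of one matrix-multiply-accumulate step (an assumption-level
 * stand-in for BFMMLA): acc (2x2, row-major) += x * y^T,
 * where x and y are each stored as 2 rows of 4 values. */
static void mmla_2x4_model(float acc[4], const float x[8], const float y[8]) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 4; ++k)
                acc[i * 2 + j] += x[i * 4 + k] * y[j * 4 + k];
}

/* Computes a.b, a.a and b.b with one model "instruction" per 4-element tile,
 * instead of three separate dot-product accumulations. */
void cos_reductions_model(const float *a, const float *b, size_t n,
                          float *ab, float *aa, float *bb) {
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (size_t i = 0; i < n; i += 4) {
        float tile[8];
        for (size_t k = 0; k < 4; ++k) {
            /* Row 0 holds a[i..i+3], row 1 holds b[i..i+3]; the tail is zero-padded. */
            tile[k] = (i + k < n) ? a[i + k] : 0.0f;
            tile[4 + k] = (i + k < n) ? b[i + k] : 0.0f;
        }
        /* tile * tile^T = [[a.a, a.b], [b.a, b.b]] for this slice. */
        mmla_2x4_model(acc, tile, tile);
    }
    *aa = acc[0];
    *ab = acc[1]; /* acc[2] holds the same value, b.a */
    *bb = acc[3];
}
```

The cosine similarity then follows as `ab / sqrt(aa * bb)`. One of the four accumulator lanes is redundant (`b·a` duplicates `a·b`), so whether the matrix form actually beats three dot products is what a benchmark has to show.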

ashvardanian commented 3 days ago

Thank you, @eknag!