ashvardanian / SimSIMD

Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
https://ashvardanian.com/posts/simsimd-faster-scipy/
Apache License 2.0
802 stars, 42 forks

AMX support for tiled matrix multiplications #26

Open ashvardanian opened 8 months ago

ashvardanian commented 8 months ago

Both Intel and Apple now have specialized AMX extensions for tiled matrix multiplication. Both are tricky to use, but may result in substantial performance improvements, potentially even for single-vector dot products and cosine distances.

Resources:

MarkReedZ commented 1 month ago

I'll try doing the BF16 dot product with this. Shall we add a new function matmul as well for matrix multiplication?
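
For context, here is a minimal scalar sketch of what a BF16 dot product computes, which could serve as a baseline to check a future AMX kernel against. It assumes the usual bfloat16 layout (the upper 16 bits of an IEEE-754 binary32) and accumulates in f32, which is also what Intel's TDPBF16PS tile instruction does. The names `bf16_t`, `bf16_to_f32`, `f32_to_bf16`, and `dot_bf16_serial` are hypothetical and not part of SimSIMD.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t; // hypothetical alias: raw bits of one bfloat16 value

// Widen bf16 to f32: the 16 bits become the upper half of a binary32.
static float bf16_to_f32(bf16_t x) {
    uint32_t bits = (uint32_t)x << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

// Narrow f32 to bf16 by truncation (no rounding); good enough for a reference.
static bf16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

// Scalar reference: each product is computed and accumulated in f32.
static float dot_bf16_serial(bf16_t const *a, bf16_t const *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i != n; ++i)
        sum += bf16_to_f32(a[i]) * bf16_to_f32(b[i]);
    return sum;
}
```

A tiled kernel may sum the products in a different order, so comparing against this reference would likely need a small tolerance rather than exact equality.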

ashvardanian commented 1 month ago

I think it may be a new file called "dots.h" that implements matrix multiplications 🤗
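
As a rough illustration of what such a file might contain, here is a portable sketch that computes all pairwise dot products between the rows of two matrices, walking the output in 16x16 blocks. It uses plain f32 loops rather than AMX intrinsics. The function name `dots_f32_tiled`, the row-major layout, and the tile size are assumptions made for this example, not the actual SimSIMD API. The tile size of 16 is only a nod to the 16-row limit of Intel AMX tile registers.

```c
#include <stddef.h>

enum { TILE = 16 }; // assumption: mirrors the 16-row limit of Intel AMX tiles

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

// Hypothetical kernel: c[i * b_rows + j] = dot(row i of a, row j of b).
// a is a_rows x dims, b is b_rows x dims, c is a_rows x b_rows, all row-major.
static void dots_f32_tiled(float const *a, float const *b, float *c,
                           size_t a_rows, size_t b_rows, size_t dims) {
    for (size_t i = 0; i != a_rows * b_rows; ++i) c[i] = 0.0f;

    // Walk the output in TILE x TILE blocks and the shared dimension in
    // TILE-wide slices, so each block of c is updated from small panels.
    for (size_t i0 = 0; i0 < a_rows; i0 += TILE)
        for (size_t j0 = 0; j0 < b_rows; j0 += TILE)
            for (size_t k0 = 0; k0 < dims; k0 += TILE) {
                size_t i1 = min_size(i0 + TILE, a_rows);
                size_t j1 = min_size(j0 + TILE, b_rows);
                size_t k1 = min_size(k0 + TILE, dims);
                for (size_t i = i0; i != i1; ++i)
                    for (size_t j = j0; j != j1; ++j) {
                        float sum = c[i * b_rows + j];
                        for (size_t k = k0; k != k1; ++k)
                            sum += a[i * dims + k] * b[j * dims + k];
                        c[i * b_rows + j] = sum;
                    }
            }
}
```

With `a_rows == b_rows == 1` this reduces to a single-vector dot product, which is the case the original post hopes tiles can also accelerate.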