Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2 📐
Describe the bug
For input vectors that are nearly collinear, the computed cosine distance may come out slightly negative due to numerical error.
Steps to reproduce
import simsimd
import numpy as np
u = np.array([-0.30039746, -0.13594460, 0.58292344])
v = np.array([-0.65563949, -0.29700866, 1.27146813])
print(simsimd.cosine(u, v))
Expected behavior
Cosine distance should lie in $[0,2]$, though I don't know whether returning a slightly negative result is a material issue in practice.
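For comparison, here is a minimal double-precision reference using only the Python standard library. The helper name `reference_cosine_distance` is hypothetical and not part of SimSIMD. It computes $1 - \frac{u \cdot v}{\|u\|\|v\|}$ with exact square roots rather than a reciprocal-square-root approximation, and assumes both vectors are non-zero.

```python
import math

def reference_cosine_distance(u, v):
    """Cosine distance 1 - u.v / (|u| * |v|), accumulated in float64.

    Hypothetical helper for comparison only; assumes non-zero vectors.
    """
    dot = math.fsum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(math.fsum(a * a for a in u))
    norm_v = math.sqrt(math.fsum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Same vectors as in the reproduction above.
u = [-0.30039746, -0.13594460, 0.58292344]
v = [-0.65563949, -0.29700866, 1.27146813]

d = reference_cosine_distance(u, v)
print(d)  # a tiny value close to zero, since the vectors are nearly parallel
```

Because the true distance for these inputs is so close to zero, even a small relative error in the normalization step can be enough to push the result below zero.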
#43 describes a similar issue related to numerical inaccuracy of the cosine distance. (Probably due to the use of RSQRT.)
A possible workaround is to clip the result to $[0,2]$, but that may hurt performance (the cost could be more significant for short vectors and increasingly negligible for longer ones).
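On the caller's side, the clipping workaround could look like the sketch below. The wrapper `clipped_distance` and the stand-in `fake_kernel` are hypothetical names used for illustration; this is not how the library itself would implement a fix, which would more likely clamp inside the native kernel.

```python
def clipped_distance(distance_fn, u, v, lo=0.0, hi=2.0):
    """Call distance_fn(u, v) and clamp the scalar result to [lo, hi]."""
    d = float(distance_fn(u, v))
    return min(max(d, lo), hi)

# Stand-in for a kernel that returns a slightly negative distance
# for nearly parallel inputs (hypothetical, for illustration only).
def fake_kernel(u, v):
    return -1.2e-7

print(clipped_distance(fake_kernel, [1.0], [2.0]))  # 0.0
```

With SimSIMD installed, the same wrapper could be called as `clipped_distance(simsimd.cosine, u, v)`, assuming the call returns a scalar for a single pair of vectors. Clamping in Python adds a fixed cost per call, which matches the concern above that it matters most for short vectors.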
SimSIMD version
v5.4.3
Operating System
macOS Sonoma
Hardware architecture
Arm
Which interface are you using?
Python bindings
Contact Details
No response
Are you open to being tagged as a contributor?
[ ] I am open to being mentioned in the project .git history as a contributor
Is there an existing issue for this?
[X] I have searched the existing issues
Code of Conduct
[X] I agree to follow this project's Code of Conduct