ashvardanian / SimSIMD

Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & SVE2 📐
https://ashvardanian.com/posts/simsimd-faster-scipy/
Apache License 2.0

Bug: Mahalanobis distance returns squared distance #187

Closed: guyrosin closed this issue 1 month ago

guyrosin commented 2 months ago

Describe the bug

The Mahalanobis distance method currently returns the squared distance. This may be intentional, to avoid the sqrt operation (I also saw that it is tested as a squared distance), but it is quite misleading. An alternative would be to rename the method to sqmahalanobis().

By the way, the same also seems to apply to jensenshannon().
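For context on the jensenshannon() remark: SciPy's `jensenshannon` returns the Jensen-Shannon distance, which is the square root of the Jensen-Shannon divergence. The following is a minimal, dependency-free sketch of that relationship. The helper names `js_divergence` and `js_distance` are hypothetical and are not part of SimSIMD or SciPy, and which of the two values SimSIMD returns is only what this report suspects, not something the sketch verifies.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions."""
    # Mixture distribution m = (p + q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, treating 0 * log(0) as 0
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2


def js_distance(p, q):
    """Jensen-Shannon distance: the square root of the divergence."""
    return math.sqrt(js_divergence(p, q))


p = [0.5, 0.5, 0.0]
q = [0.1, 0.4, 0.5]
print(js_divergence(p, q))  # divergence (the "squared" quantity)
print(js_distance(p, q))    # distance
```

If a library returns the first value where the second is expected, the two differ by exactly one square root.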

Steps to reproduce

import numpy as np
import simsimd
from scipy.spatial import distance

u, v = np.array([2, 0, 0], dtype=np.float32), np.array([0, 1, 0], dtype=np.float32)
iv = np.array([[1, 0.5, 0.5], [0.5, 1, 0.5], [0.5, 0.5, 1]], dtype=np.float32)
print(distance.mahalanobis(u, v, iv))  # 1.732
print(simsimd.mahalanobis(u, v, iv))  # 3.0

Expected behavior

mahalanobis() should return the distance, not the squared distance.
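The two numbers in the reproduction are consistent with a missing square root. As a rough check in plain Python, the quadratic form (u - v)^T · iv · (u - v) for the vectors above evaluates to 3.0, and its square root is about 1.732. The helper name `quadratic_form` is hypothetical and exists only for this illustration.

```python
import math


def quadratic_form(u, v, iv):
    """Compute (u - v)^T * iv * (u - v), i.e. the squared Mahalanobis distance."""
    d = [a - b for a, b in zip(u, v)]
    n = len(d)
    return sum(d[i] * iv[i][j] * d[j] for i in range(n) for j in range(n))


# Same inputs as in the reproduction steps
u = [2.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
iv = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]

squared = quadratic_form(u, v, iv)
distance = math.sqrt(squared)

print(squared)             # 3.0, matches the simsimd.mahalanobis output above
print(round(distance, 3))  # 1.732, matches the scipy output above
```

On versions that show the behaviour reported here, taking `math.sqrt` of the SimSIMD result would presumably give the SciPy-compatible value.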

SimSIMD version

v5.3.0

Operating System

Ubuntu 22.04.4 LTS

Hardware architecture

x86

Which interface are you using?

Python bindings

Contact Details

guy.rosin@gmail.com
