Closed edwinkys closed 7 months ago
Thank you, @edwinkys! What is the result you are getting?
Thank you for the fast reply!
For manual calculation, I got: 0.99385864.
For SimSIMD calculation, I got: 0.009096146.
I ran it on an Apple M2 chip, if that info is relevant.
It returns the cosine similarity, not the cosine distance, so you have to subtract the value from 1 to get the wanted result 🤗
Hmm, I'm not quite sure about that. The manual calculation that I provided is the cosine similarity, not the cosine distance:
Sc = (x • y) / (||x|| ||y||)
So, I did some digging and found the cosine similarity implementation in the code base: https://github.com/ashvardanian/SimSIMD/blob/18d17686124ddebd9fe55eee56b2e0273a613d4b/include/simsimd/spatial.h#L388-L414
Line 413 seems to imply that the SimSIMD cosine implementation returns the cosine distance instead of the cosine similarity.
But even then, after I subtract the SimSIMD result from 1 to obtain the cosine similarity, the result is 0.99090385, which still doesn't match the 0.99385864 from my calculation.
I also double checked my calculation result against this online calculator: https://www.omnicalculator.com/math/cosine-similarity?c=USD&v=trig:0,a0:1,a1:3,a2:5,b0:2,b1:4,b2:6
I'd love to learn more about why this happens, and I'm willing to help if needed 😁
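For reference, the manual calculation can be reproduced with a minimal sketch in plain Rust. It assumes the input vectors are [1, 3, 5] and [2, 4, 6], as read from the calculator URL above; the function names are illustrative and are not part of SimSIMD.

```rust
/// Cosine similarity: (x . y) / (||x|| * ||y||), accumulated in f64.
/// Illustrative helper, not a SimSIMD function.
fn cosine_similarity(x: &[f32], y: &[f32]) -> f64 {
    assert_eq!(x.len(), y.len(), "vectors must have the same length");
    let (mut dot, mut x2, mut y2) = (0.0f64, 0.0f64, 0.0f64);
    for (&a, &b) in x.iter().zip(y.iter()) {
        let (a, b) = (a as f64, b as f64);
        dot += a * b; // numerator: x . y
        x2 += a * a;  // squared norm of x
        y2 += b * b;  // squared norm of y
    }
    dot / (x2.sqrt() * y2.sqrt())
}

/// Cosine distance is defined here as 1 - cosine similarity.
fn cosine_distance(x: &[f32], y: &[f32]) -> f64 {
    1.0 - cosine_similarity(x, y)
}

fn main() {
    // Assumed inputs, taken from the calculator URL in the thread.
    let a = [1.0f32, 3.0, 5.0];
    let b = [2.0f32, 4.0, 6.0];
    println!("similarity = {:.8}", cosine_similarity(&a, &b));
    println!("distance   = {:.8}", cosine_distance(&a, &b));
}
```

With these inputs the similarity agrees with the manual value 0.99385864 to about six decimal places, so the expected distance is roughly 0.00614 rather than 0.009096146.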
@edwinkys the difference between 0.99090385 and 0.99385864 is due to numerical error, likely coming from the vrsqrte_f32 operation, which approximates the reciprocal square root. You may get noticeably different results depending on the 1 / sqrt(x) implementation.
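To illustrate how much a low-precision reciprocal square root can shift the result, here is a hedged sketch. It does not call vrsqrte_f32; instead, `rsqrt_estimate` is a hypothetical stand-in that keeps only the top 8 mantissa bits of the exact 1 / sqrt(x). That precision is an assumption for illustration, not a statement about what the hardware or SimSIMD does. The inputs are the dot product and squared norms of the assumed vectors [1, 3, 5] and [2, 4, 6].

```rust
/// Hypothetical stand-in for a hardware reciprocal-square-root estimate:
/// computes 1/sqrt(x) exactly, then keeps only the top `bits` mantissa bits.
fn rsqrt_estimate(x: f32, bits: u32) -> f32 {
    assert!(bits <= 23, "f32 has 23 explicit mantissa bits");
    let exact = 1.0 / x.sqrt();
    let mask = !0u32 << (23 - bits); // clears the low mantissa bits
    f32::from_bits(exact.to_bits() & mask)
}

/// One Newton-Raphson step for y ~ 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y).
fn refine(x: f32, y: f32) -> f32 {
    y * (1.5 - 0.5 * x * y * y)
}

/// Cosine similarity as dot * rsqrt(|a|^2) * rsqrt(|b|^2).
fn cosine_with(rsqrt: impl Fn(f32) -> f32, ab: f32, a2: f32, b2: f32) -> f32 {
    ab * rsqrt(a2) * rsqrt(b2)
}

fn main() {
    // a . b, |a|^2, |b|^2 for the assumed vectors [1, 3, 5] and [2, 4, 6].
    let (ab, a2, b2) = (44.0f32, 35.0f32, 56.0f32);
    let exact = cosine_with(|x| 1.0 / x.sqrt(), ab, a2, b2);
    let coarse = cosine_with(|x| rsqrt_estimate(x, 8), ab, a2, b2);
    let refined = cosine_with(|x| refine(x, rsqrt_estimate(x, 8)), ab, a2, b2);
    println!("exact   = {:.8}", exact);
    println!("coarse  = {:.8}", coarse);
    println!("refined = {:.8}", refined);
}
```

With these inputs, the 8-bit stand-in lands at roughly 0.9909, in line with the value reported above, while a single Newton-Raphson step brings the result back to within about 1e-5 of the exact one. This is consistent with the explanation given, but it is only a simulation under the stated assumption.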
Oh I see. If it's an approximate square root calculation, it makes sense. Thank you for clarifying it!
All floating point operations are approximate, but different libraries will have different accuracy/speed/complexity tradeoffs 🤗
First of all, thank you for creating and maintaining this project. It helped a lot with my SIMD implementation of distance functions for vectors.
I encountered an oddity when using f32::cosine in Rust. Compared to a manual cosine similarity calculation, it produces a different result. I'm just curious, is there something that I'm missing from the implementation?
Note: f32::dot and f32::sqeuclidean do produce the correct results compared to the manual calculation.
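One plausible reason these two agree while the cosine does not is that neither involves a reciprocal square root: both reduce to multiplications, subtractions, and additions. A minimal reference sketch in plain Rust, again assuming the vectors [1, 3, 5] and [2, 4, 6] (the helper names mirror the operations but are illustrative, not SimSIMD's API):

```rust
/// Reference dot product: sum of element-wise products.
fn dot(x: &[f32], y: &[f32]) -> f32 {
    assert_eq!(x.len(), y.len(), "vectors must have the same length");
    x.iter().zip(y).map(|(a, b)| a * b).sum()
}

/// Reference squared Euclidean distance: sum of squared differences.
fn sqeuclidean(x: &[f32], y: &[f32]) -> f32 {
    assert_eq!(x.len(), y.len(), "vectors must have the same length");
    x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum()
}

fn main() {
    // Assumed inputs, taken from the calculator URL in the thread.
    let a = [1.0f32, 3.0, 5.0];
    let b = [2.0f32, 4.0, 6.0];
    println!("dot         = {}", dot(&a, &b));
    println!("sqeuclidean = {}", sqeuclidean(&a, &b));
}
```

For small integer-valued inputs like these, every intermediate value is exactly representable in f32, so any correct implementation should return the same numbers (44 and 3 here).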