spotify / voyager

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
https://spotify.github.io/voyager/
Apache License 2.0

Returning Duplicate Neighbors #68

Closed ArmanJR closed 1 month ago

ArmanJR commented 1 month ago

It's weird, but I get completely different results when running the first example from the documentation:

import numpy as np
from voyager import Index, Space

# Create an empty Index object that can store vectors:
index = Index(Space.Euclidean, num_dimensions=5)
id_a = index.add_item([1, 2, 3, 4, 5])
id_b = index.add_item([6, 7, 8, 9, 10])

print(id_a)  # => 0
print(id_b)  # => 1

# Find the two closest elements:
neighbors, distances = index.query([1, 2, 3, 4, 5], k=2)
print(neighbors)  # => [0, 1]
print(distances)  # => [0.0, 125.0]

Actual output:

0
1
[0 0]
[0. 0.]

It seems that when querying, the same neighbor is returned more than once. My device is a Mac with an M1 Pro, running Python 3.12.
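
For anyone who wants to confirm the symptom independently of the library, here is a minimal sketch of a brute-force reference check in plain Python. It assumes the documented distances are squared Euclidean distances, which is consistent with the 125.0 in the expected output (5 × 5² = 125). The helper names `squared_euclidean`, `brute_force_neighbors`, and `has_duplicate_ids` are hypothetical and not part of voyager.

```python
def squared_euclidean(a, b):
    """Squared L2 distance (assumed metric; consistent with the documented 125.0)."""
    return float(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_neighbors(vectors, query, k):
    """Return (ids, distances) of the k closest vectors by exhaustive search."""
    scored = sorted((squared_euclidean(v, query), i) for i, v in enumerate(vectors))
    top = scored[:k]
    return [i for _, i in top], [d for d, _ in top]


def has_duplicate_ids(neighbors):
    """True if any ID appears more than once in a query result."""
    ids = [int(n) for n in neighbors]
    return len(ids) != len(set(ids))


# Same two vectors and query as in the documentation example.
vectors = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
expected_ids, expected_distances = brute_force_neighbors(vectors, [1, 2, 3, 4, 5], k=2)

print(expected_ids)                     # [0, 1]
print(expected_distances)               # [0.0, 125.0]
print(has_duplicate_ids([0, 0]))        # True  (the result reported in this issue)
print(has_duplicate_ids(expected_ids))  # False
```

Passing the `neighbors` array returned by `index.query(...)` to `has_duplicate_ids` should flag the behaviour described above, since it only needs an iterable of integer-like IDs.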

miclegr commented 1 month ago

Hi! Thanks for reporting this. I was able to reproduce it.

We just released version v2.0.7, which fixes the issue. You should now get the expected behaviour, as shown in the documentation. If not, feel free to reopen the issue.
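
To check whether an environment already has the fixed release, something like the following sketch may help. It uses only the standard library and assumes the package is installed from PyPI under the distribution name `voyager` with a plain `X.Y.Z` version string. The helpers `parse_version`, `is_at_least`, and `installed_voyager_has_fix` are hypothetical names, not part of voyager.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release handling)."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, minimum):
    """Compare two plain dotted version strings numerically."""
    return parse_version(installed) >= parse_version(minimum)


def installed_voyager_has_fix(minimum="2.0.7"):
    """Return True or False, or None if the package is missing
    or its version string is not a plain dotted number."""
    try:
        return is_at_least(version("voyager"), minimum)
    except (PackageNotFoundError, ValueError):
        return None


print(installed_voyager_has_fix())
```

If this prints `False`, upgrading with `pip install --upgrade voyager` and rerunning the documentation example should show whether the duplicate neighbors are gone.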